Hadoop Commands
hadoop command [genericOptions] [commandOptions]
hadoop fs
Usage: java FsShell
[-ls <path>]
[-lsr <path>]
[-df [<path>]]
[-du <path>]
[-dus <path>]
[-count[-q] <path>]
[-mv <src> <dst>]
[-cp <src> <dst>]
[-rm [-skipTrash] <path>]
[-rmr [-skipTrash] <path>]
[-expunge]
[-put <localsrc> ... <dst>]
[-copyFromLocal <localsrc> ... <dst>]
[-moveFromLocal <localsrc> ... <dst>]
[-get [-ignoreCrc] [-crc] <src> <localdst>]
[-getmerge <src> <localdst> [addnl]]
[-cat <src>]
[-text <src>]
[-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
[-moveToLocal [-crc] <src> <localdst>]
[-mkdir <path>]
[-setrep [-R] [-w] <rep> <path/file>]
[-touchz <path>]
[-test -[ezd] <path>]
[-stat [format] <path>]
[-tail [-f] <file>]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-chgrp [-R] GROUP PATH...]
[-help [cmd]]
hadoop fs -ls /
hadoop fs -ls /test1/
hadoop fs -cat /test1/foo.txt
hadoop fs -rm /test1/foo.txt
hadoop fs -mkdir /test6/
hadoop fs -put foo.txt /test6/
hadoop fs -put /etc/note.txt /test2/note_fs.txt
hadoop fs -get /user/satya/passwd ./
hadoop fs -setrep 5 -R /user/satya/tmp/
hadoop fsck /user/satya/tmp -files -blocks -locations
hadoop fs -chmod 1777 /tmp
hadoop fs -touch /user/satya/test/foo
hadoop fs -rmr /user/satya/test/foo
hadoop fs -touchz /user/satya/test/bar
hadoop fs -count -q /user/satya
hadoop -execute start-all.sh
hadoop job -list
hadoop job -kill jobID
hadoop job -list-attempt-ids jobID taskType taskState
hadoop job -kill-task taskAttemptId
hadoop namenode -format
hadoop jar <jar_file> wordcount <output_file>
hadoop jar /opt/hadoop/hadoop-examples-1.0.4.jar wordcount /out/wc_output
hadoop dfsadmin -report
hadoop dfsadmin -setSpaceQuota 10737418240 /user/esammer
hadoop dfsadmin -refreshNodes
hadoop dfsadmin -upgradeProgress status
hadoop dfsadmin -finalizeUpgrade
hadoop fsck
Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
<path> -- start checking from this path
-move -- move corrupted files to /lost+found
-delete -- delete corrupted files
-files -- print out files being checked
-openforwrite -- print out files opened for write
-blocks -- print out block report
-locations -- print out locations for every block
-racks -- print out network topology for data-node locations
By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status.
hadoop fsck / -files -blocks -locations
hadoop fsck /user/satya -files -blocks -locations
hadoop distcp -- Distributed Copy (distcp)
distcp [OPTIONS] <srcurl>* <desturl>
OPTIONS:
-p[rbugp] Preserve status
r: replication number
b: block size
u: user
g: group
p: permission
-p alone is equivalent to -prbugp
-i Ignore failures
-log <logdir> Write logs to <logdir>
-m <num_maps> Maximum number of simultaneous copies
-overwrite Overwrite destination
-update Overwrite if src size different from dst size
-skipcrccheck Do not use CRC check to determine if src is different from dest. Relevant only if -update is specified
-f <urilist_uri> Use list at <urilist_uri> as src list
-filelimit <n> Limit the total number of files to be <= n
-sizelimit <n> Limit the total size to be <= n bytes
-delete Delete the files existing in the dst but not in src
-mapredSslConf <f> Filename of SSL configuration for mapper task
NOTE 1: if -overwrite or -update are set, each source URI is interpreted as an isomorphic update to an existing directory.
For example:
hadoop distcp -p -update "hdfs://A:8020/user/foo/bar" "hdfs://B:8020/user/foo/baz"
would update all descendants of 'baz' also in 'bar'; it would *not* update /user/foo/baz/bar
NOTE 2: The parameter <n> in -filelimit and -sizelimit can be specified with symbolic representation. For examples,
1230k = 1230 * 1024 = 1259520
891g = 891 * 1024^3 = 956703965184
hadoop distcp hdfs://A:8020/path/one hdfs://B:8020/path/two
hadoop distcp /path/one /path/two
No comments:
Post a Comment