HDFS命令实现分析
HDFS命令概述
HDFS命令涉及两类,一类是hadoop命令,一类是hdfs命令,功能也分为两类,第一类是HDFS文件操作命令,第二类是HDFS管理命令。
二者都是shell命令,真正的命令只有hadoop和hdfs,而无所谓的ls/mv/cp/cat/mkdir…dfs/setQuota/fsck…等命令,后者都是以入参传递给hadoop和hdfs的。
具体实现参考bin/hadoop和bin/hdfs。hadoop族其他命令如yarn,实现机制类似。官方介绍如下:
The File System (FS) shell includes various shell-like commands that directly interact with the Hadoop Distributed File System (HDFS) as well as other file systems that
Hadoop supports, such as Local FS, WebHDFS, S3 FS, and others. The FS shell is invoked by:
bin/hadoop fs <args>
All FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the Local FS the scheme is file.
The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to point to hdfs://namenodehost).
Most of the commands in FS shell behave like corresponding Unix commands. Differences are described with each of the commands. Error information is sent to stderr and the output is sent to stdout.
If HDFS is being used, hdfs dfs is a synonym.
Relative paths can be used. For HDFS, the current working directory is the HDFS home directory /user/<username> that often has to be created manually. The HDFS home directory can also be implicitly accessed, e.g., when using the HDFS trash folder, the .Trash directory in the home directory.
HDFS命令实现机制:
输入命令hadoop/hdfs和参数,经过解析之后,通过执行java或者jsvc启动java程序,获取hdfs文件信息或者传入操作指令,实现文件系统管理。

图1. HDFS命令执行交互过程简介
针对hadoop fs和hdsf dfs/dfsadmin命令,hadoop-3.1.1版本具体实现如下:
相关的shell脚本为bin/hadoop和bin/hdfs;
相关的java类为 org/apache/hadoop/fs/FsShell.java org/apache/hadoop/fs/shell/CommandFactory.java org/apache/hadoop/util/ToolRunner.java org/apache/hadoop/fs/shell/FsCommand.java org/apache/hadoop/util/GenericOptionsParser.java 和 org/apache/hadoop/hdfs/tools/DFSAdmin.java 。
其中DFSAdmin.java继承自FsShell类。
FsShell: 是整个hdfs命令的核心,负责各种操作实现类的注册(通过反射实现)和命令的分发执行。
FsCommand: 各种操作命令注册实现。
CommandFactoryL: 命令注册机制的实现。主要通过工厂模式利用反射实现。
GenericOptionsParser.java: 中间转换操作,实际执行命令的仍是FsShell.run()函数。
GenericOptionsParser.java: 负责命令输入参数的解析。
ToolRunner.java: 执行命令操作,实际上是FsShell.run()执行。
1、命令注册
//org/apache/hadoop/fs/FsShell.java
protected void init() throws IOException {
getConf().setQuietMode(true);
UserGroupInformation.setConfiguration(getConf());
if (commandFactory == null) {
commandFactory = new CommandFactory(getConf());
commandFactory.addObject(new Help(), "-help");
commandFactory.addObject(new Usage(), "-usage");
registerCommands(commandFactory);
}
} protected void registerCommands(CommandFactory factory) {
// TODO: DFSAdmin subclasses FsShell so need to protect the command
// registration. This class should morph into a base class for
// commands, and then this method can be abstract
if (this.getClass().equals(FsShell.class)) {
factory.registerCommands(FsCommand.class);
}
} //org/apache/hadoop/fs/shell/CommandFactory.java
/**
* Invokes "static void registerCommands(CommandFactory)" on the given class.
* This method abstracts the contract between the factory and the command
* class. Do not assume that directly invoking registerCommands on the
* given class will have the same effect.
* @param registrarClass class to allow an opportunity to register
*/
public void registerCommands(Class<?> registrarClass) {
try {
registrarClass.getMethod(
"registerCommands", CommandFactory.class
).invoke(null, this);
} catch (Exception e) {
throw new RuntimeException(StringUtils.stringifyException(e));
}
} //org/apache/hadoop/fs/shell/FsCommand.java
/**
* Register the command classes used by the fs subcommand
* @param factory where to register the class
*/
public static void registerCommands(CommandFactory factory) {
factory.registerCommands(AclCommands.class);
factory.registerCommands(CopyCommands.class);
factory.registerCommands(Count.class);
factory.registerCommands(Delete.class);
factory.registerCommands(Display.class);
factory.registerCommands(Find.class);
factory.registerCommands(FsShellPermissions.class);
factory.registerCommands(FsUsage.class);
factory.registerCommands(Ls.class);
factory.registerCommands(Mkdir.class);
factory.registerCommands(MoveCommands.class);
factory.registerCommands(SetReplication.class);
factory.registerCommands(Stat.class);
factory.registerCommands(Tail.class);
factory.registerCommands(Head.class);
factory.registerCommands(Test.class);
factory.registerCommands(Touch.class);
factory.registerCommands(Truncate.class);
factory.registerCommands(SnapshotCommands.class);
factory.registerCommands(XAttrCommands.class);
}
2、命令解析
/**
* Parse the user-specified options, get the generic options, and modify
* configuration accordingly.
*
* @param opts Options to use for parsing args.
* @param args User-specified arguments
* @return true if the parse was successful
*/
private boolean parseGeneralOptions(Options opts, String[] args)
throws IOException {
opts = buildGeneralOptions(opts);
CommandLineParser parser = new GnuParser();
boolean parsed = false;
try {
commandLine = parser.parse(opts, preProcessForWindows(args), true);
processGeneralOptions(commandLine);
parsed = true;
} catch(ParseException e) {
LOG.warn("options parsing failed: "+e.getMessage()); HelpFormatter formatter = new HelpFormatter();
formatter.printHelp("general options are: ", opts);
}
return parsed;
}
/**
* Retrieve any left-over non-recognized options and arguments
*
* @return remaining items passed in but not parsed as an array
*/
public String[] getArgs()
{
String[] answer = new String[args.size()]; args.toArray(answer); return answer;
}
3、命令执行
/**
* run
*/
@Override
public int run(String argv[]) throws Exception {
// initialize FsShell
init();
Tracer tracer = new Tracer.Builder("FsShell").
conf(TraceUtils.wrapHadoopConf(SHELL_HTRACE_PREFIX, getConf())).
build();
int exitCode = -1;
if (argv.length < 1) {
printUsage(System.err);
} else {
String cmd = argv[0];
Command instance = null;
try {
instance = commandFactory.getInstance(cmd);
if (instance == null) {
throw new UnknownCommandException();
}
TraceScope scope = tracer.newScope(instance.getCommandName());
if (scope.getSpan() != null) {
String args = StringUtils.join(" ", argv);
if (args.length() > 2048) {
args = args.substring(0, 2048);
}
scope.getSpan().addKVAnnotation("args", args);
}
try {
exitCode = instance.run(Arrays.copyOfRange(argv, 1, argv.length));
} finally {
scope.close();
}
} catch (IllegalArgumentException e) {
if (e.getMessage() == null) {
displayError(cmd, "Null exception message");
e.printStackTrace(System.err);
} else {
displayError(cmd, e.getLocalizedMessage());
}
printUsage(System.err);
if (instance != null) {
printInstanceUsage(System.err, instance);
}
} catch (Exception e) {
// instance.run catches IOE, so something is REALLY wrong if here
LOG.debug("Error", e);
displayError(cmd, "Fatal internal error");
e.printStackTrace(System.err);
}
}
tracer.close();
return exitCode;
}
4、格式化输出
/** allows stdout to be captured if necessary */
public PrintStream out = System.out;
/** allows stderr to be captured if necessary */
public PrintStream err = System.err;
/** allows the command factory to be used if necessary */
private CommandFactory commandFactory = null;
HDFS命令实现分析的更多相关文章
- HDFS源码分析数据块校验之DataBlockScanner
DataBlockScanner是运行在数据节点DataNode上的一个后台线程.它为所有的块池管理块扫描.针对每个块池,一个BlockPoolSliceScanner对象将会被创建,其运行在一个单独 ...
- HDFS源码分析心跳汇报之数据块汇报
在<HDFS源码分析心跳汇报之数据块增量汇报>一文中,我们详细介绍了数据块增量汇报的内容,了解到它是时间间隔更长的正常数据块汇报周期内一个smaller的数据块汇报,它负责将DataNod ...
- HDFS源码分析心跳汇报之BPServiceActor工作线程运行流程
在<HDFS源码分析心跳汇报之数据结构初始化>一文中,我们了解到HDFS心跳相关的BlockPoolManager.BPOfferService.BPServiceActor三者之间的关系 ...
- HDFS源码分析心跳汇报之数据块增量汇报
在<HDFS源码分析心跳汇报之BPServiceActor工作线程运行流程>一文中,我们详细了解了数据节点DataNode周期性发送心跳给名字节点NameNode的BPServiceAct ...
- HDfs命令
HDFS命令分为用户命令(dfs,fsck等),管理员命令(dfsadmn,namenode,datanode等) hdfs -ls -lsr 执行lsr 是递归显示 drwxr-xr-x -hado ...
- sodu 命令场景分析
摘自:http://www.cnblogs.com/hazir/p/sudo_command.html sudo 命令情景分析 Linux 下使用 sudo 命令,可以让普通用户也能执行一些或者全 ...
- 4-linux、hdfs命令
定义: linux:Linux是一套免费使用和自由传播的类Unix操作系统,是一个基于POSIX和UNIX的多用户.多任务.支持多线程和多CPU的 操作系统.它能运行主要的UNIX工具软件.应用程序和 ...
- hdfs命令get或者put提示找不到目录或文件
今天用hdfs命令出现个诡异情况: hadoop fs -put a.txt /user/root/ put: `a.txt': No such file or directory 用get命令存在相 ...
- HDFS 命令大全
目录 概要 用户命令 dfs 命令 追加文件内容 查看文件内容 得到文件的校验信息 修改用户组 修改文件权限 修改文件所属用户 本地拷贝到 hdfs hdfs 拷贝到本地 获取目录,文件数量及大小 h ...
随机推荐
- angularjs -- 监听angularJs列表数据是否渲染完毕
前端在做数据渲染的时候经常会遇到在数据渲染完毕后执行某些操作,这几天就一直遇到在列表和表格渲染完毕后,执行点击和选择操作.对于angularjs处理这类问题,最好的方式就是指令 directive. ...
- 前端HTML空格与后台PHP utf-8空格
今天在处理html input输入框时,发现一个问题: 在用户名输入框中输入admin "'p(中间是一个空格),点保存后台提示数据保存成功,按理应该是未修改,通过chrome调试工具发现传 ...
- Python 中单双引号
TODO, 在python中, 其实单双引号还是有分别的, 具体是什么?
- 9.Java注解(Annotation)
一.系统内置标准注解 1.@Override 是一个标记注解类型,它被用作标注方法. 它说明了被标注的方法重载了父类的方法,起到了断言的作用.如果我们使用了这种Annotation在一个没有覆盖父类方 ...
- javascript promise编程
在loop中使用promise: https://stackoverflow.com/questions/17217736/while-loop-with-promises
- .Net 面试题 汇总(四)
1.简述 private. protected. public. internal 修饰符的访问权限.private : 私有成员, 在类的内部才可以访问.protected : 保护成员,该类内部和 ...
- 优化REST Framework 的 路由 APIView 和ViewSetMixin
APIview: 我们经常写的是view 这个APIview继承了我们的view,并且对请求进来的信息进行设置, 在APIView这个例子中,调用了drf本身的serializer以及Respons ...
- vue的项目
vue的项目打开也是非常具有解耦性的 最重要的就是src目录了 我们的入口在main中 main是你的实例化vue app中就是我们的每一块田地是我们的vue实例对这个的操作 ,index因为是 ...
- SQL Sever——妙用种子列
/****** Script for SelectTopNRows command from SSMS ******/ SELECT TOP 1000 [OFFRCD_STATUS_ID] ,[OFF ...
- [Codeforces 321D][2018HN省队集训D4T2] Ciel and Flipboard
[Codeforces 321D][2018HN省队集训D4T2] Ciel and Flipboard 题意 给定一个 \(n\times n\) 的矩阵 \(A\), (\(n\) 为奇数) , ...