Hadoop 2.2 programming: Tool, ToolRunner, GenericOptionsParser, Configuration

Inheritance relationships:

1. java.util

Interface Map.Entry<K,V>

description:

public static interface Map.Entry<K,V>

methods:

Modifier and Type	Method and Description
`boolean`	`equals(Object o)` Compares the specified object with this entry for equality.
`K`	`getKey()` Returns the key corresponding to this entry.
`V`	`getValue()` Returns the value corresponding to this entry.
`int`	`hashCode()` Returns the hash code value for this map entry.
`V`	`setValue(V value)` Replaces the value corresponding to this entry with the specified value (optional operation).
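
Hadoop's Configuration (item 2 below) is iterable as Map.Entry<String,String>, so getKey() and getValue() are the two calls a for-each loop over it normally needs. A minimal plain-Java sketch, using an ordinary TreeMap as a stand-in for a Configuration; the class name EntryDemo and the sample keys are made up for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

class EntryDemo {
    // Render every entry as "key=value", one per line, via getKey()/getValue().
    static String dump(Map<String, String> props) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            out.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // TreeMap is only a stand-in; it keeps keys sorted so the output is stable.
        Map<String, String> props = new TreeMap<>();
        props.put("size", "10");
        props.put("color", "yellow");
        System.out.print(dump(props)); // prints color=yellow, then size=10
    }
}
```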

2. java.lang.Object

|__ org.apache.hadoop.conf.Configuration

description:

public class Configuration extends Object implements Iterable<Map.Entry<String,String>>, Writable

3. org.apache.hadoop.util

Class ToolRunner

java.lang.Object
  |__ org.apache.hadoop.util.ToolRunner

description:

public class ToolRunner

extends Object

ToolRunner can be used to run classes implementing Tool interface. It works in conjunction with GenericOptionsParser to parse the generic hadoop command line arguments and modifies the Configuration of the Tool. The application-specific options are passed along without being modified.

methods:

`static int`	`run(Configuration conf, Tool tool, String[] args)` Runs the given `Tool` by `Tool.run(String[])`, after parsing with the given generic arguments.
`static int`	`run(Tool tool, String[] args)` Runs the `Tool` with its `Configuration`.
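
The flow described above (apply the generic options to the configuration, hand the configuration to the tool, pass the remaining arguments to run) can be sketched in plain Java. This is a simplified model, not Hadoop's source: MiniTool and MiniRunner are hypothetical names, a Map stands in for Configuration, only -D key=value is treated as a generic option, and the checked exception that Tool.run declares is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Tool + Configurable; a Map plays the role of Configuration.
interface MiniTool {
    void setConf(Map<String, String> conf);
    Map<String, String> getConf();
    int run(String[] args);
}

class MiniRunner {
    // Model of the documented behaviour: generic options modify the configuration,
    // everything else is passed to the tool unmodified.
    static int run(Map<String, String> conf, MiniTool tool, String[] args) {
        if (conf == null) {
            conf = new HashMap<>();
        }
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length && args[i + 1].contains("=")) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv[1]);      // generic option: goes into the configuration
            } else {
                remaining.add(args[i]);      // application option: passed along untouched
            }
        }
        tool.setConf(conf);
        return tool.run(remaining.toArray(new String[0]));
    }

    public static void main(String[] args) {
        MiniTool printer = new MiniTool() {
            private Map<String, String> conf;
            public void setConf(Map<String, String> cfg) { this.conf = cfg; }
            public Map<String, String> getConf() { return conf; }
            public int run(String[] rest) {
                System.out.println("conf=" + conf + " args=" + String.join(",", rest));
                return 0;
            }
        };
        int exitCode = run(null, printer, new String[] {"-D", "color=red", "input.txt"});
        System.out.println("exit=" + exitCode);
    }
}
```

In this model the tool never sees "-D color=red"; it only finds color=red in its configuration and input.txt in its arguments, which is the division of labour the description above attributes to ToolRunner and GenericOptionsParser.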

4. org.apache.hadoop.util

Interface Tool

description:

public interface Tool

extends Configurable

methods:

`int`	`run(String[] args)` Execute the command with the given arguments.

5. org.apache.hadoop.conf

Interface Configurable

description:

public interface Configurable

methods:

`Configuration`	`getConf()` Return the configuration used by this object.
`void`	`setConf(Configuration conf)` Set the configuration to be used by this object.

6. org.apache.hadoop.conf

Class Configured

java.lang.Object
  |__ org.apache.hadoop.conf.Configured

description:

public class Configured

extends Object implements Configurable

constructors:

`Configured()` Construct a Configured.
`Configured(Configuration conf)` Construct a Configured.

methods:

`Configuration`	`getConf()` Return the configuration used by this object.
`void`	`setConf(Configuration conf)` Set the configuration to be used by this object.

Code 1 (the resource added to the Configuration is a String):

 import java.util.Map.Entry;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.conf.Configured;
 import org.apache.hadoop.util.Tool;
 import org.apache.hadoop.util.ToolRunner;

 public class ConfigurationPrinter extends Configured implements Tool {
   static {
     // Register config.xml as a default resource; a String name is looked up on the classpath.
     Configuration.addDefaultResource("config.xml");
   }

   @Override
   public int run(String[] args) throws Exception {
     // getConf() returns the Configuration that ToolRunner set on this tool.
     Configuration conf = getConf();
     // Configuration is Iterable<Map.Entry<String,String>>: print every property.
     for (Entry<String, String> entry : conf) {
       System.out.printf("%s=%s%n", entry.getKey(), entry.getValue());
     }
     return 0;
   }

   public static void main(String[] args) throws Exception {
     // ToolRunner parses the generic options before calling run().
     int exitCode = ToolRunner.run(new ConfigurationPrinter(), args);
     System.exit(exitCode);
   }
 }

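Note: the Configuration class provides a static method, addDefaultResource(String name), for registering a default resource. As in the code above, when the resource "config.xml" is given as a String, Hadoop looks the file up on the classpath; when the resource is given as a Path, Hadoop looks it up on the local filesystem: Configuration conf = new Configuration(); conf.addResource(new Path("config.xml"));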
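
The two lookup styles in the note above can be illustrated without Hadoop. The sketch below is plain Java and only mimics the distinction: ResourceLookupDemo and its method names are hypothetical, ClassLoader.getSystemResource stands in for the String-style (classpath) lookup, and java.io.File stands in for the Path-style (local filesystem) lookup.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ResourceLookupDemo {
    // String-style lookup: the name is searched for on the classpath.
    static boolean foundOnClasspath(String name) {
        return ClassLoader.getSystemResource(name) != null;
    }

    // Path-style lookup: the path is resolved on the local filesystem.
    static boolean foundOnLocalFs(String path) {
        return new File(path).isFile();
    }

    // Creates a file in a fresh temp directory (which is not on the classpath)
    // and reports how each lookup style sees it.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("lookup-demo");
            Path file = Files.createFile(dir.resolve("lookup-demo-config.xml"));
            String result = "classpath=" + foundOnClasspath(file.getFileName().toString())
                    + " localfs=" + foundOnLocalFs(file.toString());
            Files.delete(file);
            Files.delete(dir);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the temporary directory is not on the classpath, the classpath lookup misses the file while the filesystem lookup finds it. This is why Code 1 needs config.xml in a directory on Hadoop's classpath, whereas a Path resource only needs a valid local path.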

Steps to run Code 1:

# Put the custom config file config.xml in Hadoop's $HADOOP_CONF_DIR (this directory is on Hadoop's classpath)
mv config.xml $HADOOP_HOME/etc/hadoop/

# Suppose the resource we add looks like this:

 <!--cat $HADOOP_HOME/etc/hadoop/config.xml-->
 <configuration>
   <property>
     <name>color</name>
     <value>yellow</value>
   </property>

   <property>
     <name>size</name>
     <value>10</value>
   </property>

   <property>
     <name>weight</name>
     <value>heavy</value>
     <final>true</final>
   </property>
 </configuration>
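
To make the shape of this file concrete, here is a small plain-Java sketch that reads the same three properties with the JDK's DOM parser and marks the one declared final (in Hadoop, a value declared final cannot be altered by a subsequently loaded resource). It is not Hadoop's loader; ConfigXmlDemo and its helper methods are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ConfigXmlDemo {
    // Same content as config.xml above, inlined so the sketch is self-contained.
    static final String XML =
        "<configuration>"
      + "<property><name>color</name><value>yellow</value></property>"
      + "<property><name>size</name><value>10</value></property>"
      + "<property><name>weight</name><value>heavy</value><final>true</final></property>"
      + "</configuration>";

    // Returns name -> value in document order; final properties get a " (final)" suffix.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element p = (Element) properties.item(i);
                boolean isFinal = "true".equals(text(p, "final"));
                String value = text(p, "value");
                props.put(text(p, "name"), isFinal ? value + " (final)" : value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration XML", e);
        }
    }

    // Text of the first descendant element with the given tag, or null if absent.
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        parse(XML).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```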

Run the code:

mkdir class
source $HADOOP_HOME/libexec/hadoop-config.sh
javac  -d class ConfigurationPrinter.java
jar -cvf ConfigurationPrinter.jar -C class ./
export HADOOP_CLASSPATH=ConfigurationPrinter.jar:$CLASSPATH
# Now check whether the resource we just added has been read.
# config.xml contains an entry <name>color</name>, so run:
yarn ConfigurationPrinter|grep "color"
color=yellow
# The output shows that the resource was loaded.

Alternatively, pass the configuration file on the command line with the generic -conf option, for example:

yarn ConfigurationPrinter -conf config.xml | grep color

color=yellow

This works too.

Code 2 (the resource added to the Configuration is a Path):

 import java.util.Map.Entry;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.conf.Configured;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.util.Tool;
 import org.apache.hadoop.util.ToolRunner;

 public class ConfigurationPrinter extends Configured implements Tool {
   @Override
   public int run(String[] args) throws Exception {
     // Use the Configuration set by ToolRunner so that generic options such as -D are kept;
     // a fresh "new Configuration()" here would silently discard them.
     Configuration conf = getConf();
     // A Path resource is looked up on the local filesystem, not on the classpath.
     conf.addResource(new Path("config.xml"));
     for (Entry<String, String> entry : conf) {
       System.out.printf("%s=%s%n", entry.getKey(), entry.getValue());
     }
     return 0;
   }

   public static void main(String[] args) throws Exception {
     int exitCode = ToolRunner.run(new ConfigurationPrinter(), args);
     System.exit(exitCode);
   }
 }

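Here the resource is added as a Path, so Hadoop looks for config.xml on the local filesystem. There is no need to put config.xml under conf/; it is enough to give its local filesystem path in the code, for example new Path("../others/config.xml").

Steps to run: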

mkdir class
source $HADOOP_HOME/libexec/hadoop-config.sh
javac  -d class ConfigurationPrinter.java
jar -cvf ConfigurationPrinter.jar -C class ./
export HADOOP_CLASSPATH=ConfigurationPrinter.jar:$CLASSPATH
# Now check whether the resource we just added has been read.
# config.xml contains an entry <name>color</name>, so run:
yarn ConfigurationPrinter|grep "color"
color=yellow
# The output shows that the resource was loaded.

Note: GenericOptionsParser supports setting individual properties:

Generic Options
The supported generic options are:

     -conf <configuration file>     specify a configuration file
     -D <property=value>            use value for given property
     -fs <local|namenode:port>      specify a namenode
     -jt <local|jobtracker:port>    specify a job tracker
     -files <comma separated list of files>    specify comma separated
                            files to be copied to the map reduce cluster
     -libjars <comma separated list of jars>   specify comma separated
                            jar files to include in the classpath.
     -archives <comma separated list of archives>    specify comma
             separated archives to be unarchived on the compute machines.

You can try (note that the option is an uppercase -D):

yarn ConfigurationPrinter -D greeting=hello | grep greeting
# Output:
greeting=hello

As a reminder:

ToolRunner can be used to run classes implementing Tool interface. It works in conjunction with GenericOptionsParser to parse the generic hadoop command line arguments and modifies the Configuration of the Tool. The application-specific options are passed along without being modified.

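ToolRunner and GenericOptionsParser work together: they parse the generic Hadoop command line arguments and modify the tool's Configuration accordingly. (What are generic Hadoop command line arguments? They are the genericOptions part of, for example: yarn command [genericOptions] [commandOptions])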
