1. NullWritable in Hadoop

NullWritable is a special Writable class whose implementation is empty: it reads nothing from the data stream and writes nothing to it, serving purely as a placeholder. In MapReduce, if you do not need the key or the value, you can declare that key or value as NullWritable. NullWritable is an immutable singleton type.

For example, when I want the map output value to be empty, I write the map method like this:

@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    String line = value.toString();
    WebLogBean webLogBean = WebLogParser.parser(line);
    // Filter out requests for static resources such as js/image/css files
    WebLogParser.filtStaticResource(webLogBean, pages);
    // k is a reusable Text field declared on the Mapper, e.g. private Text k = new Text();
    k.set(webLogBean.toString());
    context.write(k, NullWritable.get());
}

You cannot create one with new NullWritable() (its constructor is private); the only way to obtain the instance is NullWritable.get().

Source: https://www.cnblogs.com/Skyar/p/5815486.html
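
A hedged driver-side sketch (the class name, mapper name, and argument handling are illustrative, not taken from the post above): all that is needed is to declare NullWritable as the output value class, and a cleaning job like this can run map-only.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WebLogCleanDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "weblog-clean");
        job.setJarByClass(WebLogCleanDriver.class);
        job.setMapperClass(WebLogCleanMapper.class); // hypothetical name for the Mapper shown above
        job.setNumReduceTasks(0);                    // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class); // the value is only a placeholder
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}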

2. The various IDs in Yarn

  • jobId

Description: originates from MapReduce; the unique identifier of a job.
Format: job_${clusterStartTime}_${jobId}
Example: job_1498552288473_2742

  • applicationId

Description: the unique identifier of an application in Yarn.
Format: application_${clusterStartTime}_${applicationId}
Example: application_1498552288473_2742

  • taskId

Description: the unique identifier of a task within a job.
Format: task_${clusterStartTime}_${applicationId}_[m|r]_${taskId}
Example: task_1498552288473_2742_m_000000, task_1498552288473_2742_r_000000

  • attemptId

Description: the ID of a single attempt at executing a task.
Format: attempt_${clusterStartTime}_${applicationId}_[m|r]_${taskId}_${attemptId}
Example: attempt_1498552288473_2742_m_000000_0

  • appAttemptId

Description: the ID of a single attempt of the ApplicationMaster.
Format: appattempt_${clusterStartTime}_${applicationId}_${appAttemptId}
Example: appattempt_1498552288473_2742_000001

  • containerId

Description: the ID of a container; the e${epoch} segment appears only when the ResourceManager epoch is greater than zero (for example after an RM restart).
Format: container_e${epoch}_${clusterStartTime}_${applicationId}_${appAttemptId}_${containerId}
Example: container_e20_1498552288473_2742_01_000032, container_1498552288473_2742_01_000032

Reference: "Yarn之日志分析" (Yarn log analysis)
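
As a hedged aside (not part of the referenced article), the MapReduce API can parse these strings back into structured IDs, which is handy when correlating logs. A minimal sketch:

import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.mapreduce.TaskAttemptID;

public class IdParseDemo {
    public static void main(String[] args) {
        JobID jobId = JobID.forName("job_1498552288473_2742");
        System.out.println(jobId.getJtIdentifier()); // "1498552288473" (cluster start time)
        System.out.println(jobId.getId());           // 2742

        TaskAttemptID attemptId =
                TaskAttemptID.forName("attempt_1498552288473_2742_m_000000_0");
        System.out.println(attemptId.getTaskID());   // task_1498552288473_2742_m_000000
        System.out.println(attemptId.getJobID());    // job_1498552288473_2742
        System.out.println(attemptId.getId());       // 0 (attempt number)
    }
}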

3. Where Yarn stores its runtime logs, and the backup directory

Where are the runtime logs of a task on Yarn actually stored? Have you ever wondered? Today I finally worked it out. Taking a Flink on Yarn job as the example (other kinds of jobs are similar, though not necessarily identical), look at this excerpt from jobmanager.log:

 2019-08-05 09:51:55,826 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-08-05 09:51:55,827 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting YarnSessionClusterEntrypoint (Version: 1.7.1, Rev:<unknown>, Date:<unknown>)
2019-08-05 09:51:55,827 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current user: yarn
2019-08-05 09:51:56,255 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current Hadoop/Kerberos user: worker
2019-08-05 09:51:56,255 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.65-b01
2019-08-05 09:51:56,255 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum heap size: 1388 MiBytes
2019-08-05 09:51:56,255 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME: /usr/local/jdk/
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Hadoop version: 2.6.0-cdh5.5.0
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options:
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx1448m
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog.file=/var/log/hadoop/container/application_1564969910131_0252/container_e05_1564969910131_0252_01_000001/jobmanager.log
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlogback.configurationFile=file:logback.xml
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog4j.configuration=file:log4j.properties
2019-08-05 09:51:56,258 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program Arguments: (none)
2019-08-05 09:51:56,259 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath: lib/flink-python_2.11-1.7.1.jar:lib/flink-shaded-hadoop2-uber-1.7.1.jar:lib/log4j-1.2.17.jar:lib/slf4j-log4j12-1.7.15.jar:log4j.properties:logback.xml:flink.jar:flink-conf.yaml::/etc/hadoop/conf.cloudera.yarn4:/run/cloudera-scm-agent/process/47188-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-annotations.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-auth.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-aws.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-nfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-nfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-common-2.6.0-cdh5.5.0-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-aws-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-auth-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/hadoop-annotations-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format-sources.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-format-javadoc.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-tools.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-thrift.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-test-hadoop2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-scrooge_2.10.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-scala_2.10.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-protobuf.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-pig.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-pig-bundle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-jackson.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-hadoop.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-hadoop-bundle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-generator.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-encoding.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-column.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-cascading.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/parquet-avro.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/hue-plugins-3.9.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-httpclient-3.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-math3-3.1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-digester-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-recipes-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-net-3.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.c
dh5.5.0.p0.8/lib/hadoop/lib/commons-configuration-1.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/httpclient-4.2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/gson-2.2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/httpcore-4.2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-framework-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jets3t-0.9.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/hamcrest-core-1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/apacheds-i18n-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/zookeeper.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/avro.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/stax-api-1.0-2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/mockito-all-1.8.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/java-xmlbuilder-0.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/paranamer-2.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-compress-1.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/opt/cloude
ra/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/aws-java-sdk-1.7.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/api-util-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/api-asn1-api-1.0.0-M20.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/slf4j-log4j12.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/xz-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/slf4j-api-1.7.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/netty-3.6.2.Final.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/logredactor-1.0.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/junit-4.11.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jsch-0.1.42.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jersey-json-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/lib/curator-client-2.7.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-nfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-nfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/hadoop-hdfs-2.6.0-cdh5.5.0-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xml-apis-1.3.04.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/xercesImpl-2.9.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/netty-3.6.2.Final.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/ja
ckson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-daemon-1.0.13.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-hdfs/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-api.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-client.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-registry.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-common.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-nodemanager.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-tests.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-web-proxy.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-web-proxy-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-tests-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-nodemanager-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-registry-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-common-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-client-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.6.0-cdh5.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/hadoop-yarn-api-2.6.0-cdh5.5.0.jar:/opt/cloudera/p
arcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/spark-yarn-shuffle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/xz-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jsr305-3.0.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jline-2.11.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-server-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-json-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-guice-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-core-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jersey-client-1.9.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/javax.inject-1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guice-servlet-3.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guice-3.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-lang-2.6.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-io-2.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-httpclient-3.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-compress-1.4.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/aopalliance-1.0.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/zookeeper.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/spark-1.5.0-cdh5.5.0-yarn-shuffle.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop-yarn/lib/stax-api
-1.0-2.jar
2019-08-05 09:51:56,260 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------

See the -Dlog.file line above? Go to the corresponding directory on the machine running the JobManager and you can see the folder that holds the logs:

total 16K
drwx--x--- 3 yarn yarn 63 Jul 25 17:58 application_1564048714696_0300
drwx--x--- 3 yarn yarn 63 Jul 28 13:39 application_1564110901389_0235
drwx--x--- 4 yarn yarn 116 Aug 5 09:52 application_1564969910131_0252
drwx--x--- 3 yarn yarn 63 Aug 5 20:35 application_1564998210570_0394

Those are the runtime logs. What if the job has stopped? Yarn normally keeps a backup copy. For example, on the Cloudera distribution I use here there is a configuration option for this.
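
The screenshot of that setting is not reproduced here, but the switch in question is Yarn's log aggregation. A minimal yarn-site.xml sketch using the standard property names (the directory value shown is Hadoop's default; your distribution may use a different one):

<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<!-- When a container finishes, the NodeManager uploads its logs to the remote (HDFS) directory below -->
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
<!-- Per-user subdirectories are created under this path, which matches the hadoop fs -ls command below -->
</property>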

Then, if you run

 hadoop fs -ls /tmp/logs/<username>/logs

you will find that the logs are all there.
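
Once the logs have been aggregated there, you do not even have to browse HDFS by hand; the standard yarn CLI can pull them back for you (the application ID below is the one from the JobManager log above):

 yarn logs -applicationId application_1564969910131_0252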

4. The main Hadoop configuration files

core-site.xml: Hadoop-wide settings: the default HDFS URI, the common directory for temporary data, and the addresses and ports of the ZooKeeper quorum.

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/data/hadoop/tmp</value>
<!-- Parent directory for Hadoop's other temporary directories -->
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-alone:9000</value>
<!--
hdfs://host:port/
The name of the default file system. This is normally the URI of the NameNode, including host and port.
-->
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
<!--
The buffer size used in sequence files; it should be a multiple of the hardware page size (4096 on Intel x86).
It determines how much data is buffered during read and write operations (the value is in bytes).
-->
</property>
<!-- Addresses and ports of the ZooKeeper quorum. Note that the number of nodes should be odd and no fewer than three. -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181</value>
</property>
</configuration>
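
These settings are what client code picks up when it builds a Configuration. A hedged sketch (not from the original post) of reading them programmatically:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CoreSiteDemo {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml (and hdfs-site.xml) from the classpath
        Configuration conf = new Configuration();
        System.out.println(conf.get("fs.defaultFS"));            // e.g. hdfs://hadoop-alone:9000
        System.out.println(conf.get("hadoop.tmp.dir"));          // e.g. /usr/local/data/hadoop/tmp
        System.out.println(conf.getInt("io.file.buffer.size", 4096));

        FileSystem fs = FileSystem.get(conf);                    // resolved from fs.defaultFS
        System.out.println(fs.getUri());
        fs.close();
    }
}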

hdfs-site.xml: HDFS file system properties: dfs.replication (the number of replicas kept for each block), dfs.data.dir (the local directories in which a DataNode stores blocks), and dfs.name.dir (the local directories in which the NameNode stores its metadata).

<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
<!-- Number of replicas kept for each block on the DataNodes; the default is 3, and it should not exceed the number of DataNodes in the cluster -->
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
<!-- Large HDFS deployments often use a 256 MB block size; the current default is 128 MB -->
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/name</value>
<!--
Directory where the NameNode stores the name table (fsimage). If this is a comma-separated list of directories,
the name table is replicated in all of them, for redundancy.
-->
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/data</value>
<!--
Directories where the DataNode stores its blocks. If this is a comma-separated list of directories,
data is stored in all of the named directories, typically located on different devices.
-->
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>0.0.0.0:50070</value>
<!-- Address and port of the NameNode web UI -->
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>0.0.0.0:50090</value>
<!-- Address and port of the secondary NameNode HTTP server -->
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<!-- Deprecated older name for dfs.permissions.enabled, kept here for compatibility -->
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<!-- When true, HDFS permission checking is enabled; when false, permission checking is turned off, without affecting any other HDFS functionality. -->
</property>
<property>
<name>dfs.nameservices</name>
<value>hadoop-cluster1</value>
<!-- Logical name (nameservice ID) of the HDFS cluster -->
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
<!-- Number of NameNode RPC server threads that serve client requests; if dfs.namenode.servicerpc-address is not configured, these threads also serve requests from all cluster nodes. -->
</property>
</configuration>
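
A hedged sketch of how a client sees dfs.blocksize and dfs.replication, and of the fact that replication can be overridden per file (the path below is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSiteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        System.out.println(conf.getLong("dfs.blocksize", 134217728L)); // 268435456 with the config above
        System.out.println(conf.getInt("dfs.replication", 3));

        // The cluster-wide replication factor is only a default; it can be changed per existing file.
        FileSystem fs = FileSystem.get(conf);
        fs.setReplication(new Path("/tmp/some-existing-file"), (short) 2);
        fs.close();
    }
}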

yarn-site.xml: Yarn settings for the NodeManager, ResourceManager, and so on.

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop-alone:8033</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-alone:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-alone:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-alone:8050</value>
</property>
</configuration>

mapred-site.xml: declares YARN as the execution framework for MapReduce.

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<!-- The execution framework is set to Hadoop YARN -->
</property>
</configuration>

hadoop-env.sh: sets the JDK installation path (JAVA_HOME) and other JVM-related environment variables for the Hadoop daemons.
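
This is typically a single line; the path below is only an example, matching the JAVA_HOME seen in the JobManager log above:

 export JAVA_HOME=/usr/local/jdk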
