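Developing MapReduce Programs on Windows and Running Them Remotely on a Hadoop Cluster: a YARN Scheduling Exception

Why I'm sharing this: devoting a whole blog post to one problem feels a bit extravagant, but a Baidu search turned up very few relevant articles, and I only found the fix after painstakingly digging through the logs.

The problem: a MapReduce program developed on Windows was submitted to the cluster, but it never produced a result.

MapReduce program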

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Test {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // NameNode address of the remote cluster
        conf.set("fs.defaultFS", "hdfs://master:9000/");
        // Locally built job jar that is uploaded to the cluster on submission
        conf.set("mapreduce.job.jar", "D:/intelij-workspace/aaron-bigdata/aaorn-mapreduce/target/aaorn-mapreduce-1.0-SNAPSHOT.jar");
        // Run on YARN rather than in the local job runner
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "master");
        // Needed when submitting from a Windows client to a Linux cluster
        conf.set("mapreduce.app-submission.cross-platform", "true");

        Job job = Job.getInstance(conf);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(job, "hdfs://master:9000/input/");
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/output3/"));

        // Blocks until the job finishes and prints progress to the console
        job.waitForCompletion(true);
    }
}

Run output

[QC] INFO [main] org.apache.hadoop.yarn.client.RMProxy.createRMProxy(98) | Connecting to ResourceManager at master/192.168.56.100:8032

[QC] WARN [main] org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(64) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

[QC] INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(283) | Total input paths to process : 2

[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(198) | number of splits:2

[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.printTokens(287) | Submitting tokens for job: job_1496627557122_0004

[QC] INFO [main] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(273) | Submitted application application_1496627557122_0004

[QC] INFO [main] org.apache.hadoop.mapreduce.Job.submit(1294) | The url to track the job: http://master:8088/proxy/application_1496627557122_0004/

[QC] INFO [main] org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(1339) | Running job: job_1496627557122_0004

Master (NameNode) log

java.io.IOException: Connection reset by peer

        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)

        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)

        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)

        at sun.nio.ch.IOUtil.read(IOUtil.java:197)

        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)

        at org.apache.hadoop.ipc.Server.channelRead(Server.java:2603)

        at org.apache.hadoop.ipc.Server.access$2800(Server.java:136)

        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1481)

        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)

        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)

        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)

Exceptions in the Slave (DataNode) log

2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:41,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:42,465 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:43,467 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:44,468 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:45,470 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:46,472 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2017-06-05 09:49:47,474 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
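
The target address in these retries, 0.0.0.0:8031, is the clue: the node is trying to reach the ResourceManager at a wildcard address rather than at master. As a minimal sketch, assuming a hypothetical helper class named RmRetryScanner (it is not part of Hadoop), log lines like the ones above could be scanned for this symptom:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): counts retry lines whose target
// IP is the wildcard address 0.0.0.0, which suggests the ResourceManager
// host is not configured on that node.
class RmRetryScanner {

    // Matches e.g. "Retrying connect to server: 0.0.0.0/0.0.0.0:8031"
    private static final Pattern RETRY =
            Pattern.compile("Retrying connect to server: ([^/\\s]*)/([0-9.]+):(\\d+)");

    static long countWildcardRetries(List<String> logLines) {
        long count = 0;
        for (String line : logLines) {
            Matcher m = RETRY.matcher(line);
            // group(2) is the resolved IP address of the retry target
            if (m.find() && "0.0.0.0".equals(m.group(2))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: "
                + "Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s)",
            "2017-06-05 09:49:41,464 INFO org.apache.hadoop.ipc.Client: "
                + "Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s)");
        System.out.println("wildcard retries: " + countWildcardRetries(sample));
    }
}
```

A non-zero count on a Slave is a hint to check that node's yarn-site.xml before looking anywhere else.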

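Notes

My Hadoop cluster consists of Master (NameNode), Slave1, Slave2, and Slave3.

Solution

Add the following to yarn-site.xml on every Slave machine; previously I had added it only on the Master. Without yarn.resourcemanager.hostname, a NodeManager falls back to the default 0.0.0.0 and keeps retrying 0.0.0.0:8031, which is exactly what the Slave log above shows.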

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
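
To see why this setting matters, here is a rough sketch of how the address a NodeManager reports to is derived from yarn-site.xml. It assumes the stock defaults (yarn.resourcemanager.hostname defaults to 0.0.0.0, and the resource tracker listens on port 8031 unless yarn.resourcemanager.resource-tracker.address is set explicitly). The class YarnSiteCheck and its methods are hypothetical, use only the JDK's XML parser, and simplify what Hadoop's own Configuration class does:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): reads <property> entries from a
// yarn-site.xml string and derives the resource-tracker address.
class YarnSiteCheck {

    static Map<String, String> parseProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse yarn-site.xml content", e);
        }
    }

    // An explicit address wins; otherwise fall back to <hostname>:8031,
    // where the hostname itself defaults to 0.0.0.0 (assumed stock defaults).
    static String resourceTrackerAddress(Map<String, String> props) {
        String explicit = props.get("yarn.resourcemanager.resource-tracker.address");
        if (explicit != null) {
            return explicit;
        }
        return props.getOrDefault("yarn.resourcemanager.hostname", "0.0.0.0") + ":8031";
    }

    public static void main(String[] args) {
        String slaveBefore = "<configuration></configuration>";
        String slaveAfter = "<configuration><property>"
                + "<name>yarn.resourcemanager.hostname</name><value>master</value>"
                + "</property></configuration>";
        System.out.println(resourceTrackerAddress(parseProperties(slaveBefore))); // 0.0.0.0:8031
        System.out.println(resourceTrackerAddress(parseProperties(slaveAfter)));  // master:8031
    }
}
```

With an empty configuration the derived address is 0.0.0.0:8031, matching the retries in the Slave log; once the hostname property is present it becomes master:8031.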
