序:本以为今天花点时间将WordCount例子完全理解到,但高估自己了,更别说我只是在大学选修一学期的java,之后再也没碰过java语言了

总的来说,从宏观上能理解具体的程序思路,但具体到每个代码有什么作用,什么原理,那还需要花点时间,毕竟需要一点java基础和hadoop的运行机制的知识

首先启动hadoop;

[hadoop@hadoop01 eclipse]$ cd ~/hadoop-3.2.0
[hadoop@hadoop01 hadoop-3.2.0]$ sbin/start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoop01]
Starting datanodes
Starting secondary namenodes [hadoop01]
Starting resourcemanager
Starting nodemanagers
[hadoop@hadoop01 hadoop-3.2.0]$ jps
8497 NameNode
9121 ResourceManager
8868 SecondaryNameNode
9268 NodeManager
9630 Jps

然后,进入root权限打开eclipse;

[hadoop@hadoop01 hadoop-3.2.0]$ su root
Password:
[root@hadoop01 hadoop-3.2.0]# cd ..
[root@hadoop01 hadoop]# cd eclipse
[root@hadoop01 eclipse]# ./eclipse

在eclipse的window里面show view打开terminal;

在eclipse中点击打开open a terminal,在终端中输入命令:gedit input.txt;

在文档中任意输入内容;

在终端中输入命令:hadoop fs -put /home/hadoop/input.txt /test/;

最后,file--new--project--MapReduce project并取项目名“Wordcount”,再从创建的文件下src中new--package并为包取名“com.hadoop”,又在src下new--class并为类取名“Wordcount”,然后将下面的代码粘贴进去。

然后可以run as hadoop,成功运行得到计算结果

注:若package下无log4j.properties,会报错,需在该文件下手动添加该文件。
内容 如下:

# Configure logging for testing: optionally with log file

#log4j.rootLogger=debug,appender
log4j.rootLogger=info,appender
#log4j.rootLogger=error,appender #\u8F93\u51FA\u5230\u63A7\u5236\u53F0
log4j.appender.appender=org.apache.log4j.ConsoleAppender
#\u6837\u5F0F\u4E3ATTCCLayout
log4j.appender.appender.layout=org.apache.log4j.TTCCLayout

附代码:

/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.examples; import java.io.IOException;
import java.util.StringTokenizer; import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser; public class WordCount { public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{ private final static IntWritable one = new IntWritable(1);
private Text word = new Text(); public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
} public static class IntSumReducer
extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable(); public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
} public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length < 2) {
System.err.println("Usage: wordcount <in> [<in>...] <out>");
System.exit(2);
}
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
for (int i = 0; i < otherArgs.length - 1; ++i) {
FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
}
FileOutputFormat.setOutputPath(job,
new Path(otherArgs[otherArgs.length - 1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

【hadoop】在eclipse上运行WordCount的操作过程的更多相关文章

  1. linux下在eclipse上运行hadoop自带例子wordcount

    启动eclipse:打开windows->open perspective->other->map/reduce 可以看到map/reduce开发视图.设置Hadoop locati ...

  2. MapReduce编程入门实例之WordCount:分别在Eclipse和Hadoop集群上运行

    上一篇博文如何在Eclipse下搭建Hadoop开发环境,今天给大家介绍一下如何分别分别在Eclipse和Hadoop集群上运行我们的MapReduce程序! 1. 在Eclipse环境下运行MapR ...

  3. 在Eclipse上运行Spark(Standalone,Yarn-Client)

    欢迎转载,且请注明出处,在文章页面明显位置给出原文连接. 原文链接:http://www.cnblogs.com/zdfjf/p/5175566.html 我们知道有eclipse的Hadoop插件, ...

  4. Spark源码编译并在YARN上运行WordCount实例

    在学习一门新语言时,想必我们都是"Hello World"程序开始,类似地,分布式计算框架的一个典型实例就是WordCount程序,接触过Hadoop的人肯定都知道用MapRedu ...

  5. mac上eclipse上运行word count

    1.打开eclipse之后,建立wordcount项目 package wordcount; import java.io.IOException; import java.util.StringTo ...

  6. 解决在windows的eclipse上面运行WordCount程序出现的一系列问题详解

    一.简介 要在Windows下的 Eclipse上调试Hadoop2代码,所以我们在windows下的Eclipse配置hadoop-eclipse-plugin- 2.6.0.jar插件,并在运行H ...

  7. 在Hadoop 2.3上运行C++程序各种疑难杂症(Hadoop Pipes选择、错误集锦、Hadoop2.3编译等)

    首记 感觉Hadoop是一个坑,打着大数据最佳解决方案的旗帜到处坑害良民.记得以前看过一篇文章,说1TB以下的数据就不要用Hadoop了,体现不 出太大的优势,有时候反而会成为累赘.因此Hadoop的 ...

  8. hadoop 把mapreduce任务从本地提交到hadoop集群上运行

    MapReduce任务有三种运行方式: 1.windows(linux)本地调试运行,需要本地hadoop环境支持 2.本地编译成jar包,手动发送到hadoop集群上用hadoop jar或者yar ...

  9. Hadoop在window上运行 user=Administrator, access=WRITE, inode="hadoop"

    win7下eclipse中错误的详细描述如下: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.securit ...

随机推荐

  1. Redis自定义fastJson Serializer

    public class FastJsonRedisSerializer<T> implements RedisSerializer<T> { public static fi ...

  2. rabbitmq - 消息接收,解析xml格式数据时异常:ERROR not well-formed (invalid token): line 4, column 46

    ERROR alsv odoo.addons.cus_alsv.utils.alsv_about_mq.get_data_from_mq: parse_xml_data_from_mq: not we ...

  3. echo * 和ls *之间的区别?

    背景描述: 今天 一同事做入职考试,涉及到1题目,echo * 和ls *之间的区别,没有用过这个用法,再次记录下. 操作过程: 1.执行echo * [root@localhost ~]# echo ...

  4. Linux服务器连接不上的几种解决办法

    Linux远程服务器连接不上,或连接超时解决办法:1.测试网络是否通:    ping 远程IP 2.如果能ping通则表示与服务器网络连接是正常,接下来测试端口:telnet 远程ip 端口 3.如 ...

  5. 史上最全的 DB2 错误代码大全

    1 前言 作为一个程序员,数据库是我们必须掌握的知识,经常操作数据库不可避免,but,在写 SQL 语句的时候,难免遇到各种问题.例如,当我们看着数据库报出的一大堆错误时,是否有种两眼发蒙的感觉呢?咳 ...

  6. CefSharp 提示 flash player is out of date 运行此插件 等问题解决办法

    CefSharp 提示 flash player is out of date 或者 需要手动右键点 运行此插件 脚本 等问题解决办法 因为中国版FlashPlayer变得Ad模式之后,只好用旧版本的 ...

  7. 搭建SpringCloud微服务

    建立spring父模块 删除不必要的src目录 父模块中的pom.xml中添加相应的依赖以及插件.远程仓库地址 <!-- 项目的打包类型, 即项目的发布形式, 默认为 jar. 对于聚合项目的父 ...

  8. Python3.7安装(解决ssl问题)

    摘自:https://blog.csdn.net/love_cjiajia/article/details/82254371 python3.7安装(解决ssl的问题) 1) 安装准备 yum -y ...

  9. [LeetCode] 150. Evaluate Reverse Polish Notation 计算逆波兰表达式

    Evaluate the value of an arithmetic expression in Reverse Polish Notation. Valid operators are +, -, ...

  10. django:下拉框二级联动实现

    注意:只列举核心部分代码 前台模板: 第一级下拉菜单: <div class="col-sm-4"> <select data-placeholder=" ...