Hadoop 2.7.3 Fully Distributed Cluster Maintenance: Simple Tests
1. Test a MapReduce job
1.1 Upload a file to HDFS
$ jps
Jps
SecondaryNameNode
JobHistoryServer
NameNode
ResourceManager
$ jps > infile
$ hadoop fs -mkdir /inputdir
$ hadoop fs -put infile /inputdir
$ hadoop fs -ls /inputdir
Found items
-rw-r--r-- hduser supergroup -- : /inputdir/infile
1.2 Run the word count job
$ hadoop jar /usr/local/hadoop-2.7./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7..jar wordcount /inputdir /outputdir
// :: INFO client.RMProxy: Connecting to ResourceManager at /172.16.101.55:
// :: INFO input.FileInputFormat: Total input paths to process :
// :: INFO mapreduce.JobSubmitter: number of splits:
// :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504106569900_0001
// :: INFO impl.YarnClientImpl: Submitted application application_1504106569900_0001
// :: INFO mapreduce.Job: The url to track the job: http://sht-sgmhadoopnn-01:8088/proxy/application_1504106569900_0001/
// :: INFO mapreduce.Job: Running job: job_1504106569900_0001
// :: INFO mapreduce.Job: Job job_1504106569900_0001 running in uber mode : false
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: Job job_1504106569900_0001 completed successfully
// :: INFO mapreduce.Job: Counters:
File System Counters
FILE: Number of bytes read=
FILE: Number of bytes written=
FILE: Number of read operations=
FILE: Number of large read operations=
FILE: Number of write operations=
HDFS: Number of bytes read=
HDFS: Number of bytes written=
HDFS: Number of read operations=
HDFS: Number of large read operations=
HDFS: Number of write operations=
Job Counters
Launched map tasks=
Launched reduce tasks=
Data-local map tasks=
Total time spent by all maps in occupied slots (ms)=
Total time spent by all reduces in occupied slots (ms)=
Total time spent by all map tasks (ms)=
Total time spent by all reduce tasks (ms)=
Total vcore-milliseconds taken by all map tasks=
Total vcore-milliseconds taken by all reduce tasks=
Total megabyte-milliseconds taken by all map tasks=
Total megabyte-milliseconds taken by all reduce tasks=
Map-Reduce Framework
Map input records=
Map output records=
Map output bytes=
Map output materialized bytes=
Input split bytes=
Combine input records=
Combine output records=
Reduce input groups=
Reduce shuffle bytes=
Reduce input records=
Reduce output records=
Spilled Records=
Shuffled Maps =
Failed Shuffles=
Merged Map outputs=
GC time elapsed (ms)=
CPU time spent (ms)=
Physical memory (bytes) snapshot=
Virtual memory (bytes) snapshot=
Total committed heap usage (bytes)=
Shuffle Errors
BAD_ID=
CONNECTION=
IO_ERROR=
WRONG_LENGTH=
WRONG_MAP=
WRONG_REDUCE=
File Input Format Counters
Bytes Read=
File Output Format Counters
Bytes Written=
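To make clear what the wordcount example computes, here is a minimal local sketch in Python of the same map, group and reduce steps. It is not the code Hadoop runs: the function names (`map_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, with the process names borrowed from the `jps` listing in section 1.1.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_phase(pairs):
    """Group the pairs by word and sum the counts, as a reducer would."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    # Sort by key so the result resembles sorted reducer output
    # (plain string order, which is enough for ASCII input).
    return sorted(totals.items())


def word_count(lines):
    """Run the map and reduce steps in-process on a list of text lines."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    # Hypothetical input: process names only, one per line.
    sample = [
        "Jps",
        "SecondaryNameNode",
        "JobHistoryServer",
        "NameNode",
        "ResourceManager",
    ]
    for word, count in word_count(sample):
        print(f"{word}\t{count}")
```

Each distinct token ends up as one key with its total count, and keys come out in sorted order, which is why the job output is alphabetical rather than in the order `jps` printed the names.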
1.3 View the wordcount results
$ hadoop fs -ls /outputdir
Found items
-rw-r--r-- hduser supergroup -- : /outputdir/_SUCCESS
-rw-r--r-- hduser supergroup -- : /outputdir/part-r-
$ hadoop fs -cat /outputdir/part-r-
JobHistoryServer
Jps
NameNode
ResourceManager
SecondaryNameNode
2. Test HDFS distributed storage
2.1 Upload a test file
$ ls -lh hadoop-2.7..tar.gz
-rw-r--r-- root root 205M May : hadoop-2.7..tar.gz
$ hadoop fs -put hadoop-2.7..tar.gz /inputdir
$ hadoop fs -ls -h /inputdir
Found items
-rw-r--r-- hduser supergroup 204.2 M -- : /inputdir/hadoop-2.7..tar.gz
-rw-r--r-- hduser supergroup -- : /inputdir/infile
2.2 View DataNode replica information
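As a rough guide to what the replica view should show, the sketch below estimates how many blocks and block replicas a file of this size occupies. It assumes the stock Hadoop 2.7 defaults of `dfs.blocksize` = 128 MB and `dfs.replication` = 3. The source does not state this cluster's settings, so treat both values, and the helper name `block_layout`, as assumptions.

```python
MB = 1024 * 1024


def block_layout(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (blocks, last_block_bytes, total_replicas) for one file.

    The defaults mirror the stock dfs.blocksize and dfs.replication values;
    they are assumptions here, not settings read from the cluster.
    """
    if file_size_bytes <= 0:
        return 0, 0, 0
    # Integer ceiling division: every block is full except possibly the last.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    last_block = file_size_bytes - (blocks - 1) * block_size_bytes
    return blocks, last_block, blocks * replication


if __name__ == "__main__":
    # Approximate size of the uploaded tarball, taken from `hadoop fs -ls -h`.
    size = int(204.2 * MB)
    blocks, last_block, replicas = block_layout(size)
    print(f"blocks={blocks}, last block ~{last_block / MB:.1f} MB, replicas={replicas}")
```

Under those assumptions the tarball would split into two blocks, one full and one partial, with six block replicas spread across the DataNodes. The actual layout depends on the cluster's configuration and on how many DataNodes are live.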
