Detailed Steps to Set Up Hadoop on Linux
Prerequisites: a virtual machine with Linux installed, working network connectivity between the host and the VM,
and the JDK installed on Linux.
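You can confirm the JDK is in place with:
java -version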
1: On Linux, run vi /etc/profile and add HADOOP_HOME:
export JAVA_HOME=/home/hadoop/export/jdk
export HADOOP_HOME=/home/hadoop/export/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
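After saving, reload the profile so the variables take effect in the current shell, then verify. (The paths above are this machine's install locations; adjust them to your own.)
source /etc/profile
echo $HADOOP_HOME
hadoop version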
2: Edit line 9 of hadoop-env.sh under the hadoop/conf directory:
export JAVA_HOME=/home/hadoop/export/jdk
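To double-check the edit, you can grep for the line (path assumed from the HADOOP_HOME set above):
grep -n JAVA_HOME /home/hadoop/export/hadoop/conf/hadoop-env.sh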
3: Edit core-site.xml under the hadoop/conf directory:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/.../tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
</configuration>
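Here hadoop.tmp.dir is the base directory for HDFS data and fs.default.name is the NameNode address (the Hadoop 1.x property names). Make sure the directory exists before formatting; judging from the log output below, /home/hadoop/tmp is the path used on this machine:
mkdir -p /home/hadoop/tmp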
4: Edit hdfs-site.xml under the hadoop/conf directory:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
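dfs.replication is set to 1 because this is a single-node setup; the default of 3 would leave every block under-replicated on one machine.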
5: Edit mapred-site.xml under the hadoop/conf directory:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>127.0.0.1:9001</value>
</property>
</configuration>
That completes the configuration changes.
Go to hadoop/bin and run hadoop namenode -format.
Output like the following indicates success:
Warning: $HADOOP_HOME is deprecated.
14/07/15 16:06:27 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_55
************************************************************/
14/07/15 16:07:09 INFO util.GSet: Computing capacity for map BlocksMap
14/07/15 16:07:09 INFO util.GSet: VM type = 32-bit
14/07/15 16:07:09 INFO util.GSet: 2.0% max memory = 1013645312
14/07/15 16:07:09 INFO util.GSet: capacity = 2^22 = 4194304 entries
14/07/15 16:07:09 INFO util.GSet: recommended=4194304, actual=4194304
14/07/15 16:07:10 INFO namenode.FSNamesystem: fsOwner=hadoop
14/07/15 16:07:10 INFO namenode.FSNamesystem: supergroup=supergroup
14/07/15 16:07:10 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/07/15 16:07:10 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/07/15 16:07:10 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/07/15 16:07:10 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/07/15 16:07:10 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/07/15 16:07:10 INFO common.Storage: Image file /home/hadoop/tmp/dfs/name/current/fsimage of size 118 bytes saved in 0 seconds.
14/07/15 16:07:10 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
14/07/15 16:07:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Some people will hit failures at this step. If you do, be sure to check the logs directory under hadoop; the exceptions recorded there are very detailed.
If the first attempt fails, remember to delete everything under tmp before retrying, because leftover data can cause incompatibilities.
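A minimal recovery sequence (assuming hadoop.tmp.dir is /home/hadoop/tmp, as the format output above shows; stop any running daemons first):
stop-all.sh
rm -rf /home/hadoop/tmp/*
hadoop namenode -format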
Then run start-all.sh:
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-namenode-ubuntu.out
localhost: starting datanode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-secondarynamenode-ubuntu.out
starting jobtracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-ubuntu.out
During the step above you may be prompted for a password; you can set up passwordless SSH login to avoid this (covered in another post on my blog).
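Passwordless SSH to localhost is typically set up like this (a sketch; adjust the key type and paths as needed):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost   # should now log in without a password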
Run jps; you will see something like the following (the DataNode is missing; I deliberately introduced an error here):
10666 NameNode
11547 Jps
11445 TaskTracker
11130 SecondaryNameNode
11218 JobTracker
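(A healthy setup would also show a DataNode process. Once everything is up, you can additionally sanity-check through the web UIs: the NameNode at http://localhost:50070 and the JobTracker at http://localhost:50030, the Hadoop 1.x defaults.)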
Check the logs:
2014-07-15 16:13:43,032 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-07-15 16:13:43,094 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2014-07-15 16:13:43,098 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-07-15 16:13:43,118 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2014-07-15 16:13:43,999 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2014-07-15 16:13:44,044 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-15 16:13:45,484 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/hadoop/tmp/dfs/data: namenode namespaceID = 224603228; datanode namespaceID = 566757162
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
In this case, just delete the files under tmp and the problem is solved.
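Deleting tmp is fine on a fresh install, but it wipes HDFS. If you need to keep existing data, a commonly used alternative is to make the DataNode's namespaceID match the NameNode's by editing its VERSION file (path and ID taken from the error message above):
vi /home/hadoop/tmp/dfs/data/current/VERSION
# change the namespaceID line to: namespaceID=224603228
Then restart with start-all.sh.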
Then you can run an example. The concrete steps are as follows:
hadoop@ubuntu:~/export/hadoop$ ls
bin hadoop-ant-1.2.1.jar ivy README.txt
build.xml hadoop-client-1.2.1.jar ivy.xml sbin
c++ hadoop-core-1.2.1.jar lib share
CHANGES.txt hadoop-examples-1.2.1.jar libexec src
conf hadoop-minicluster-1.2.1.jar LICENSE.txt webapps
contrib hadoop-test-1.2.1.jar logs
docs hadoop-tools-1.2.1.jar NOTICE.txt
Upload a file to HDFS:
hadoop@ubuntu:~/export/hadoop$ hadoop fs -put README.txt /
Warning: $HADOOP_HOME is deprecated.
No errors above means the upload succeeded.
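You can verify the upload with a quick listing and by streaming the file back:
hadoop fs -ls /README.txt
hadoop fs -cat /README.txt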
Run the wordcount example program (it processes the README.txt file):
hadoop@ubuntu:~/export/hadoop$ hadoop jar hadoop-examples-1.2.1.jar wordcount /README.txt /wordcountoutput
Warning: $HADOOP_HOME is deprecated.
14/07/15 15:23:01 INFO input.FileInputFormat: Total input paths to process : 1
14/07/15 15:23:01 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/07/15 15:23:01 WARN snappy.LoadSnappy: Snappy native library not loaded
14/07/15 15:23:02 INFO mapred.JobClient: Running job: job_201407141636_0001
14/07/15 15:23:03 INFO mapred.JobClient: map 0% reduce 0%
14/07/15 15:23:15 INFO mapred.JobClient: map 100% reduce 0%
14/07/15 15:23:30 INFO mapred.JobClient: map 100% reduce 100%
14/07/15 15:23:32 INFO mapred.JobClient: Job complete: job_201407141636_0001
14/07/15 15:23:32 INFO mapred.JobClient: Counters: 29
14/07/15 15:23:32 INFO mapred.JobClient: Job Counters
14/07/15 15:23:32 INFO mapred.JobClient: Launched reduce tasks=1
14/07/15 15:23:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=12563
14/07/15 15:23:32 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient: Launched map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient: Data-local map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=14550
14/07/15 15:23:32 INFO mapred.JobClient: File Output Format Counters
14/07/15 15:23:32 INFO mapred.JobClient: Bytes Written=1306
14/07/15 15:23:32 INFO mapred.JobClient: FileSystemCounters
14/07/15 15:23:32 INFO mapred.JobClient: FILE_BYTES_READ=1836
14/07/15 15:23:32 INFO mapred.JobClient: HDFS_BYTES_READ=1463
14/07/15 15:23:32 INFO mapred.JobClient: FILE_BYTES_WRITTEN=120839
14/07/15 15:23:32 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1306
14/07/15 15:23:32 INFO mapred.JobClient: File Input Format Counters
14/07/15 15:23:32 INFO mapred.JobClient: Bytes Read=1366
14/07/15 15:23:32 INFO mapred.JobClient: Map-Reduce Framework
14/07/15 15:23:32 INFO mapred.JobClient: Map output materialized bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient: Map input records=31
14/07/15 15:23:32 INFO mapred.JobClient: Reduce shuffle bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient: Spilled Records=262
14/07/15 15:23:32 INFO mapred.JobClient: Map output bytes=2055
14/07/15 15:23:32 INFO mapred.JobClient: Total committed heap usage (bytes)=212611072
14/07/15 15:23:32 INFO mapred.JobClient: CPU time spent (ms)=2430
14/07/15 15:23:32 INFO mapred.JobClient: Combine input records=179
14/07/15 15:23:32 INFO mapred.JobClient: SPLIT_RAW_BYTES=97
14/07/15 15:23:32 INFO mapred.JobClient: Reduce input records=131
14/07/15 15:23:32 INFO mapred.JobClient: Reduce input groups=131
14/07/15 15:23:32 INFO mapred.JobClient: Combine output records=131
14/07/15 15:23:32 INFO mapred.JobClient: Physical memory (bytes) snapshot=177545216
14/07/15 15:23:32 INFO mapred.JobClient: Reduce output records=131
14/07/15 15:23:32 INFO mapred.JobClient: Virtual memory (bytes) snapshot=695681024
14/07/15 15:23:32 INFO mapred.JobClient: Map output records=179
hadoop@ubuntu:~/export/hadoop$ hadoop fs -ls /
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r-- 1 hadoop supergroup 1366 2014-07-15 15:21 /README.txt
drwxr-xr-x - hadoop supergroup 0 2014-07-14 16:36 /home
drwxr-xr-x - hadoop supergroup 0 2014-07-15 15:23 /wordcountoutput
hadoop@ubuntu:~/export/hadoop$ hadoop fs -get /wordcountoutput /home/hadoop/
Warning: $HADOOP_HOME is deprecated.
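You can also view the result directly in HDFS without downloading it (part-r-00000 is the usual output file name for the wordcount example with a single reducer):
hadoop fs -cat /wordcountoutput/part-r-00000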
You can open the downloaded output and take a look.
It contains lines like these:
(see 1
5D002.C.1, 1
740.13) 1
<http://www.wassenaar.org/> 1
Administration 1
Apache 1
BEFORE 1
BIS 1
Bureau 1
Commerce, 1
Commodity 1
Control 1
Core 1
Department 1
ENC 1
Exception 1
Export 2
For 1
Foundation 1
Government 1
Hadoop 1
Hadoop, 1
Industry 1
Jetty 1
License 1
Number 1
Regulations, 1
SSL 1
Section 1
Security 1
See 1
Software 2
Technology 1
The 4
This 1
U.S. 1
Unrestricted 1
about 1
algorithms. 1
and 6
and/or 1
another 1
any 1
as 1
asymmetric 1
at: 2
both 1
by 1
check 1
classified 1
code 1
code. 1
concerning 1
country 1
country's 1
country, 1
cryptographic 3
currently 1
details 1
distribution 2
eligible 1
encryption 3
exception 1
export 1
following 1
for 3
form 1
from 1
functions 1
has 1
have 1