单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.html
补充:建议添加如下环境变量
#hadoop configuration
export PATH=$PATH:/home/jediael/hadoop-2.4.1/bin:/home/jediael/hadoop-2.4.1/sbin
export HADOOP_HOME=/home/jediael/hadoop-2.4.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Purpose
This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Prerequisites
Supported Platforms
- GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
- Windows is also a supported platform but the followings steps are for Linux only. To set up Hadoop on Windows, see wiki page.
Required Software
Required software for Linux include:
- Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
- ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
Installing Software
If your cluster doesn't have the requisite software you will need to install it.
For example on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Download
To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.
Prepare to Start the Hadoop Cluster
Unpack the downloaded Hadoop distribution. In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest # Assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
第二步不做好像没影响。
Try the following command:
$ bin/hadoop
This will display the usage documentation for the hadoop script.
Now you are ready to start your Hadoop cluster in one of the three supported modes:
Standalone Operation
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
Pseudo-Distributed Operation
Hadoop can also be run on a single-node in a pseudo-distributed mode where each Hadoop daemon runs in a separate Java process.
Configuration
Use the following:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Setup passphraseless ssh
Now check that you can ssh to the localhost without a passphrase:
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Execution
The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node.
- Format the filesystem:
$ bin/hdfs namenode -format
- Start NameNode daemon and DataNode daemon:此步骤报很多警告,但不影响执行结果。
$ sbin/start-dfs.sh
The hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
- Browse the web interface for the NameNode; by default it is available at:
- NameNode - http://localhost:50070/
- Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username> - Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
- Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
- Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hdfs dfs -get output output
$ cat output/*or
View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat output/*
- When you're done, stop the daemons with:
$ sbin/stop-dfs.sh
YARN on Single Node
You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.
The following instructions assume that 1. ~ 4. steps of the above instructions are already executed.
- Configure parameters as follows:
etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration> - Start ResourceManager daemon and NodeManager daemon:
$ sbin/start-yarn.sh
- Browse the web interface for the ResourceManager; by default it is available at:
- ResourceManager - http://localhost:8088/
- Run a MapReduce job.
- When you're done, stop the daemons with:
$ sbin/stop-yarn.sh
单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏的更多相关文章
- 单机/伪分布式Hadoop2.4.1安装文档
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.h ...
- opnet安装及安装中出现问题的解决办法 分类: opnet 2014-04-06 21:50 397人阅读 评论(0) 收藏
我使用的opnet14.5 win7 64位系统的http://pan.baidu.com/s/1qWyfxnu,电脑先刷了win7 64位原版系统. 选择了VS2013+opnet14.5的安装方 ...
- CocoaPods安装和使用教程 分类: ios技术 ios相关 2015-03-11 21:53 48人阅读 评论(0) 收藏
目录 CocoaPods是什么? 如何下载和安装CocoaPods? 如何使用CocoaPods? 场景1:利用CocoaPods,在项目中导入AFNetworking类库 场景2:如何正确编译运行一 ...
- VS2010下安装Opencv 分类: Opencv 2014-11-02 13:51 778人阅读 评论(0) 收藏
Opencv作为一种跨平台计算机视觉库,在图像处理领域得到广泛的应用,下面介绍如何在VS2010中安装配置Opencv 一.下载Opencv 下载网址:http://sourceforge.net/p ...
- JAVA代码规范 标签: java文档工作 2016-06-12 21:50 277人阅读 评论(5) 收藏
开始做java的ITOO了,近期的工作内容就是按照代码规范来改自己负责的代码,之前做机房收费系统的时候,也是经常验收的,甚至于我们上次验收的时候,老师也去了.对于我们的代码规范,老师其实是很重视的,他 ...
- 安装hadoop2.6.0伪分布式环境 分类: A1_HADOOP 2015-04-27 18:59 409人阅读 评论(0) 收藏
集群环境搭建请见:http://blog.csdn.net/jediael_lu/article/details/45145767 一.环境准备 1.安装linux.jdk 2.下载hadoop2.6 ...
- 安装spark1.3.1单机环境 分类: B8_SPARK 2015-04-27 14:52 1873人阅读 评论(0) 收藏
本文介绍安装spark单机环境的方法,可用于测试及开发.主要分成以下4部分: (1)环境准备 (2)安装scala (3)安装spark (4)验证安装情况 1.环境准备 (1)配套软件版本要求:Sp ...
- 搭建hadoop2.6.0集群环境 分类: A1_HADOOP 2015-04-20 07:21 459人阅读 评论(0) 收藏
一.规划 (一)硬件资源 10.171.29.191 master 10.171.94.155 slave1 10.251.0.197 slave3 (二)基本资料 用户: jediael 目录: ...
- Hadoop1.2.1伪分布模式安装指南 分类: A1_HADOOP 2014-08-17 10:52 1346人阅读 评论(0) 收藏
一.前置条件 1.操作系统准备 (1)Linux可以用作开发平台及产品平台. (2)win32只可用作开发平台,且需要cygwin的支持. 2.安装jdk 1.6或以上 3.安装ssh,并配置免密码登 ...
随机推荐
- Linux网络编程--字节序
1 .谈到字节序,那么会有朋友问什么是字节序 非常easy:[比如一个16位的整数.由2个字节组成,8位为一字节,有的系统会将高字节放在内存低的地址上,有的则将低字节放在内存高的地址上,所以存在字节序 ...
- 【Android Studio探索之路系列】之六:Android Studio加入依赖
作者:郭孝星 微博:郭孝星的新浪微博 邮箱:allenwells@163.com 博客:http://blog.csdn.net/allenwells github:https://github.co ...
- Android中关于JNI 的学习(零)简单的样例,简单地入门
Android中JNI的作用,就是让Java可以去调用由C/C++实现的代码,为了实现这个功能.须要用到Anrdoid提供的NDK工具包,在这里不讲怎样配置了,好麻烦,配置了好久. . . 本质上,J ...
- scroolspy滚动监听插件
<nav id="nav" class="navbar navbar-default"> <a href="#" clas ...
- 笔记四:onsubmit和onclick的区别
今天碰到关于表单提交的问题,我是用submit还是用onclick好呢,然后我去百度了一下两者的区别: onsubmit只能表单上使用,提交表单前会触发, onclick是按钮等控件使用, 用来触发点 ...
- Stack switching mechanism in a computer system
A method and mechanism for performing an unconditional stack switch in a processor. A processor incl ...
- 102.tcp实现多线程连接与群聊
协议之间的关系 socket在哪 socket是什么 Socket是应用层与TCP/IP协议族通信的中间软件抽象层,它是一组接口.在设计模式中,Socket其实就是一个门面模式,它把复杂的TCP/IP ...
- Appium_Java_API
1. driver.findElement(MobileBy.AndroidUIAutomator("邀请")).click();2. driver.findElementById ...
- 在linux环境下增加别名
编辑.cshrc文件:gvim ~/.cshrc 增加要添加的别名,例如:alias la 'ls -a' qw保存退出 source ~/.cshrc即可生效
- [Angular] Protect The Session Id with https and http only
For the whole signup process. we need to Hash the password to create a password digest Store the use ...