Install Hadoop 2.2.0 on Ubuntu Linux 13.04 (Single-Node Cluster)

This tutorial explains how to install Hadoop 2.2.0/2.3.0/2.4.0/2.4.1 on Ubuntu 13.04/13.10/14.04 (Single-Node Cluster). This setup does not require an additional user for
Hadoop. All files related to Hadoop will be stored inside the ~/hadoop directory.

  • Install a JRE. If you want the Oracle JRE, follow this post.
  • Install SSH:
        sudo apt-get install openssh-server
    Generate an SSH key:
        ssh-keygen -t rsa -P ""
    Enable the SSH key:
        cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
    (Optional) Disable SSH login from remote addresses by setting in /etc/ssh/sshd_config:
        ListenAddress 127.0.0.1
    Test the local connection:
        ssh localhost
    If it works, exit:
        exit
    Otherwise, debug the connection.
  • Download Hadoop 2.2.0 (or newer versions)
  • Unpack, rename and move to the home directory:
        tar xvf hadoop-2.2.0.tar.gz
        mv hadoop-2.2.0 ~/hadoop
  • Create HDFS directory:
        mkdir -p ~/hadoop/data/namenode
        mkdir -p ~/hadoop/data/datanode
  • In file ~/hadoop/etc/hadoop/hadoop-env.sh insert (after the comment "The java implementation to use."):
        export JAVA_HOME="`dirname $(readlink /etc/alternatives/java)`/../"
        export HADOOP_COMMON_LIB_NATIVE_DIR="$HOME/hadoop/lib"
        export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HOME/hadoop/lib"
    ($HOME is used rather than ~ because the shell does not expand ~ inside double quotes.)
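  • In file ~/hadoop/etc/hadoop/core-site.xml (inside <configuration> tag):
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
        </property>
    (In Hadoop 2.x fs.default.name is a deprecated alias of fs.defaultFS. It still works, but logs a deprecation warning.)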
  • In file ~/hadoop/etc/hadoop/hdfs-site.xml (inside <configuration> tag):
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>${user.home}/hadoop/data/namenode</value>
        </property>
        <property>
          <name>dfs.datanode.data.dir</name>
          <value>${user.home}/hadoop/data/datanode</value>
        </property>
  • In file ~/hadoop/etc/hadoop/yarn-site.xml (inside <configuration> tag):
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
        <property>
          <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
    (The class property name embeds the service name, so it uses mapreduce_shuffle to match the first property.)
  • Create file ~/hadoop/etc/hadoop/mapred-site.xml:
        cp ~/hadoop/etc/hadoop/mapred-site.xml.template ~/hadoop/etc/hadoop/mapred-site.xml
    And insert (inside <configuration> tag):
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
  • Add Hadoop binaries to PATH:
        echo 'export PATH="$PATH:$HOME/hadoop/bin:$HOME/hadoop/sbin"' >> ~/.bashrc
        source ~/.bashrc
    (The single quotes keep $PATH and $HOME from being expanded until ~/.bashrc is read.)
  • Format HDFS:
        hdfs namenode -format
  • Start Hadoop:
        start-dfs.sh && start-yarn.sh
    If you get the warning:

        WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

    it is because you are running a 64-bit system but the bundled Hadoop native library is 32-bit. This is not a serious problem: Hadoop falls back to its built-in Java classes, as the message says. If you want to fix it (optional), check this.

  • Check status:
        jps
    Expected output (PIDs may change!):
        10969 DataNode
        11745 NodeManager
        11292 SecondaryNameNode
        10708 NameNode
        11483 ResourceManager
        13096 Jps
    N.B. The old JobTracker has been replaced by the ResourceManager.
  • Access web interfaces:
    • Cluster status: http://localhost:8088
    • HDFS status: http://localhost:50070
    • Secondary NameNode status: http://localhost:50090
  • Test Hadoop:
        hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -write -nrFiles 20 -fileSize 10
    Check the results and remove files:
        hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -clean
    And:
        hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
  • Stop Hadoop:
        stop-dfs.sh && stop-yarn.sh
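
The "Check status" step can also be scripted. The following is a minimal sketch, assuming jps prints one "<pid> <name>" pair per line as in the expected output listed in that step. The names missing_daemons and EXPECTED_DAEMONS are hypothetical helpers introduced here, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected single-node daemons are absent
# from jps output. missing_daemons and EXPECTED_DAEMONS are hypothetical
# names used for illustration only.

EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Reads jps-style output ("<pid> <name>" per line) on stdin and prints
# every expected daemon that does not appear, one per line.
missing_daemons() {
    local running daemon
    # Keep only the second column (the process name).
    running="$(awk '{print $2}')"
    for daemon in $EXPECTED_DAEMONS; do
        # -x matches whole lines, so "NameNode" does not match
        # "SecondaryNameNode".
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Usage (jps ships with the JDK, so this assumes a JDK is installed):
#   jps | missing_daemons
```

Empty output means all five daemons are running. Any name printed is a daemon whose log under ~/hadoop/logs is worth checking.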

Some of these steps are taken from this tutorial.
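
The configuration steps in the list each add <property> elements inside a <configuration> tag. As a rough sketch of how that could be automated, the function below renders a complete site file from "name=value" lines. render_site_xml is a hypothetical helper, not a Hadoop tool. It assumes values need no XML escaping and that replacing the whole file (including the stock header comments) is acceptable.

```shell
#!/usr/bin/env bash
# Sketch: render a Hadoop *-site.xml document from "name=value" lines
# read on stdin. render_site_xml is a hypothetical helper name.
# Assumption: names and values contain no characters that need XML
# escaping (such as & or <).

render_site_xml() {
    local name value
    printf '<?xml version="1.0"?>\n<configuration>\n'
    # Split each line at the first "="; the rest of the line is the value.
    while IFS='=' read -r name value; do
        [ -n "$name" ] || continue   # skip blank lines
        printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
            "$name" "$value"
    done
    printf '</configuration>\n'
}

# Usage (overwrites the target file, so back it up first):
#   printf 'dfs.replication=1\n' | render_site_xml > ~/hadoop/etc/hadoop/hdfs-site.xml
```

Editing the files by hand, as the steps describe, keeps the stock comments intact. Generating them is mainly useful when the same setup has to be repeated.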
