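In 《Docker中搭建Hadoop-2.6单机伪分布式集群》 (Building a single-node pseudo-distributed Hadoop 2.6 cluster in Docker), the pseudo-distributed Hadoop cluster was set up by hand inside a container. This post does the same work mainly through Dockerfiles.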

1. Get a simple Docker OS image to use as the base

  I chose to download the CentOS image:

    docker pull centos

  Use docker tag to rename the downloaded CentOS image to centos, then delete the old tag:

    docker tag docker.io/centos centos
    docker rmi docker.io/centos

2. Install and configure the JDK

  Download the required JDK from the Oracle website in advance.

  Create a folder and move the JDK archive into it:

    [root@centos-docker ~]# mkdir centos-jdk
    [root@centos-docker ~]# mv jdk-7u79-linux-x64.tar.gz ./centos-jdk/
    [root@centos-docker ~]# cd centos-jdk/
    [root@centos-docker centos-jdk]# ls
    jdk-7u79-linux-x64.tar.gz

  Create a Dockerfile in the centos-jdk folder with the following content:

    # CentOS with JDK
    # Author amei

    # build a new image on top of the basic centos image
    FROM centos
    # who is the author
    MAINTAINER amei

    # make a new directory to store the jdk files
    RUN mkdir /usr/local/java

    # copy the jdk archive into the image; ADD unpacks a local tar archive automatically
    ADD jdk-7u79-linux-x64.tar.gz /usr/local/java/

    # make a symbolic link
    RUN ln -s /usr/local/java/jdk1.7.0_79 /usr/local/java/jdk

    # set environment variables
    ENV JAVA_HOME /usr/local/java/jdk
    ENV JRE_HOME ${JAVA_HOME}/jre
    ENV CLASSPATH .:${JAVA_HOME}/lib:${JRE_HOME}/lib
    ENV PATH ${JAVA_HOME}/bin:$PATH

  Build a new image from the Dockerfile:

    # do not forget the trailing "."
    [root@centos-docker centos-jdk]# docker build -t="centos-jdk" .
    Sending build context to Docker daemon 153.5 MB
    Step 1 : FROM centos
    ---> e8f1bdb3b6a7
    .....................................
    Step 9 : ENV PATH ${JAVA_HOME}/bin:$PATH
    ---> Running in 5ecbe2fac774
    ---> ad1110b84433
    Removing intermediate container 5ecbe2fac774
    Successfully built ad1110b84433

  List the newly built image:

    [root@centos-docker centos-jdk]# docker images
    REPOSITORY   TAG      IMAGE ID       CREATED       SIZE
    centos-jdk   latest   ad1110b84433   minutes ago   MB
    centos       latest   e8f1bdb3b6a7   weeks ago     196.7 MB

  Start a container and check that the JDK in the new image is set up correctly:

    [root@centos-docker centos-jdk]# docker run -it centos-jdk /bin/bash
    [root@b665dbff9965 /]# java -version    # this output shows the configuration is fine
    java version "1.7.0_79"
    Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
    [root@b665dbff9965 /]# echo $JAVA_HOME
    /usr/local/java/jdk
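
  The same check can also be run without opening an interactive shell. The following is only a sketch: the function name check_jdk_image is made up for this example, and it assumes the image is tagged centos-jdk as above. It starts two throwaway containers (--rm deletes each one when it exits), one to print the Java version and one to print JAVA_HOME.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): verify the JDK
# inside an image without keeping any container around.
check_jdk_image() {
    image=${1:-centos-jdk}   # image tag; defaults to the one built above

    # java -version writes its report to stderr; a non-zero exit means
    # java was not found on PATH inside the image.
    docker run --rm "$image" java -version || return 1

    # printenv shows the value that ENV JAVA_HOME set in the Dockerfile.
    docker run --rm "$image" printenv JAVA_HOME
}
```

  Run it on the Docker host as check_jdk_image, or pass another tag as the first argument.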

3. Install SSH on top of the previous image

  Create a new folder and create a Dockerfile in it with the following content:

    # build a new image on top of centos-jdk
    FROM centos-jdk
    # who is the author
    MAINTAINER amei

    # install openssh
    RUN yum -y install openssh-server openssh-clients

    # generate host key files
    RUN ssh-keygen -q -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
    RUN ssh-keygen -q -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ''
    RUN ssh-keygen -q -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ''

    # log in to localhost without a password
    RUN ssh-keygen -f /root/.ssh/id_rsa -N ''
    RUN touch /root/.ssh/authorized_keys
    RUN cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

    # set the root password
    RUN echo "root:1234" | chpasswd

    # open port 22
    EXPOSE 22
    # executed when a container starts
    CMD ["/usr/sbin/sshd","-D"]

  Build the image from this Dockerfile:

    [root@centos-docker centos-jdk-ssh]# docker build -t "centos-jdk-ssh" .
    Sending build context to Docker daemon 2.56 kB
    Step 1 : FROM centos-jdk
    ---> ad1110b84433
    ........................
    Successfully built 5286623a6cc0

  Verify the new image:

    # start a container from the image just built
    [root@centos-docker centos-jdk-ssh]# docker run -it centos-jdk-ssh /bin/bash
    [root@118f3d29fc73 /]# /usr/sbin/sshd      # start the sshd service
    [root@118f3d29fc73 /]# ssh root@localhost    # log in to the local machine
    The authenticity of host 'localhost (::1)' can't be established.    # note that no password is asked for
    ECDSA key fingerprint is b7:f0:::c9:ca::8b::0d:::6f::4f:.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
    [root@118f3d29fc73 ~]# exit    # leave the SSH session
    logout
    Connection to localhost closed.

4. Install Hadoop 2.6

  First download the Hadoop package.

  Create a folder and create the following files in it.

  Edit core-site.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/data/hadoop/tmp</value>
      </property>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

  Edit hdfs-site.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/hadoop/dfs/data</value>
      </property>
    </configuration>
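
  Whether Hadoop actually picks up these two files can be checked from inside a container that has Hadoop installed and on PATH. The sketch below uses hdfs getconf -confKey, which prints the effective value of one configuration key. The function name show_hdfs_conf is invented for this example, and the list of keys is simply the set of properties defined in the two files.

```shell
#!/bin/sh
# Hypothetical helper: print the effective value of every property that
# core-site.xml and hdfs-site.xml set. Run inside a container where the
# hdfs command is on PATH.
show_hdfs_conf() {
    for key in fs.defaultFS hadoop.tmp.dir dfs.replication \
               dfs.namenode.name.dir dfs.datanode.data.dir; do
        printf '%s = ' "$key"
        # Stop at the first key that Hadoop cannot resolve.
        hdfs getconf -confKey "$key" || return 1
    done
}
```

  If the files were copied to the right place, calling show_hdfs_conf should report the values written above, for example hdfs://localhost:9000 for fs.defaultFS.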

  Create a Dockerfile in the same folder with the following content:

    # build a new image on top of centos-jdk-ssh
    FROM centos-jdk-ssh
    # who is the author
    MAINTAINER amei

    # install some important software
    RUN yum -y install net-tools which

    # copy the hadoop archive into the image; ADD unpacks a local tar archive automatically
    ADD hadoop-2.6.0.tar.gz /usr/local/

    # make a symbolic link
    RUN ln -s /usr/local/hadoop-2.6.0 /usr/local/hadoop

    # copy the configuration files into the image
    COPY core-site.xml /usr/local/hadoop/etc/hadoop/
    COPY hdfs-site.xml /usr/local/hadoop/etc/hadoop/

    # hard-code JAVA_HOME in the hadoop environment script
    RUN sed -i "s?JAVA_HOME=\${JAVA_HOME}?JAVA_HOME=/usr/local/java/jdk?g" /usr/local/hadoop/etc/hadoop/hadoop-env.sh

    # set environment variables
    ENV HADOOP_HOME /usr/local/hadoop
    ENV PATH ${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
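
  The sed line in this Dockerfile is the least obvious step, so here is a small sketch of what it does, run on a single sample line instead of the real file. It assumes hadoop-env.sh contains the line export JAVA_HOME=${JAVA_HOME}. The probable reason for the rewrite is that start-dfs.sh launches the daemons over ssh, and those sessions do not inherit the ENV values from the Dockerfile, so the script needs an absolute path.

```shell
#!/bin/sh
# Stand-in for the line this sketch assumes is present in hadoop-env.sh.
line='export JAVA_HOME=${JAVA_HOME}'

# Same substitution as in the Dockerfile. '?' is used as the delimiter so
# the slashes in the replacement path need no escaping, and \$ keeps the
# shell from expanding ${JAVA_HOME} inside the double quotes.
result=$(printf '%s\n' "$line" | sed "s?JAVA_HOME=\${JAVA_HOME}?JAVA_HOME=/usr/local/java/jdk?g")

echo "$result"   # prints: export JAVA_HOME=/usr/local/java/jdk
```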

  The folder now contains these files:

    [root@centos-docker centos-hadoop]# ll
    total 190704
    -rw-r--r--. 1 root root       403 Aug 7 06:52 core-site.xml
    -rw-r--r--. 1 root root       708 Aug 7 06:52 Dockerfile
    -rwxr-x---. 1 root root 195257604 Aug 7 04:44 hadoop-2.6.0.tar.gz
    -rw-r--r--. 1 root root       546 Aug 7 06:25 hdfs-site.xml

  Build the image:

    docker build -t "centos-hadoop" .

  List the images:

    [root@centos-docker centos-hadoop]# docker images
    REPOSITORY       TAG      IMAGE ID       CREATED             SIZE
    centos-hadoop    latest   64b9d221973b   minutes ago         MB
    centos-jdk-ssh   latest   5286623a6cc0   About an hour ago   MB
    centos-jdk       latest   ad1110b84433   hours ago           MB

  Start a container to test the image:

    [root@centos-docker centos-hadoop]# docker run -it centos-hadoop /bin/bash    # start a container
    [root@889d94ef9cbc /]# /usr/sbin/sshd            # start the sshd service
    [root@889d94ef9cbc /]# hdfs namenode -format        # format the NameNode
    // :: INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG: host = 889d94ef9cbc/172.17.0.2
    STARTUP_MSG: args = [-format]
    STARTUP_MSG: version = 2.6.0
    ............................................................
    16/08/06 22:56:36 INFO common.Storage: Storage directory /data/hadoop/dfs/name has been successfully formatted.
    16/08/06 22:56:37 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    16/08/06 22:56:37 INFO util.ExitUtil: Exiting with status 0
    16/08/06 22:56:37 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at 889d94ef9cbc/172.17.0.2
    ************************************************************/
    [root@889d94ef9cbc /]# start-dfs.sh   # start HDFS
    [root@889d94ef9cbc /]# jps      # list the running Java processes
    SecondaryNameNode
    DataNode
    Jps
    NameNode
    [root@889d94ef9cbc /]# hadoop dfsadmin -report  # show the HDFS status
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Configured Capacity: (9.99 GB)
    Present Capacity: (9.08 GB)
    DFS Remaining: (9.08 GB)
    DFS Used: ( KB)
    DFS Used%: 0.00%
    Under replicated blocks:
    Blocks with corrupt replicas:
    Missing blocks:

    -------------------------------------------------
    Live datanodes ():

    Name: 127.0.0.1: (localhost)
    Hostname: 889d94ef9cbc
    Decommission Status : Normal
    Configured Capacity: (9.99 GB)
    DFS Used: ( KB)
    Non DFS Used: (933.54 MB)
    DFS Remaining: (9.08 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 90.87%
    Configured Cache Capacity: ( B)
    Cache Used: ( B)
    Cache Remaining: ( B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers:
    Last contact: Sat Aug :: UTC
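
  The report shows that the DataNode has registered, but a quick write-and-read round trip is a more direct test. The following is a sketch under a few assumptions: it runs inside the container after start-dfs.sh, the function name hdfs_smoke_test is invented here, and /tmp/smoke-test is an arbitrary HDFS path chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper: write a small file to HDFS, read it back, then
# remove it again. Returns non-zero if any of the three steps fails.
hdfs_smoke_test() {
    dir=/tmp/smoke-test            # arbitrary HDFS directory for this sketch
    local_file=$(mktemp)           # temporary file on the local filesystem
    echo "hello hdfs" > "$local_file"

    hdfs dfs -mkdir -p "$dir" &&
    hdfs dfs -put -f "$local_file" "$dir/hello.txt" &&
    hdfs dfs -cat "$dir/hello.txt"
    status=$?

    # Clean up on both sides regardless of the outcome.
    hdfs dfs -rm -r -f "$dir" >/dev/null 2>&1
    rm -f "$local_file"
    return $status
}
```

  Calling hdfs_smoke_test should print the line hello hdfs when the NameNode and DataNode are both working.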

5. Combine the previous steps into a single Dockerfile

  Create a new folder that contains all the resources needed to build the image:

    [root@centos-docker centos-hadoop]# ll
    total
    -rw-r--r--. root root Aug : core-site.xml
    -rw-r--r--. root root Aug : Dockerfile
    -rwxr-x---. root root Aug : hadoop-2.6.0.tar.gz
    -rw-r--r--. root root Aug : hdfs-site.xml
    -rwxr-x---. root root Aug : jdk-7u79-linux-x64.tar.gz

  The Dockerfile contains:

    # build a new hadoop image on top of the basic centos image
    FROM centos
    # who is the author
    MAINTAINER amei

    # install some important software
    RUN yum -y install openssh-server openssh-clients net-tools which

    #################### Configure JDK ################################
    # make a new directory to store the jdk files
    RUN mkdir /usr/local/java

    # copy the jdk archive into the image; ADD unpacks a local tar archive automatically
    ADD jdk-7u79-linux-x64.tar.gz /usr/local/java/

    # make a symbolic link
    RUN ln -s /usr/local/java/jdk1.7.0_79 /usr/local/java/jdk

    ################### Configure SSH #################################
    # generate host key files
    RUN ssh-keygen -q -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
    RUN ssh-keygen -q -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ''
    RUN ssh-keygen -q -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ''

    # log in to localhost without a password
    RUN ssh-keygen -f /root/.ssh/id_rsa -N ''
    RUN cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

    ################### Configure Hadoop ##############################
    # copy the hadoop archive into the image; ADD unpacks a local tar archive automatically
    ADD hadoop-2.6.0.tar.gz /usr/local/

    # make a symbolic link
    RUN ln -s /usr/local/hadoop-2.6.0 /usr/local/hadoop

    # copy the configuration files into the image
    COPY core-site.xml /usr/local/hadoop/etc/hadoop/
    COPY hdfs-site.xml /usr/local/hadoop/etc/hadoop/

    # hard-code JAVA_HOME in the hadoop environment script
    RUN sed -i "s?JAVA_HOME=\${JAVA_HOME}?JAVA_HOME=/usr/local/java/jdk?g" /usr/local/hadoop/etc/hadoop/hadoop-env.sh

    ################### Integration configuration #######################
    # set environment variables
    ENV JAVA_HOME /usr/local/java/jdk
    ENV JRE_HOME ${JAVA_HOME}/jre
    ENV CLASSPATH .:${JAVA_HOME}/lib:${JRE_HOME}/lib
    ENV HADOOP_HOME /usr/local/hadoop
    ENV PATH ${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${JAVA_HOME}/bin:$PATH

    # set the root password
    RUN echo "root:1234" | chpasswd
    # executed when a container starts; -D keeps sshd in the foreground so the container stays up
    CMD ["/usr/sbin/sshd","-D"]

  Build the Hadoop image from this Dockerfile:

    docker build -t "centos-hadoop" .

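6. Afterword

  The Dockerfile, the JDK and Hadoop archives, and the other configuration files are all packed together and available on Baidu Cloud. After unpacking, you can run  docker build -t "centos-hadoop" .  directly in that directory to build the Hadoop image, provided you already have a centos image.

  http://pan.baidu.com/s/1dE8NCo5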
