hadoop namenode格式化问题汇总

(持续更新)

0 Hadoop集群环境

3台rhel6.4,2个namenode+2个zkfc, 3个journalnode+zookeeper-server 组成一个最简单的HA集群方案。

1) hdfs-site.xml配置如下:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Quorum Journal Manager HA:
  http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
-->
<configuration>
    <!-- Quorum Journal Manager HA -->
    <property>
        <name>dfs.nameservices</name>
        <value>hacl</value>
        <description>unique identifiers for each NameNode in the nameservice.</description>
    </property>

    <property>
        <name>dfs.ha.namenodes.hacl</name>
        <value>hn1,hn2</value>
        <description>Configure with a list of comma-separated NameNode IDs.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
        <description>the URI which identifies the group of JNs where the NameNodes will write or read edits.</description>
    </property>

    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/hacl/data/dfs/jn</value>
        <description>the path where the JournalNode daemon will store its local state.</description>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.hacl</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>the Java class that HDFS clients use to contact the Active NameNode.</description>
    </property>

    <!-- Automatic failover adds two new components to an HDFS deployment:
        - a ZooKeeper quorum;
        - the ZKFailoverController process (abbreviated as ZKFC).
        Configuring automatic failover:
    -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
        <description>The sshfence option SSHes to the target node and uses fuser to kill the process
          listening on the service's TCP port. In order for this fencing option to work, it must be
          able to SSH to the target node without providing a passphrase. Thus, one must also configure the
          dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files.
             logon namenode machine:
             cd /var/lib/hadoop-hdfs
             su hdfs
             ssh-keygen -t dsa
        </description>
    </property>
    <!-- Optionally, one may configure a non-standard username or port to perform the SSH.
      One may also configure a timeout, in milliseconds, for the SSH, after which this
      fencing method will be considered to have failed. It may be configured like so:
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    //-->

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled.hacl</name>
        <value>true</value>
    </property>

    <!-- Configurations for NameNode: -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/hacl/data/dfs/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
    </property>

    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description></description>
    </property>

    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
    </property>

    <!-- Configurations for DataNode: -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hacl/data/dfs/dn</value>
        <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
    </property>
</configuration>

2) core-site.xml配置如下:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hacl</value>
    </property>

    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hdfs/data/tmp</value>
        <description>chown -R hdfs:hdfs hadoop_tmp_dir</description>
    </property>

    <!-- Configuring automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
        <description>This lists the host-port pairs running the ZooKeeper service.</description>
    </property>

    <!-- Securing access to ZooKeeper -->

</configuration>

1. namenode格式化过程如下:

1) 启动所有journalnode,必须3个节点的JN都正确启动。关闭所有的namenode:

# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop

2) namenode格式化。hacl-pepstack-com是我给集群起的名字,可以忽略。su - hdfs -c "..." 表示以hdfs用户格式化。

hdfs-site.xml和core-site.xml上指定的所有目录都必须赋予正确的权限:

# chown -R hdfs:hdfs /hacl/data/dfs

然后在任何一个namenode上格式化,比如在hn1上执行

########## hn1
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start   ##### hn1

首先必须把刚格式化好的hn1启动,然后在另一个namenode上(hn2)执行:

########## hn2
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start   ##### hn2

至此,2个namenode都格式化并且启动好了。

hadoop namenode格式化问题汇总的更多相关文章

  1. Hadoop源码:namenode格式化和启动过程实现

    body { margin: 0 auto; font: 13px / 1 Helvetica, Arial, sans-serif; color: rgba(68, 68, 68, 1); padd ...

  2. hdfs格式化hadoop namenode -format错误

    在对HDFS格式化,执行hadoop namenode -format命令时,出现未知的主机名的问题,异常信息如下所示: [shirdrn@localhost bin]$ hadoop namenod ...

  3. hadoop namenode多次格式化后,导致datanode启动不了

    jps hadoop namenode -format dfs directory : /home/hadoop/dfs --data --current/VERSION #Wed Jul :: CS ...

  4. Hadoop笔记——技术点汇总

    目录 · 概况 · Hadoop · 云计算 · 大数据 · 数据挖掘 · 手工搭建集群 · 引言 · 配置机器名 · 调整时间 · 创建用户 · 安装JDK · 配置文件 · 启动与测试 · Clo ...

  5. Hadoop namenode无法启动

    最近遇到了一个问题,执行start-all.sh的时候发现JPS一下namenode没有启动        每次开机都得重新格式化一下namenode才可以        其实问题就出在tmp文件,默 ...

  6. namenode无法启动(namenode格式化失败)

    格式化namenode root@node04 bin]# sudo -u hdfs hdfs namenode –format 16/11/14 10:56:51 INFO namenode.Nam ...

  7. Hadoop重新格式化HDFS的方法

    1.查看hdfs-site.xml: <property> <name>dfs.name.dir</name> <value>/home/hadoop/ ...

  8. Hadoop记录-Hadoop NameNode 高可用 (High Availability) 实现解析

    Hadoop NameNode 高可用 (High Availability) 实现解析   NameNode 高可用整体架构概述 在 Hadoop 1.0 时代,Hadoop 的两大核心组件 HDF ...

  9. 对hadoop namenode -format执行过程的探究

      引言 本文出于一个疑问:hadoop namenode -format到底在我的linux系统里面做了些什么? 步骤 第1个文件bin/hadoop Hadoop脚本位于hadoop根目录下的bi ...

随机推荐

  1. 我的第一本著作:Spark技术内幕上市!

    现在各大网站销售中! 京东:http://item.jd.com/11770787.html 当当:http://product.dangdang.com/23776595.html 亚马逊:http ...

  2. iOS 10.0之前和之后的Local Notification有神马不同

    在iOS 10.0之前apple还没有将通知功能单独拿出来自成一系.而从10.0开始原来的本地通知仍然可用,只是被标记为过时.于是乎我们可以使用10.0全新的通知功能.别急-让我们慢慢来,先从iOS ...

  3. Java异常处理-----程序中的异常处理.启蒙

    1.当除数是非0,除法运算完毕,程序继续执行. 2.当除数是0,程序发生异常,并且除法运算之后的代码停止运行.因为程序发生异常需要进行处理. class Demo { public static vo ...

  4. Redis 学习笔记3:Jedis 连接虚拟机下的Redis 服务

    Jedis 是 Redis 官方首选的 Java 客户端开发包. 虚拟机的IP地址是192.168.8.88. Jedis代码是放在windows上的,启动虚拟机上的Redis服务之后,用Jedis连 ...

  5. RxJava操作符(06-错误处理)

    转载请标明出处: http://blog.csdn.net/xmxkf/article/details/51658235 本文出自:[openXu的博客] 目录: Catch Retry 源码下载 1 ...

  6. 用Maven打包成EAR部署JBoss

    基于原理的架构里面,考虑这次升级版本,可谓是一步一个脚印的向上走啊,可以说步步为坎,别人的知识,和自己的知识,相差很多啊,什么都懂点,但是具体没有使用,就理解不深刻了,心有余而力不足,所以一切我们自己 ...

  7. C语言与java语言中数据类型的差别总结

    在学习java的时候,看到char ch =  '男' ; 我就觉得很奇怪,char类型不是占用一个字节吗?为什么定义成一个汉字被说成是一个字符了? 原来,在C语言中,char在32位操作系统下占用1 ...

  8. C库源码中的移位函数

    #include <stdio.h> /* _lrotr()将一个无符号长整形数左循环移位的函数 原形:unsigned long _lrotr(unsigned long value,i ...

  9. TCP连接建立系列 — 客户端接收SYNACK和发送ACK

    主要内容:客户端接收SYNACK.发送ACK,完成连接的建立. 内核版本:3.15.2 我的博客:http://blog.csdn.net/zhangskd 接收入口 tcp_v4_rcv |--&g ...

  10. 转义字符\(在hive+shell以及java中注意事项):正则表达式的转义字符为双斜线,split函数解析也是正则

    转义字符 将后边字符转义,使特殊功能字符作为普通字符处理,或者普通字符转化为特殊功能字符. 各个语言中都用应用,如java.python.sql.hive.shell等等. 如sql中 "\ ...