原文链接:http://blog.madhukaraphatak.com/secondary-namenode---what-it-really-do/

Secondary Namenode is one of the poorly named component in Hadoop. By its name, it gives a sense that its a backup for the Namenode.But in reality its not. Lot of beginners in Hadoop get confused about what exactly SecondaryNamenode does and why its present in HDFS.So in this blog post I try to explain the role of secondary namenode in HDFS.

By its name, you may assume that it has something to do with Namenode and you are right. So before we dig into Secondary Namenode lets see what exactly Namenode does.

Namenode

Namenode holds the meta data for the HDFS like Namespace information, block information etc. When in use, all this information is stored in main memory. But these information also stored in disk for persistence storage.

The above image shows how Name Node stores information in disk.
Two different files are

  1. fsimage - Its the snapshot of the filesystem when namenode started
  2. Edit logs - Its the sequence of changes made to the filesystem after namenode started

Only in the restart of namenode , edit logs are applied to fsimage to get the latest snapshot of the file system. But namenode restart are rare in production clusters which means edit logs can grow very large for the clusters where namenode runs for a long period of time. The following issues we will encounter in this situation.

  1. Editlog become very large , which will be challenging to manage it
  2. Namenode restart takes long time because lot of changes has to be merged
  3. In the case of crash, we will lost huge amount of metadata since fsimage is very old

So to overcome this issues we need a mechanism which will help us reduce the edit log size which is manageable and have up to date fsimage ,so that load on namenode reduces . It’s very similar to Windows Restore point, which will allow us to take snapshot of the OS so that if something goes wrong , we can fallback to the last restore point.

So now we understood NameNode functionality and challenges to keep the meta data up to date.So what is this all have to with Seconadary Namenode?

Secondary Namenode

Secondary Namenode helps to overcome the above issues by taking over responsibility of merging editlogs with fsimage from the namenode.

The above figure shows the working of Secondary Namenode

  1. It gets the edit logs from the namenode in regular intervals and applies to fsimage
  2. Once it has new fsimage, it copies back to namenode
  3. Namenode will use this fsimage for the next restart,which will reduce the startup time

Secondary Namenode whole purpose is to have a checkpoint in HDFS. Its just a helper node for namenode.That’s why it also known as checkpoint node inside the community.

So we now understood all Secondary Namenode does puts a checkpoint in filesystem which will help Namenode to function better. Its not the replacement or backup for the Namenode. So from now on make a habit of calling it as a checkpoint node.

Secondary Namenode - What it really do?的更多相关文章

  1. Secondary NameNode:的作用?

    前言 最近刚接触Hadoop, 一直没有弄明白NameNode和Secondary NameNode的区别和关系.很多人都认为,Secondary NameNode是NameNode的备份,是为了防止 ...

  2. Hadoop之Secondary NameNode

    NameNode存储文件系统的变化作为log追加在本地的一个文件里:这个文件是edits.当一个NameNode启动时,它从一个映像文件:FsImage,读取HDFS的状态,使用来自edits日志文件 ...

  3. 解读Secondary NameNode的功能

    1.概述 最近有朋友问我Secondary NameNode的作用,是不是NameNode的备份?是不是为了防止NameNode的单点问题?确实,刚接触Hadoop,从字面上看,很容易会把Second ...

  4. Secondary NameNode 的作用

    https://blog.csdn.net/xh16319/article/details/31375197 很多人都认为,Secondary NameNode是NameNode的备份,是为了防止Na ...

  5. (转)Secondary NameNode的作用

    在Hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备份.但它实际上却不是.很多Hadoop的初学者都很疑惑,S ...

  6. 010 secondary namenode(同步元数据和日志)

    1.格式化 首先格式化之后只剩下一个根目录. 格式化后会出现元数据 集群启动之后,元数据放在内存中的(消耗内存中) 格式化后会产生镜像文件fsimage,元数据存储 启动的时候namenode会读取镜 ...

  7. Secondary NameNode究竟是做什么的

    Secondary NameNode:它究竟有什么作用? 在hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备 ...

  8. Hadoop- NameNode和Secondary NameNode元数据管理机制

    元数据的存储机制 A.内存中有一份完整的元数据(内存meta data) B.磁盘有一个“准完整”的元数据镜像(fsimage)文件(在namenode的工作目录中) C.用于衔接内存metadata ...

  9. NameNode && Secondary NameNode工作机制

    NameNode && Secondary NameNode工作机制 1)工作流程 2)  fsimage和edits NameNode是HDFS的大脑,它维护着整个文件系统的目录树, ...

随机推荐

  1. 基于Jquery 简单实用的弹出提示框

    基于Jquery 简单实用的弹出提示框 引言: 原生的 alert 样子看起来很粗暴,网上也有一大堆相关的插件,但是基本上都是大而全,仅仅几句话可以实现的东西,可能要引入好几十k的文件,所以话了点时间 ...

  2. .NET代码自动编译发布

    .NET代码自动编译发布   因本人一直使用.NET开发,在做项目的时候,每次都要涉及到各个环境的部署问题,手工操作容易出错,并且重复劳动多,所以一直在寻找一个能实现自动化部署的方案. 废话不多讲,先 ...

  3. Android入门之环境搭建

    欢迎访问我的新博客:http://www.milkcu.com/blog/ 原文地址:http://www.milkcu.com/blog/archives/1376935560.html 原创:An ...

  4. SVN 服务端 和 客户端

    网址大全  |  EF CodeFirst  |  电视  |  MyNPOI  |  开源  |  我的皮肤  |  ASP.NET MVC4  |  前端提升  |  LINQ  |  WCF   ...

  5. 欧几里德算法及其扩展(推导&&模板)

    有关欧几里德算法整理: 1.一些相关概念: <1>.整除性与约数: ①一个整数可以被另外一个整数整除即为d|a(表示d整除a,通俗的说是a可以被d整除),其含义也可以说成,存在某个整数k, ...

  6. 在Netbeans上配置Android开发环境

    在NetBeans下开发Android的所需要的基本条件:NetBeans(包含JDK)+Android SDK+NBAndroid(为Netbeans设计的Android 开发插件) 详情:http ...

  7. ToList<>()所带来的性能影响

    ToList<>()所带来的性能影响  前几天优化师弟写的代码,有一个地方给我留下很深刻的印象,就是我发现他总是将PLINQ的结果ToList<>(),然后再返回给主程序,对于 ...

  8. (*p)++ 与 *p++ 与 ++*p 拨开一团迷雾

    (*p)++ 与 *p++ 与 ++*p 拨开一团迷雾 环境:win7 IDE:DEV-C++ 编译器:GCC 1.先说++i和i++的基础 代码如下: ? 1 2 3 4 5 6 7 8 9 10 ...

  9. 验证视图状态 MAC 失败,解决方法

    错误信息 今天调试一个带cookie表单提交的页面时,浏览器中报错提示:验证视图状态 MAC 失败.如果此应用程序由网络场或群集承载,请确保 <machineKey> 配置指定了相同的 v ...

  10. C#多线程,线程锁

    ];             ; i < ; i++) {                 threads[i]= ; i < ; i++) {                     R ...