Secondary Namenode - What it really do?
Original link: http://blog.madhukaraphatak.com/secondary-namenode---what-it-really-do/
Secondary Namenode is one of the poorly named components in Hadoop. Its name suggests that it is a backup for the Namenode, but in reality it is not. A lot of Hadoop beginners get confused about what exactly the Secondary Namenode does and why it is present in HDFS. So in this blog post I try to explain the role of the Secondary Namenode in HDFS.
By its name, you may assume that it has something to do with the Namenode, and you are right. So before we dig into the Secondary Namenode, let's see what exactly the Namenode does.
Namenode
The Namenode holds the metadata for HDFS, such as namespace information, block information etc. When in use, all this information is stored in main memory. But this information is also stored on disk for persistence.

The above image shows how the Namenode stores information on disk.
The two files are:
- fsimage - It's the snapshot of the filesystem taken when the Namenode started
- Edit logs - It's the sequence of changes made to the filesystem after the Namenode started
Only on a restart of the Namenode are the edit logs applied to the fsimage to get the latest snapshot of the filesystem. But Namenode restarts are rare in production clusters, which means the edit logs can grow very large on clusters where the Namenode runs for a long period of time. In this situation we will encounter the following issues:
- The edit log becomes very large, which makes it challenging to manage
- A Namenode restart takes a long time because a lot of changes have to be merged
- In the case of a crash, we risk losing a huge amount of metadata, since the fsimage is very old
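The relationship between the two files can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the namespace is a plain set of paths, the edit log is a list of tuples, and the names `apply_edit` and `restart` are hypothetical, not HDFS APIs. Real HDFS stores both files in its own on-disk formats.

```python
# Toy model only: real HDFS does not represent fsimage or edit logs like this.

def apply_edit(namespace, edit):
    """Apply one logged change to an in-memory namespace (a set of paths)."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def restart(fsimage, edit_log):
    """Rebuild the latest namespace by replaying every edit over the snapshot."""
    namespace = set(fsimage)      # start from a copy of the on-disk snapshot
    for edit in edit_log:         # work grows with the length of the edit log
        apply_edit(namespace, edit)
    return namespace


fsimage = {"/data/a.txt"}                  # snapshot taken at startup
edit_log = [("create", "/data/b.txt"),     # changes recorded since then
            ("delete", "/data/a.txt"),
            ("create", "/logs/app.log")]

latest = restart(fsimage, edit_log)
print(sorted(latest))  # prints ['/data/b.txt', '/logs/app.log']
```

The loop in `restart` is the point of the sketch: the longer the edit log, the more work a restart has to do before the latest state is available.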
So to overcome these issues we need a mechanism that keeps the edit log at a manageable size and the fsimage up to date, so that the load on the Namenode is reduced. It's very similar to a Windows Restore Point, which allows us to take a snapshot of the OS so that if something goes wrong, we can fall back to the last restore point.
So now we understand the Namenode's functionality and the challenges of keeping the metadata up to date. So what does all this have to do with the Secondary Namenode?
Secondary Namenode
The Secondary Namenode helps to overcome the above issues by taking over from the Namenode the responsibility of merging the edit logs with the fsimage.

The above figure shows the working of the Secondary Namenode:
- It gets the edit logs from the Namenode at regular intervals and applies them to the fsimage
- Once it has a new fsimage, it copies it back to the Namenode
- The Namenode will use this fsimage for the next restart, which will reduce the startup time
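The three steps above can be sketched as a toy checkpoint cycle in Python. This is a minimal illustration, not how HDFS is implemented: `ToyNamenode`, `merge`, and `checkpoint` are hypothetical names, the namespace is a set of paths, and the fetch and copy-back steps are plain assignments rather than network transfers.

```python
# Toy model only: names and data structures are hypothetical, not HDFS APIs.

class ToyNamenode:
    def __init__(self):
        self.fsimage = set()   # last checkpointed snapshot
        self.edit_log = []     # changes recorded since that snapshot

    def record(self, op, path):
        """Log a change instead of rewriting the snapshot."""
        self.edit_log.append((op, path))


def merge(fsimage, edits):
    """Return a new snapshot with the edits applied in order."""
    merged = set(fsimage)
    for op, path in edits:
        if op == "create":
            merged.add(path)
        elif op == "delete":
            merged.discard(path)
    return merged


def checkpoint(namenode):
    """One cycle of the helper: fetch the edits, merge, copy the image back."""
    edits = list(namenode.edit_log)                      # 1. get the edit logs
    new_image = merge(namenode.fsimage, edits)           # 2. apply them to the fsimage
    namenode.fsimage = new_image                         # 3. copy the new fsimage back
    namenode.edit_log = namenode.edit_log[len(edits):]   # merged edits are no longer needed
    return new_image


nn = ToyNamenode()
nn.record("create", "/a")
nn.record("create", "/b")
nn.record("delete", "/a")
checkpoint(nn)
print(sorted(nn.fsimage), len(nn.edit_log))  # prints ['/b'] 0
```

After the checkpoint the snapshot is current and the edit log is short again, so a later restart has very little to replay.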
The Secondary Namenode's whole purpose is to provide a checkpoint in HDFS. It's just a helper node for the Namenode. That's why it is also known as the checkpoint node inside the community.
So now we understand that all the Secondary Namenode does is put a checkpoint in the filesystem, which helps the Namenode function better. It's not a replacement or backup for the Namenode. So from now on, make a habit of calling it the checkpoint node.