Secondary Namenode - What it really do?
原文链接:http://blog.madhukaraphatak.com/secondary-namenode---what-it-really-do/
Secondary Namenode is one of the poorly named component in Hadoop. By its name, it gives a sense that its a backup for the Namenode.But in reality its not. Lot of beginners in Hadoop get confused about what exactly SecondaryNamenode does and why its present in HDFS.So in this blog post I try to explain the role of secondary namenode in HDFS.
By its name, you may assume that it has something to do with Namenode and you are right. So before we dig into Secondary Namenode lets see what exactly Namenode does.
Namenode
Namenode holds the meta data for the HDFS like Namespace information, block information etc. When in use, all this information is stored in main memory. But these information also stored in disk for persistence storage.
The above image shows how Name Node stores information in disk.
Two different files are
- fsimage - Its the snapshot of the filesystem when namenode started
- Edit logs - Its the sequence of changes made to the filesystem after namenode started
Only in the restart of namenode , edit logs are applied to fsimage to get the latest snapshot of the file system. But namenode restart are rare in production clusters which means edit logs can grow very large for the clusters where namenode runs for a long period of time. The following issues we will encounter in this situation.
- Editlog become very large , which will be challenging to manage it
- Namenode restart takes long time because lot of changes has to be merged
- In the case of crash, we will lost huge amount of metadata since fsimage is very old
So to overcome this issues we need a mechanism which will help us reduce the edit log size which is manageable and have up to date fsimage ,so that load on namenode reduces . It’s very similar to Windows Restore point, which will allow us to take snapshot of the OS so that if something goes wrong , we can fallback to the last restore point.
So now we understood NameNode functionality and challenges to keep the meta data up to date.So what is this all have to with Seconadary Namenode?
Secondary Namenode
Secondary Namenode helps to overcome the above issues by taking over responsibility of merging editlogs with fsimage from the namenode.
The above figure shows the working of Secondary Namenode
- It gets the edit logs from the namenode in regular intervals and applies to fsimage
- Once it has new fsimage, it copies back to namenode
- Namenode will use this fsimage for the next restart,which will reduce the startup time
Secondary Namenode whole purpose is to have a checkpoint in HDFS. Its just a helper node for namenode.That’s why it also known as checkpoint node inside the community.
So we now understood all Secondary Namenode does puts a checkpoint in filesystem which will help Namenode to function better. Its not the replacement or backup for the Namenode. So from now on make a habit of calling it as a checkpoint node.
Secondary Namenode - What it really do?的更多相关文章
- Secondary NameNode:的作用?
前言 最近刚接触Hadoop, 一直没有弄明白NameNode和Secondary NameNode的区别和关系.很多人都认为,Secondary NameNode是NameNode的备份,是为了防止 ...
- Hadoop之Secondary NameNode
NameNode存储文件系统的变化作为log追加在本地的一个文件里:这个文件是edits.当一个NameNode启动时,它从一个映像文件:FsImage,读取HDFS的状态,使用来自edits日志文件 ...
- 解读Secondary NameNode的功能
1.概述 最近有朋友问我Secondary NameNode的作用,是不是NameNode的备份?是不是为了防止NameNode的单点问题?确实,刚接触Hadoop,从字面上看,很容易会把Second ...
- Secondary NameNode 的作用
https://blog.csdn.net/xh16319/article/details/31375197 很多人都认为,Secondary NameNode是NameNode的备份,是为了防止Na ...
- (转)Secondary NameNode的作用
在Hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备份.但它实际上却不是.很多Hadoop的初学者都很疑惑,S ...
- 010 secondary namenode(同步元数据和日志)
1.格式化 首先格式化之后只剩下一个根目录. 格式化后会出现元数据 集群启动之后,元数据放在内存中的(消耗内存中) 格式化后会产生镜像文件fsimage,元数据存储 启动的时候namenode会读取镜 ...
- Secondary NameNode究竟是做什么的
Secondary NameNode:它究竟有什么作用? 在hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备 ...
- Hadoop- NameNode和Secondary NameNode元数据管理机制
元数据的存储机制 A.内存中有一份完整的元数据(内存meta data) B.磁盘有一个“准完整”的元数据镜像(fsimage)文件(在namenode的工作目录中) C.用于衔接内存metadata ...
- NameNode && Secondary NameNode工作机制
NameNode && Secondary NameNode工作机制 1)工作流程 2) fsimage和edits NameNode是HDFS的大脑,它维护着整个文件系统的目录树, ...
随机推荐
- 字符串的使用(string,StringBuffer,StringBuilder)
String中==与equals的区别:==比较字符串中的引用相等equals比较字符串中的内容相等(因为字符串有重写equals方法) string常用的方法 返回类型 方法 操作功能 Char c ...
- (莱昂氏unix源代码分析导读-49) 字符缓冲区
by cszhao1980 同块设备一样,对字符设备的输入输出也是通过缓冲区来进行的.使用缓冲区有个额外 的好处,即以缓冲区为界,函数可分为高低两个层次.低层函数负责与实际设备交互, 而高层函数只与缓 ...
- C#动态表达式计算(续2)
上两篇废话太多,这一次我就不多说了,由于代码比较简单,可以直接从https://github.com/scottshare/DynamicExpress.git地址下载. 以下说明一下使用方法: Dy ...
- 一个逗比web前端的理想
加入博客园也5年多了.平时博客也写的少,但是每天都会上来“偷窥”,你们的一举一动我都知道.呵呵~ 最近看了些小伙伴们写的一些文章自己也难免有些感触.最近也精神有些萎靡,也想写点什么记录下个人的成长经历 ...
- 【python】初识python的问题
这两天利用晚上时间简单的了解了一下python语言,在Mac上和Windows上都安装了python,对比两个平台,还是发现在mac上体验比较好一点.安装的版本好像也不一样,语法还有点小区别.简单的对 ...
- 高级SQL特性
SQL SQL 必知必会·笔记<20>高级SQL特性 摘要: 约束(constraint)就是管理如何插入或处理数据库数据的规则.DBMS通过在数据库表上施加约束来实施引用完整性.1. ...
- StringEscapeUtils.unescapeHtml的使用
在做代码高亮时,从数据库中取出代码如下(节选): <pre class="brush: java;"> 需要的应该是<pre class=\"brush ...
- C# 枚举常用工具方法
/// <summary> /// 获取枚举成员描述信息及名称 /// 返回:IDictionary /// Value:描述信息 /// Key:值 /// </summary&g ...
- 关于前端JS模块加载器实现的一些细节
最近工作需要,实现一个特定环境的模块加载方案,实现过程中有一些技术细节不解,便参考 了一些项目的api设计约定与实现,记录下来备忘. 本文不探讨为什么实现模块化,以及模块化相关的规范,直接考虑一些技术 ...
- Linux 学习 step by step (2)
Linux 学习 step by step (2) Linux,想要我说爱你真的不容易了,尽管,你是ubutun,尽管,你有蛮界面.但是,操作你,还是没有操作windows那么的如鱼得水了.为了更 ...