HDFS Architecture

The core of Hadoop is storage (HDFS) plus a resource manager (YARN), on top of which the computing engines are built.

Master/Slave: like many distributed systems, HDFS follows the master/slave (M/S) pattern.

Name Node

The NameNode (NN) is the master and the single active node. It holds and manages the namespace of files and directories (the metadata), keeps track of block locations, and allocates DataNodes for new blocks.

The metadata resembles that of a Linux file system: file name, size, owner, group, and permissions, plus HDFS-specific elements such as block IDs, replication factor, and block size. It is kept both in memory, for fast access, and on disk, for recovery after a restart. The NN also holds block locations, but it does not persist them; they are rebuilt from what the DNs report. When a client accesses HDFS, it talks to the NN first, and the NN checks permissions and whether the file exists, just as a non-distributed Linux system would.
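To make the split between persisted metadata and in-memory block locations concrete, here is a minimal Python sketch. It is a toy model, not the NameNode implementation: the class names, field names, and the simplified permission check (owner and "other" read bits only, group ignored) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileMeta:
    """Per-file metadata; field names are hypothetical."""
    path: str
    size: int
    owner: str
    group: str
    permission: int                      # octal mode, e.g. 0o640
    replication: int = 3
    block_size: int = 128 * 1024 * 1024
    block_ids: list = field(default_factory=list)


class MetadataNameNode:
    """Toy NameNode: the namespace survives a restart, block locations do not."""

    def __init__(self):
        self.namespace = {}        # path -> FileMeta (kept on disk in real HDFS)
        self.block_locations = {}  # block id -> set of DN names (memory only)

    def add_file(self, meta):
        self.namespace[meta.path] = meta

    def can_read(self, path, user):
        # Simplified check: the file must exist and the matching read bit must be set.
        meta = self.namespace.get(path)
        if meta is None:
            return False
        if user == meta.owner:
            return bool(meta.permission & 0o400)
        return bool(meta.permission & 0o004)

    def restart(self):
        # Only the namespace is restored; locations wait for DN block reports.
        self.block_locations = {}
```

After `restart()`, the file is still listed in the namespace, but its block locations stay empty until DataNodes report again.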

Data Node

The DataNode (DN) is the node that actually stores files, in the form of blocks.

Once a DN starts up, it sends its block report to the NN, along the lines of 'I (DN-1) have blocks #1, #2, ...', and it resends the report periodically. From these reports the NN builds the mapping of which block lives on which DNs, which it uses to serve file access requests. Each DN also sends a heartbeat (HB) every 3 seconds to report its status along with its storage figures (total, free, and used space, data transfers currently in progress, ...), which the NN uses for block allocation and load balancing. A DN is considered dead if the NN does not receive its HB for about 10 minutes (10 minutes 30 seconds with the default settings).

What about replication? It is there for fault tolerance. If a block is lost or a DN goes down, other nodes still hold the same blocks. In addition, the system automatically re-replicates blocks whenever the number of copies falls below the replication factor.

A rack is simply an enclosure holding several machines, with a dedicated power supply and network switch.
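The block-report and heartbeat bookkeeping can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class and method names are invented, time is passed in as plain seconds, and the dead-node threshold is rounded to ten minutes rather than read from any real configuration.

```python
DEAD_AFTER_S = 10 * 60  # assumed round number for the "about ten minutes" window


class TrackingNameNode:
    """Toy NameNode that tracks block reports and heartbeats."""

    def __init__(self):
        self.block_map = {}       # block id -> set of DN names
        self.last_heartbeat = {}  # DN name -> time of last heartbeat, in seconds

    def block_report(self, dn, block_ids):
        # A new report replaces everything previously known about this DN.
        for holders in self.block_map.values():
            holders.discard(dn)
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(dn)

    def heartbeat(self, dn, now):
        self.last_heartbeat[dn] = now

    def dead_nodes(self, now):
        # A DN is dead when its last heartbeat is older than the threshold.
        return sorted(dn for dn, seen in self.last_heartbeat.items()
                      if now - seen > DEAD_AFTER_S)

    def locations(self, block_id):
        return sorted(self.block_map.get(block_id, set()))
```

For example, after `block_report("DN-1", [1, 2])` and `block_report("DN-2", [2, 3])`, `locations(2)` lists both DataNodes; a node whose last heartbeat is more than `DEAD_AFTER_S` seconds old shows up in `dead_nodes(now)`.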

HDFS Write

The write path is easy to understand once you know the HDFS architecture. The client uses the HDFS client library to manipulate files, and communication with the NN goes through RPC calls. As mentioned before, the NN checks the metadata to see whether the client is allowed to write the file. The client then asks the NN to allocate a block, and the NN returns a list of DNs (the NN knows the status of each DN) to which the client writes the first 128 MB block; the data is buffered in space managed by the client library. This resembles a leader-follower pattern, with the client as leader and the DNs as followers. The write to the DNs is a pipeline: the client sends the data in small packets (64 KB by default) to the first DN, each DN forwards the packet to the next one, and ACKs travel back along the pipeline. In that sense the write is synchronous, because a packet counts as written only after every DN in the pipeline has acknowledged it. The DNs also inform the NN once they have received a block, so the NN can build up the block locations during the write as well. The client then continues with the next 128 MB block, looping until it reaches the end of the file. Finally the client calls close() to indicate that the operation is complete.
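The block-by-block loop and the pipelined packet write can be illustrated with a small Python sketch. It is a simplified model, not the HDFS client: the names are hypothetical, and the toy waits for each packet's acknowledgement before sending the next one, whereas a real client may keep several packets in flight.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the block size used in the text


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block for a file of file_size bytes."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


class ToyDataNode:
    def __init__(self, name):
        self.name = name
        self.packets = []

    def receive(self, packet, downstream):
        # Store the packet, forward it down the pipeline, and acknowledge
        # only after every downstream node has acknowledged as well.
        self.packets.append(packet)
        if downstream:
            return downstream[0].receive(packet, downstream[1:])
        return True


def write_block(data, pipeline, packet_size):
    """Send data packet by packet; return the number of acknowledged packets."""
    acked = 0
    for start in range(0, len(data), packet_size):
        packet = data[start:start + packet_size]
        if pipeline[0].receive(packet, pipeline[1:]):
            acked += 1
    return acked
```

A 300 MB file splits into two full 128 MB blocks and one 44 MB block, and every node in the pipeline ends up with the same sequence of packets.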

HDFS Read

As mentioned before, you need the HDFS Java client library to perform a read: open the file, then read the stream. The client calls the NN through RPC to get the block IDs and block locations. The NN metadata holds the block IDs for a file, and the block location map holds the block-to-DN mapping. Both are in memory, so the lookup should be fast. The actual read happens between the client and the DNs. If the client itself runs on a DN, as a map task does, the NN returns the block locations sorted by network distance, which reduces traffic between racks. If the DN on which the map task runs holds the block it needs, the task reads it locally. If a read fails, the client switches to the next node in the location list. Because the data transfer takes place between clients (there may be many concurrent reads) and DNs, while the NN only provides block locations, the load is spread across the cluster. That is why HDFS scales. HDFS is not a good fit for a very large number of small files, though, because the NN may respond poorly when it has to manage that much metadata.
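Sorting replicas by network distance and falling back to the next one on failure can be sketched as follows. This is an illustrative Python model: the topology paths such as `/rack1/dn1` and the `fetch` callback are hypothetical, and distance is counted as the number of hops from both nodes up to their closest common ancestor (0 for the same node, 2 for the same rack, 4 for different racks).

```python
def distance(a, b):
    """Hops from both nodes to their closest common ancestor in the topology tree."""
    pa, pb = a.strip("/").split("/"), b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def sort_replicas(client, replicas):
    """Order replica locations from nearest to farthest as seen from the client."""
    return sorted(replicas, key=lambda node: distance(client, node))


def read_block(client, replicas, fetch):
    """Try replicas nearest first; fetch(node) returns bytes or raises OSError."""
    for node in sort_replicas(client, replicas):
        try:
            return node, fetch(node)
        except OSError:
            continue  # this replica failed, move on to the next location
    raise OSError("no replica could be read")
```

A client on `/rack1/dn1` therefore tries its local replica first, then one in the same rack, and only then a replica in another rack.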

When a DN is down

As we know, HDFS is reliable. If a DN goes down (the NN does not receive its heartbeat for about 10 minutes), other DNs still store copies of the blocks that were on the failed node, depending on the replication level, so the system remains available. In this situation, however, the cluster is under-replicated. The NN then schedules re-replication: it instructs DNs that still hold a replica to copy the affected blocks to other available DNs until the replication level is met again. Once the receiving DNs report the new blocks to the NN, the block locations are updated accordingly.
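How under-replicated blocks might be detected and scheduled for copying can be sketched in Python. This is a toy planner under assumptions: the function names are invented, and the choice of source and target (alphabetically first) stands in for whatever placement policy a real cluster applies.

```python
def under_replicated(block_map, live_nodes, target=3):
    """Return {block_id: missing_copies} for blocks with too few live replicas."""
    missing = {}
    for block_id, holders in block_map.items():
        live = len(holders & live_nodes)
        if live < target:
            missing[block_id] = target - live
    return missing


def plan_copies(block_map, live_nodes, target=3):
    """Return (block_id, source_dn, target_dn) copy instructions."""
    plan = []
    for block_id, need in sorted(under_replicated(block_map, live_nodes, target).items()):
        holders = block_map[block_id] & live_nodes
        if not holders:
            continue  # no surviving replica, so nothing can be copied
        source = min(holders)
        candidates = sorted(live_nodes - holders)
        for dest in candidates[:need]:
            plan.append((block_id, source, dest))
    return plan
```

If DN-1 dies while holding block 1, the plan contains one copy of block 1 from a surviving holder to a live node that does not have it yet.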

Trade-off between reliability and performance/network bandwidth

If you want more reliability, for example by raising the replication level to a large value (5?), writes become much more expensive: the data must be written to disk on more DNs and also transferred across those DNs, so write performance drops. Lowering the replication level has the opposite effect.
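A rough back-of-the-envelope calculation shows how the cost grows with the replication level. The Python sketch below assumes a simple pipeline model in which the client sends one copy to the first DN and each further replica is forwarded once between DNs; the function name is hypothetical, and compression, checksums, and protocol overhead are ignored.

```python
def replication_cost(file_size_mb, replication):
    """Return (raw disk used, DN-to-DN traffic) in MB for writing one file."""
    disk_mb = file_size_mb * replication
    inter_node_mb = file_size_mb * (replication - 1)
    return disk_mb, inter_node_mb
```

Under this model a 1024 MB file costs 3072 MB of disk and 2048 MB of DN-to-DN traffic at replication 3, and 5120 MB of disk and 4096 MB of traffic at replication 5.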
