Fix Corrupt Blocks on HDFS
Source: http://centoshowtos.org/hadoop/fix-corrupt-blocks-on-hdfs/
How do I know if my Hadoop HDFS filesystem has corrupt blocks, and how do I fix them?
The easiest way to determine this is to run fsck on the filesystem. If you have set up your Hadoop environment variables, you should be able to use a path of /; if not, use a full URI with the NameNode RPC port, typically hdfs://ip.or.hostname:8020/ (not the 50070 web UI port).
hdfs fsck /
or:
hdfs fsck hdfs://ip.or.hostname:8020/
If the end of your output looks something like this, you have corrupt blocks on your fs.
.............................Status: CORRUPT
Total size: 3453345169348 B (Total open files size: 664 B)
Total dirs: 15233
Total files: 14029 (Files currently being written: 8)
Total symlinks: 0
Total blocks (validated): 40961 (avg. block size 84308126 B) (Total open file blocks (not validated): 8)
********************************
CORRUPT FILES: 2
MISSING BLOCKS: 2
MISSING SIZE: 15731297 B
CORRUPT BLOCKS: 2
********************************
Corrupt blocks: 2
Number of data-nodes: 12
Number of racks: 2
FSCK ended at Fri Mar 27 XX:03:21 UTC 201X in XXX milliseconds
The filesystem under path '/' is CORRUPT
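As a side note, if you only want the counts and not a full filesystem walk, the dfsadmin report is a quicker spot check; in recent Hadoop versions its cluster summary includes lines such as "Blocks with corrupt replicas" and "Missing blocks".
hdfs dfsadmin -report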
How do I know which files have blocks that are corrupt?
The output of the fsck above will be very verbose, but it will mention which blocks are corrupt. We can do some grepping of the fsck output so that we aren't "drinking from a firehose".
hdfs fsck / | egrep -v '^\.+$' | grep -v replica | grep -v Replica
or
hdfs fsck hdfs://ip.or.host:8020/ | egrep -v '^\.+$' | grep -v replica | grep -v Replica
This will list the affected files without the wall of dots, and it also filters out lines about under-replicated blocks (which aren't necessarily a problem). The output should include something like this for each affected file.
/path/to/filename.fileextension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
/path/to/filename.fileextension: MISSING 1 blocks of total size 15620361 B
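On a recent Hadoop version, you may be able to skip the grepping entirely: fsck has a -list-corruptfileblocks flag that prints only the corrupt blocks and the files they belong to. This is a minimal alternative; the exact output format varies between versions.
hdfs fsck / -list-corruptfileblocks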
The next step is to determine the importance of the file: can it simply be removed and copied back into place, or does it hold sensitive data that needs to be regenerated?
If it's easy enough just to replace the file, that's the route I would take.
Remove the corrupted file from your Hadoop cluster
Either of these equivalent commands will move the corrupted file to the trash.
hdfs dfs -rm /path/to/filename.fileextension
hdfs dfs -rm hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension
Or you can skip the trash to delete the file permanently (which is probably what you want to do).
hdfs dfs -rm -skipTrash /path/to/filename.fileextension
hdfs dfs -rm -skipTrash hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension
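As an alternative to removing files one at a time, fsck itself can act on everything it finds. These are standard fsck flags, but they affect every corrupt file under the given path at once, so use them with care.
hdfs fsck / -move
hdfs fsck / -delete
The -move flag moves the corrupt files into /lost+found on HDFS, and -delete permanently deletes them.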
How would I repair a corrupted file if it was not easy to replace?
This might or might not be possible, but the first step is to gather information about the file's blocks and their locations.
hdfs fsck /path/to/filename.fileextension -locations -blocks -files
hdfs fsck hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension -locations -blocks -files
From this data, you can track down the nodes where the corruption occurred. On those nodes, look through the logs and determine what the issue is: a replaced disk, I/O errors on the server, and so on. If you can recover the data on that machine and bring the partition holding the blocks back online, the DataNode will report the blocks back to the NameNode and the file will be healthy again. If that isn't possible, you will unfortunately have to find another way to regenerate the file.
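One related case worth knowing about: the "Files currently being written" entries in the fsck summary often come from a client that died mid-write and left the file open with an unrecovered lease. On Hadoop 2.7 and later, the hdfs debug subcommand can force lease recovery; this helps with stuck open files, not with blocks whose data is truly gone.
hdfs debug recoverLease -path /path/to/filename.fileextension -retries 5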