I recently ran into a problem writing files to HDFS: writing via create succeeds, but writing via append fails, and the failure reproduces every time. Sample code:

        FileSystem fs = FileSystem.get(conf);
        OutputStream out = fs.create(file);
        IOUtils.copyBytes(in, out, 4096, true); // OK
        out = fs.append(file);
        IOUtils.copyBytes(in, out, 4096, true); // throws

Running hdfs fsck against the problematic file showed it had only one replica. Could that be the cause?
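For reference, an fsck invocation along these lines prints each block's replica count and placement (the path is a placeholder, and the sample output line is illustrative of what fsck reports for under-replicated blocks):

    # Inspect the file's blocks and replica placement (path is a placeholder)
    hdfs fsck /path/to/file -files -blocks -locations
    # Under-replicated blocks are flagged with lines like:
    #   Target Replicas is 3 but found 1 live replica(s).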

Let's walk through how FileSystem.append executes:

org.apache.hadoop.fs.FileSystem

    public abstract FSDataOutputStream append(Path f, int bufferSize, Progressable progress) throws IOException;

The implementation lives in:

org.apache.hadoop.hdfs.DistributedFileSystem

    public FSDataOutputStream append(Path f, final int bufferSize, final Progressable progress) throws IOException {
        this.statistics.incrementWriteOps(1);
        Path absF = this.fixRelativePart(f);
        return (FSDataOutputStream)(new FileSystemLinkResolver<FSDataOutputStream>() {
            public FSDataOutputStream doCall(Path p) throws IOException, UnresolvedLinkException {
                return DistributedFileSystem.this.dfs.append(DistributedFileSystem.this.getPathName(p), bufferSize, progress, DistributedFileSystem.this.statistics);
            }

            public FSDataOutputStream next(FileSystem fs, Path p) throws IOException {
                return fs.append(p, bufferSize);
            }
        }).resolve(this, absF);
    }

This delegates to DFSClient.append:

org.apache.hadoop.hdfs.DFSClient

    private DFSOutputStream append(String src, int buffersize, Progressable progress) throws IOException {
        this.checkOpen();
        DFSOutputStream result = this.callAppend(src, buffersize, progress);
        this.beginFileLease(result.getFileId(), result);
        return result;
    }

    private DFSOutputStream callAppend(String src, int buffersize, Progressable progress) throws IOException {
        LocatedBlock lastBlock = null;
        try {
            lastBlock = this.namenode.append(src, this.clientName);
        } catch (RemoteException var6) {
            throw var6.unwrapRemoteException(new Class[]{AccessControlException.class, FileNotFoundException.class, SafeModeException.class, DSQuotaExceededException.class, UnsupportedOperationException.class, UnresolvedPathException.class, SnapshotAccessControlException.class});
        }
        HdfsFileStatus newStat = this.getFileInfo(src);
        return DFSOutputStream.newStreamForAppend(this, src, buffersize, progress, lastBlock, newStat, this.dfsClientConf.createChecksum());
    }

The this.namenode.append call is an RPC that ultimately lands in the append method of NameNodeRpcServer:

org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer

    public LocatedBlock append(String src, String clientName) throws IOException {
        this.checkNNStartup();
        String clientMachine = getClientMachine();
        if (stateChangeLog.isDebugEnabled()) {
            stateChangeLog.debug("*DIR* NameNode.append: file " + src + " for " + clientName + " at " + clientMachine);
        }
        this.namesystem.checkOperation(OperationCategory.WRITE);
        LocatedBlock info = this.namesystem.appendFile(src, clientName, clientMachine);
        this.metrics.incrFilesAppended();
        return info;
    }

which calls into FSNamesystem.appendFile:

org.apache.hadoop.hdfs.server.namenode.FSNamesystem

    LocatedBlock appendFile(String src, String holder, String clientMachine) throws AccessControlException, SafeModeException, ... {
        ...
        lb = this.appendFileInt(src, holder, clientMachine, cacheEntry != null);
        ...
    }

    private LocatedBlock appendFileInt(String srcArg, String holder, String clientMachine, boolean logRetryCache) throws ... {
        ...
        lb = this.appendFileInternal(pc, src, holder, clientMachine, logRetryCache);
        ...
    }

    private LocatedBlock appendFileInternal(FSPermissionChecker pc, String src, String holder, String clientMachine, boolean logRetryCache) throws AccessControlException, UnresolvedLinkException, FileNotFoundException, IOException {
        assert this.hasWriteLock();
        INodesInPath iip = this.dir.getINodesInPath4Write(src);
        INode inode = iip.getLastINode();
        if (inode != null && inode.isDirectory()) {
            throw new FileAlreadyExistsException("Cannot append to directory " + src + "; already exists as a directory.");
        } else {
            if (this.isPermissionEnabled) {
                this.checkPathAccess(pc, src, FsAction.WRITE);
            }
            try {
                if (inode == null) {
                    throw new FileNotFoundException("failed to append to non-existent file " + src + " for client " + clientMachine);
                } else {
                    INodeFile myFile = INodeFile.valueOf(inode, src, true);
                    BlockStoragePolicy lpPolicy = this.blockManager.getStoragePolicy("LAZY_PERSIST");
                    if (lpPolicy != null && lpPolicy.getId() == myFile.getStoragePolicyID()) {
                        throw new UnsupportedOperationException("Cannot append to lazy persist file " + src);
                    } else {
                        this.recoverLeaseInternal(myFile, src, holder, clientMachine, false);
                        myFile = INodeFile.valueOf(this.dir.getINode(src), src, true);
                        BlockInfo lastBlock = myFile.getLastBlock();
                        if (lastBlock != null && lastBlock.isComplete() && !this.getBlockManager().isSufficientlyReplicated(lastBlock)) {
                            throw new IOException("append: lastBlock=" + lastBlock + " of src=" + src + " is not sufficiently replicated yet.");
                        } else {
                            return this.prepareFileForWrite(src, iip, holder, clientMachine, true, logRetryCache);
                        }
                    }
                }
            } catch (IOException var11) {
                NameNode.stateChangeLog.warn("DIR* NameSystem.append: " + var11.getMessage());
                throw var11;
            }
        }
    }

The replication check it calls lives in the block manager:

org.apache.hadoop.hdfs.server.blockmanagement.BlockManager

    public boolean isSufficientlyReplicated(BlockInfo b) {
        int replication = Math.min(this.minReplication, this.getDatanodeManager().getNumLiveDataNodes());
        return this.countNodes(b).liveReplicas() >= replication;
    }
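The check above boils down to a small predicate: the required replica count is the minimum replication setting capped by the number of live DataNodes, and the block passes only if its live replicas reach that threshold. Here is a minimal standalone sketch of that logic (class and parameter names are mine, not the HDFS API):

    // Standalone sketch of the logic inside BlockManager.isSufficientlyReplicated.
    // Names are illustrative; this is not the Hadoop API.
    public class ReplicationCheck {
        // minReplication corresponds to dfs.namenode.replication.min (default 1)
        public static boolean isSufficientlyReplicated(int liveReplicas, int minReplication, int liveDataNodes) {
            // The requirement is capped by the live DataNode count, so a tiny
            // cluster is never asked for more replicas than it can hold.
            int required = Math.min(minReplication, liveDataNodes);
            return liveReplicas >= required;
        }

        public static void main(String[] args) {
            // 1 live replica with min replication 2 and 3 live DataNodes: append is rejected
            System.out.println(isSufficientlyReplicated(1, 2, 3)); // false
            // 2 live replicas reach the threshold: append may proceed
            System.out.println(isSufficientlyReplicated(2, 2, 3)); // true
        }
    }

Note that with the default dfs.namenode.replication.min of 1, the check only fails when the last block has zero live replicas (or the setting has been raised).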

So on append, the NameNode first fetches the file's last block and checks whether it satisfies the replication requirement: if it does not, an exception is thrown; if it does, the file is prepared for writing.
That confirms the cause: the file having only one replica makes append fail. As for why a freshly created file ended up with only one replica, it turned out to be a rack-awareness misconfiguration; see https://www.cnblogs.com/barneywill/p/10114504.html for details.
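If the rack configuration cannot be fixed right away, one possible stop-gap (not the fix applied here, and the path and factor are placeholders) is to raise the file's replica count and wait for re-replication to finish before appending:

    # Re-replicate the under-replicated file; -w blocks until the target is reached
    hdfs dfs -setrep -w 3 /path/to/file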
