1. xml-apis 冲突问题

javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created

java.lang.RuntimeException: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created

flink如果要操作hdfs,需要一个hadoop-hdfs的包.

如果项目里面有用dom4j相关包,就会与之有冲突,出现如下错误.

06/05/2018 11:40:40     Job execution switched to status FAILING.
javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:311)
at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2516)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1031)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2189)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:99)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.<init>(FsCheckpointStreamFactory.java:99)
at org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
at org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:246)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:308)
... 20 more

解决方案如下:

原因就是xml-apis包与dom4j冲突了

把所有hadoop相关包的xml-apis依赖全部排除了

(顺便吐槽一下网上,有的说加这个包,有的说加哪个包,最后应该是全部排除xml-apis的依赖才对)

这里只是个样例,可以使用mvn tree看看依赖,全部排除干净才行

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
version>2.7.1</version>
<exclusions>
<exclusion>
<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
</exclusion>
</exclusions>
</dependency>

  1. checkpoint 路径问题

Could not flush and close the file system output stream *** in order to obtain the stream state handle

java.lang.Exception: Could not materialize checkpoint 1 for operator

这类问题有2个地方需要修改

1,checkpoint地址

2,state.backend: filesystem

如果flink集群在配置中,配置了默认checkpoint地址.

那么flink程序在设置checkpoint的时候,要与集群配置的地址一致.

错误现象如下:

java.lang.Exception: Could not perform checkpoint 1 for operator abcSink1 -> Sink: abcSink2 (4/5).
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:569)
at org.apache.flink.streaming.runtime.io.BarrierBuffer.notifyCheckpoint(BarrierBuffer.java:380)
at org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:283)
at org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:185)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:214)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not complete snapshot 1 for operator abcSink1 -> Sink: abcSink2 (4/5).
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:378)
at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1089)
at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1038)
at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:671)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:607)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:560)
... 8 more
Caused by: java.io.IOException: Could not flush and close the file system output stream to hdfs://localhost:8020/data/flink/checkpoint/2c418a966ed84cb035bb0574cb7c8d13/chk-1/989695d0-f3d8-40a0-addf-e4d2cc367794 in order to obtain the stream state handle
at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.closeAndGetHandle(FsCheckpointStreamFactory.java:336)
at org.apache.flink.runtime.state.KeyedStateCheckpointOutputStream.closeAndGetHandle(KeyedStateCheckpointOutputStream.java:105)
at org.apache.flink.runtime.state.KeyedStateCheckpointOutputStream.closeAndGetHandle(KeyedStateCheckpointOutputStream.java:30)
at org.apache.flink.runtime.state.StateSnapshotContextSynchronousImpl.closeAndUnregisterStreamToObtainStateHandle(StateSnapshotContextSynchronousImpl.java:126)
at org.apache.flink.runtime.state.StateSnapshotContextSynchronousImpl.getKeyedStateStreamFuture(StateSnapshotContextSynchronousImpl.java:113)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:358)
... 13 more
Caused by: java.io.IOException: DataStreamer Exception:
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:562)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.protocol.HdfsConstants
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)

解决方案如下:

flink checkpoint 集群配置地址与 job的地址不一致,导致的问题

配置文件中的地址 state.backend.fs.checkpointdir: 地址

代码中的地址 env.setStateBackend(new FsStateBackend(地址))

这两个地方地址要一致

并且还要修改配置文件中的state.backend: 参数

默认是内存,如果使用hdfs得写成filesystem


  1. oom问题

java.lang.OutOfMemoryError: GC overhead limit exceeded

java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at com.esotericsoftware.kryo.io.Input.readString(Input.java:466)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:177)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:166)
at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:730)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:113)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:657)
at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:189)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:547)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
at org.apache.flink.streaming.api.operators.StreamFilter.processElement(StreamFilter.java:40)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
at org.apache.flink.streaming.api.operators.StreamFilter.processElement(StreamFilter.java:40)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:611)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:572)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)

解决方案如下:

1.检查slot槽位够不够

2.程序起的并行是否都正常分配了(会有这样的情况出现,假如5个并行,但是只有2个在几点上生效了,另外3个没有数据流动)

3.集群资源够不够

flink 常见问题整理的更多相关文章

  1. Maven使用常见问题整理

    Maven使用常见问题整理  1.更新eclipse的classpath加入新依赖  1.在dependencyManagement里面加入包括版本在内的依赖信息,如:   <dependenc ...

  2. LoadRunner常见问题整理(转)

    首先要感谢群友的无私分享,才能得到这篇好的学习资料,整理得太好了,所以收藏保存,方便以后学习. 一:LoadRunner常见问题整理 1.LR 脚本为空的解决方法: 1.去掉ie设置中的第三方支持取消 ...

  3. [转]LoadRunner脚本录制常见问题整理

    LoadRunner脚本录制常见问题整理 1.LoadRunner录制脚本时为什么不弹出IE浏览器? 当一台主机上安装多个浏览器时,LoadRunner录制脚本经常遇到不能打开浏览器的情况,可以用下面 ...

  4. [转帖]kubernetes 常见问题整理

    kubernetes 常见问题整理 https://www.cnblogs.com/qingfeng2010/p/10642408.html 使用kubectl 命令报错 报错: [root@k8s- ...

  5. LR常见问题整理

    1.LoadRunner录制脚本时为什么不弹出IE浏览器? 当一台主机上安装多个浏览器时,LoadRunner录制脚本经常遇到不能打开浏览器的情况,可以用下面的方法来解决. LR11 无法弹出ie浏览 ...

  6. Git 常见问题整理

    在学习git的过程中,遇到如下问题,特整理如下: 1 error:src refspec master does not match any 问题产生 a git服务器使用如下命令新建一个项目 $ c ...

  7. web标准常见问题整理

    1.超链接访问过后hover样式就不出现的问题 2.FF下如何使连续长字段自动换行 3.ff下为什么父容器的高度不能自适应 4. IE6的双倍边距BUG 5. IE6下绝对定位的容器内文本无法正常选择 ...

  8. Microsoft Mole原理及常见问题整理

     Moles与Moq(Rhino.Mocks)比较 作用范围 Moq与Rhino.Mocks这类的Mock是对Interface或AbstractClass做Mock, 而Moles是Mock整个 ...

  9. 手机移动端web前端常见问题整理

    移动端常见问题及解决方案 一.meta基础知识 H5页面窗口自动调整到设备宽度,并禁止用户缩放页面 <meta name="viewport" content="w ...

随机推荐

  1. 网络 OSI参考模型与TCP/IP模型

    ISO是国际标准化组织.OSI,开放互联系统.IOS,思科交换机和路由器的操作系统. TCP/IP模型是OSI模型的简化.所有的互联网协议都是基于OSI模型开发的. 分层:便于管理,每层只管理下层,总 ...

  2. 为什么不要使用 select * from xxx (oracle 亲测)

    打开已用时间set timing on;create table users(id number(20), name varchar2(20), password varchar2(20));inse ...

  3. @transient加在属性前的作用

    我们都知道一个对象只要实现了Serilizable接口,这个对象就可以被序列化,java的这种序列化模式为开发者提供了很多便利,我们可以不必关系具体序列化的过程,只要这个类实现了Serilizable ...

  4. 6.Spring MVC SSM整合问题总结

    1.Cannot find class [org.springframework.http.converter.json.MappingJacksonHttpMessageConverter] for ...

  5. Kubernetes简述

    一.Kubernetes特性 1.自动装箱 建构于容器之上,基于资源依赖及其他约束自动完成容器部署且不影响其可用性,并通过调度机制混合关键型应用和非关键型应用的工作负载于一点以提高资源利用率. 2.自 ...

  6. Entity Framework 6.X实现记录执行的SQL功能

    Entity Framework在使用时,很多时间操纵的是Model,并没有写sql语句,有时候为了调试或优化等,又需要追踪Entity framework自动生成的sql(最好还能记录起来,方便出错 ...

  7. [翻译] IDMPhotoBrowser

    IDMPhotoBrowser IDMPhotoBrowser is a new implementation based on MWPhotoBrowser. IDMPhotoBrowser实现了图 ...

  8. 关于第一场HBCTF的Web题小分享,当作自身的笔记

    昨天晚上6点开始的HBCTF,虽然是针对小白的,但有些题目确实不简单. 昨天女朋友又让我帮她装DOTA2(女票是一个不怎么用电脑的),然后又有一个小白问我题目,我也很热情的告诉她了,哎,真耗不起. 言 ...

  9. 那些不明不白的$符号设计--Sass和Emmet,变量设计原理相通

    以前看到php变量的定义,直接使用$符号开始,怎么看都不习惯.后来呀,在使用Emmet的过程中,又接触到了$符号.今天,在学习Sass的过程种,再一次接触到$符号,兴致所致,不由得想写一篇,对比一下搞 ...

  10. 超链接<a>标签用法

    1.a标签点击事件 1>1a href="javascript:js_method();" 这是我们平台上常用的方法,但是这种方法在传递this等参数的时候很容易出问题,而且 ...