Data Engineering Data Pipeline Outline [DE] How to learn Big Data[了解大数据] [DE] Pipeline for Data Engineering[工作流案例示范] [DE] ML on Big data: MLlib[大数据的机器学习方案] DE基础(厦大) [Spark] 00 - Install Hadoop & Spark[ing] [Spark] 01 - What is Spark[大数据生态库] [Spark]…
1)外部数据源 val distFile1 = sc.textFile("data.txt") //本地当前目录下文件 val distFile2 =sc.textFile("hdfs://192.168.121.12:8020/input/data.txt") //HDFS文件 val distFile3 =sc.textFile("file:/input/data.txt") //本地指定目录下文件 val distFile4 =sc.t…
原文地址:https://my.oschina.net/tearsky/blog/629201 摘要: 1.Operation category READ is not supported in state standby 2.配置spark.deploy.recoveryMode选项为ZOOKEEPER 3.多Master如何配置 4.No Space Left on the device(Shuffle临时文件过多) 5.java.lang.OutOfMemory, unable to cr…