http://192.168.2.51:4041

http://hadoop1:8088/proxy/application_1512362707596_0006/executors/

Executors

Summary

 
| | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write | Blacklisted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Active(3) | 54 | 1.4 GB / 1.2 GB | 700.1 MB | 2 | 50 | 0 | 22 | 72 | 6.5 min (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Dead(0) | 0 | 0.0 B / 0.0 B | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Total(3) | 54 | 1.4 GB / 1.2 GB | 700.1 MB | 2 | 50 | 0 | 22 | 72 | 6.5 min (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
 

Executors

| Executor ID | Address | Status | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| driver | 192.168.2.51:52491 | Active | 2 | 5.7 KB / 384.1 MB | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 2 | hadoop2:33018 | Active | 26 | 729.5 MB / 384.1 MB | 348.1 MB | 1 | 25 | 0 | 11 | 36 | 2.6 min (1 s) | 0.0 B | 0.0 B | 0.0 B |
| 1 | hadoop1:53695 | Active | 26 | 700.1 MB / 384.1 MB | 352 MB | 1 | 25 | 0 | 11 | 36 | 3.9 min (0.9 s) | 0.0 B | 0.0 B | 0.0 B |

from pyspark.sql import SparkSession

# Build a session on YARN and point the MongoDB connector at the source collection.
my_spark = SparkSession \
    .builder \
    .appName("myAppYarn-10g") \
    .master('yarn') \
    .config("spark.mongodb.input.uri", "mongodb://pyspark_admin:admin123@192.168.2.50/recommendation.article") \
    .config("spark.mongodb.output.uri", "mongodb://pyspark_admin:admin123@192.168.2.50/recommendation.article") \
    .getOrCreate()

# collect() pulls every row of the collection back to the driver.
db_rows = my_spark.read.format("com.mongodb.spark.sql.DefaultSource").load().collect()

Summary

 
| | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write | Blacklisted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Active(3) | 31 | 748.4 MB / 1.2 GB | 75.7 MB | 2 | 27 | 0 | 0 | 27 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Dead(2) | 56 | 1.5 GB / 768.2 MB | 790.3 MB | 2 | 0 | 0 | 77 | 77 | 2.7 h (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Total(5) | 87 | 2.3 GB / 1.9 GB | 865.9 MB | 4 | 27 | 0 | 77 | 104 | 2.7 h (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
 

Executors

| Executor ID | Address | Status | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| driver | 192.168.2.51:52491 | Active | 2 | 5.7 KB / 384.1 MB | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 4 | hadoop2:34394 | Active | 12 | 315.9 MB / 384.1 MB | 0.0 B | 1 | 11 | 0 | 0 | 11 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 3 | hadoop1:39620 | Active | 17 | 432.5 MB / 384.1 MB | 75.7 MB | 1 | 16 | 0 | 0 | 16 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 2 | hadoop2:33018 | Dead | 27 | 758.7 MB / 384.1 MB | 390.4 MB | 1 | 0 | 0 | 38 | 38 | 1.3 h (1 s) | 0.0 B | 0.0 B | 0.0 B |
| 1 | hadoop1:53695 | Dead | 29 | 775.9 MB / 384.1 MB | 399.9 MB | 1 | 0 | 0 | 39 | 39 | 1.4 h (0.9 s) | 0.0 B | 0.0 B | 0.0 B |
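
To make the Storage Memory column easier to compare across executors, the following is a small sketch that parses the "used / total" cells from the second executor table and lists the executors whose reported usage exceeds the reported total. The helper names `parse_size` and `storage_ratio` are hypothetical, and treating the UI units as binary multiples (1 MB = 1024 × 1024 bytes) is an assumption; the ratio does not depend on that choice as long as both sides use the same unit.

```python
import re

# Assumption: the UI uses binary multiples for its size units.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Convert a UI size string such as '758.7 MB' to a byte count."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(B|KB|MB|GB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def storage_ratio(cell):
    """Return used / total for a 'used / total' Storage Memory cell."""
    used, total = cell.split("/")
    return parse_size(used) / parse_size(total)


# Storage Memory cells copied from the second executor table (executor ID -> cell).
cells = {
    "4": "315.9 MB / 384.1 MB",
    "3": "432.5 MB / 384.1 MB",
    "2": "758.7 MB / 384.1 MB",
    "1": "775.9 MB / 384.1 MB",
}

# Executors whose reported usage is larger than the reported total.
over = sorted(eid for eid, cell in cells.items() if storage_ratio(cell) > 1.0)
print(over)
```

With the figures above, executors 1, 2 and 3 report more storage memory in use than their 384.1 MB total, and those are also the executors with a non-zero Disk Used value.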
 
 
Logs for container_1512362707596_0006_02_000002 http://hadoop1:8042/node/containerlogs/container_1512362707596_0006_02_000002/root/stderr?start=-4096
 
 
 
 


Showing the last 4096 bytes of stderr:

Manager: Dropping block taskresult_48 from memory
17/12/04 13:14:32 INFO storage.BlockManager: Writing block taskresult_48 to disk
17/12/04 13:14:32 INFO memory.MemoryStore: After dropping 1 blocks, free memory is 38.5 MB
17/12/04 13:14:32 INFO memory.MemoryStore: Block taskresult_73 stored as bytes in memory (estimated size 32.5 MB, free 6.1 MB)
17/12/04 13:14:32 INFO executor.Executor: Finished task 72.0 in stage 1.0 (TID 73). 34033291 bytes result sent via BlockManager)
17/12/04 13:14:32 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 74
17/12/04 13:14:32 INFO executor.Executor: Running task 73.0 in stage 1.0 (TID 74)
17/12/04 13:14:38 INFO memory.MemoryStore: 1 blocks selected for dropping (16.0 MB bytes)
17/12/04 13:14:38 INFO storage.BlockManager: Dropping block taskresult_50 from memory
17/12/04 13:14:38 INFO storage.BlockManager: Writing block taskresult_50 to disk
17/12/04 13:14:38 INFO memory.MemoryStore: After dropping 1 blocks, free memory is 22.1 MB
17/12/04 13:14:38 INFO memory.MemoryStore: Block taskresult_74 stored as bytes in memory (estimated size 14.4 MB, free 7.7 MB)
17/12/04 13:14:38 INFO executor.Executor: Finished task 73.0 in stage 1.0 (TID 74). 15083225 bytes result sent via BlockManager)
17/12/04 13:14:38 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 75
17/12/04 13:14:38 INFO executor.Executor: Running task 74.0 in stage 1.0 (TID 75)
17/12/04 13:14:46 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.2 KB, free 7.7 MB)
17/12/04 13:14:46 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 433.0 B, free 7.7 MB)
17/12/04 13:14:48 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
17/12/04 13:14:48 ERROR executor.Executor: Exception in task 74.0 in stage 1.0 (TID 75)
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
at java.io.ObjectOutputStream.write(ObjectOutputStream.java:709)
at org.apache.spark.util.Utils$.writeByteBuffer(Utils.scala:239)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply$mcV$sp(TaskResult.scala:50)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply(TaskResult.scala:48)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply(TaskResult.scala:48)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.scheduler.DirectTaskResult.writeExternal(TaskResult.scala:48)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:403)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
17/12/04 13:14:48 INFO connection.MongoClientCache: Closing MongoClient: [192.168.2.50:27017]
17/12/04 13:14:48 INFO driver.connection: Closed connection [connectionId{localValue:4, serverValue:42}] to 192.168.2.50:27017 because the pool has been closed.
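
As a quick check on the numbers in this log, the sketch below converts the two "bytes result sent via BlockManager" figures to MiB and compares them with a 1 MiB threshold. That threshold is an assumption: it is Spark's documented default for `spark.task.maxDirectResultSize`, and the post does not show whether this job overrides it. The helper names are hypothetical.

```python
# Byte counts copied from the two "Finished task ... bytes result sent via BlockManager" lines.
RESULT_BYTES = {73: 34033291, 74: 15083225}  # TID -> serialized task result size

# Assumption: default spark.task.maxDirectResultSize (1 MiB); not confirmed by the post.
MAX_DIRECT_RESULT_SIZE = 1 << 20


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / (1024 * 1024)


def sent_via_block_manager(n_bytes, threshold=MAX_DIRECT_RESULT_SIZE):
    """True when a result is too large to be returned to the driver directly."""
    return n_bytes > threshold


# TID -> (size in MiB rounded to one decimal, whether it exceeds the threshold)
summary = {
    tid: (round(to_mib(size), 1), sent_via_block_manager(size))
    for tid, size in RESULT_BYTES.items()
}

for tid, (mib, indirect) in summary.items():
    print(f"TID {tid}: {mib} MiB, via BlockManager: {indirect}")
```

The rounded sizes, 32.5 and 14.4, match the "estimated size" values logged for taskresult_73 and taskresult_74. Each result produced by `collect()` is therefore held as a block on the executor until the driver fetches it, which is consistent with the MemoryStore lines showing only a few MB free just before the `OutOfMemoryError`.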
 
 
 
