Spark (53): A First Try at Spark RPC
For basic usage there is really only one point to grasp:
Spark RPC follows a master/slave pattern: the driver plays the master and the executors play the slaves.
For an executor to interact with the driver, it must obtain the driver's RpcEndpointRef and call the driver's interface through that reference.
When the driver starts, it registers an Endpoint service and exposes its own IP and port. To build the driver's RpcEndpointRef, the executor needs just two parameters: the driver's host (IP) and port, as the sketch below shows.
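In code, that lookup is a single call. A minimal sketch, assuming the driver has already registered an endpoint named "hello-rpc-service", that rpcEnv is the executor side's own RpcEnv, and that "driver-host" and 57992 are placeholders for the real driver address:

import org.apache.spark.rpc.{RpcAddress, RpcEndpointRef}

// Resolve a reference to the driver's endpoint from host, port and endpoint name
val driverRef: RpcEndpointRef =
  rpcEnv.setupEndpointRef(RpcAddress("driver-host", 57992), "hello-rpc-service")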
Import the Maven dependency
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.4.0</version>
</dependency>
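If the project is built with sbt instead of Maven, the equivalent dependency would be the line below (assuming scalaVersion is set to a 2.11.x release, so that %% resolves to spark-core_2.11):

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0"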
Define the RPC server's hostname (localhost), port (57992), and service name (hello-rpc-service)
object HelloRpcSettings {
  val rpcName = "hello-rpc-service"
  val port = 57992
  val hostname = "localhost"

  def getName(): String = rpcName
  def getPort(): Int = port
  def getHostname(): String = hostname
}
Define the RPC Endpoint class and the message case classes SayHi/SayBye
import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEnv}

// Message types exchanged between client and endpoint
case class SayHi(msg: String)
case class SayBye(msg: String)

class HelloEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {

  override def onStart(): Unit = {
    println(rpcEnv.address)
    println("start hello endpoint")
  }

  // Handles one-way messages delivered via RpcEndpointRef.send
  override def receive: PartialFunction[Any, Unit] = {
    case SayHi(msg) =>
      println(s"receive $msg")
  }

  // Handles request/response messages delivered via RpcEndpointRef.ask / askSync
  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case SayHi(msg) =>
      println(s"receive $msg")
      context.reply(s"hi, $msg")
    case SayBye(msg) =>
      println(s"receive $msg")
      context.reply(s"bye, $msg")
  }

  override def onStop(): Unit = {
    println("stop hello endpoint")
  }
}
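Which handler fires depends on how the caller invokes the endpoint: send is fire-and-forget and is dispatched to receive, while ask (asynchronous) and askSync (blocking) expect a reply and are dispatched to receiveAndReply. A minimal sketch, assuming endPointRef is an RpcEndpointRef already resolved for this endpoint (as in the consumer further below):

endPointRef.send(SayHi("ping"))                      // fire-and-forget, dispatched to receive
val f = endPointRef.ask[String](SayHi("ping"))       // asynchronous, dispatched to receiveAndReply, returns Future[String]
val s = endPointRef.askSync[String](SayBye("ping"))  // blocking, dispatched to receiveAndReply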
Define the RPC service provider
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}
import org.apache.spark.rpc._
import org.apache.spark.sql.SparkSession

object RpcServerTest {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
    val sparkSession = SparkSession.builder().config(conf).master("local[*]").appName("test rpc").getOrCreate()
    val sparkContext: SparkContext = sparkSession.sparkContext
    val sparkEnv: SparkEnv = sparkContext.env

    // Create a server-mode RpcEnv (clientMode = false) bound to localhost:57992;
    // the hostname is passed twice, as the bind address and the advertised address
    val rpcEnv = RpcEnv.create(HelloRpcSettings.getName(), HelloRpcSettings.getHostname(), HelloRpcSettings.getHostname(),
      HelloRpcSettings.getPort(), conf, sparkEnv.securityManager, 1, false)

    // Register the endpoint under the service name so clients can look it up
    val helloEndpoint: RpcEndpoint = new HelloEndpoint(rpcEnv)
    rpcEnv.setupEndpoint(HelloRpcSettings.getName(), helloEndpoint)

    // Block the main thread so the endpoint keeps serving requests
    rpcEnv.awaitTermination()
  }
}
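rpcEnv.awaitTermination() blocks the main thread until the RpcEnv is shut down, so the provider keeps serving requests until the process is killed. If a cleaner exit is wanted, the RpcEnv can also be shut down explicitly; a minimal sketch of a hypothetical shutdown hook (not in the original code), placed before awaitTermination():

// Hypothetical addition: shut the RpcEnv down on JVM exit, which also stops the registered endpoint
sys.addShutdownHook {
  rpcEnv.shutdown()
}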
Define the RPC service consumer
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}
import org.apache.spark.rpc.{RpcAddress, RpcEndpointRef, RpcEnv}
import org.apache.spark.sql.SparkSession

import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object RpcClientTest {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
    val sparkSession = SparkSession.builder().config(conf).master("local[*]").appName("test rpc").getOrCreate()
    val sparkContext: SparkContext = sparkSession.sparkContext
    val sparkEnv: SparkEnv = sparkContext.env

    // The consumer needs its own RpcEnv to talk to the remote endpoint
    val rpcEnv: RpcEnv = RpcEnv.create(HelloRpcSettings.getName(), HelloRpcSettings.getHostname(),
      HelloRpcSettings.getPort(), conf, sparkEnv.securityManager, false)

    // Resolve the provider's endpoint by address and name
    val endPointRef: RpcEndpointRef = rpcEnv.setupEndpointRef(
      RpcAddress(HelloRpcSettings.getHostname(), HelloRpcSettings.getPort()), HelloRpcSettings.getName())

    import scala.concurrent.ExecutionContext.Implicits.global

    // Fire-and-forget: handled by receive, no reply
    endPointRef.send(SayHi("test send"))

    // Asynchronous request: handled by receiveAndReply, returns a Future
    val future: Future[String] = endPointRef.ask[String](SayHi("neo"))
    future.onComplete {
      case scala.util.Success(value) => println(s"Got the result = $value")
      case scala.util.Failure(e)     => println(s"Got error: $e")
    }
    Await.result(future, Duration.apply("30s"))

    // Blocking request: handled by receiveAndReply, waits for the reply
    val res = endPointRef.askSync[String](SayBye("test askSync"))
    println(res)

    sparkSession.stop()
  }
}
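Two details are worth noting about the consumer. First, it creates its own RpcEnv with the same hostname and port as the provider; since the provider already occupies 57992, the bind fails and Spark falls back to the next free port, which is the "hello-rpc-service could not bind" warning in the consumer log below. Second, the onComplete callback runs asynchronously on the global ExecutionContext, so its output can appear after the askSync result, which is exactly the ordering seen in that log.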
Start the RPC service provider
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO SparkContext: Running Spark version 2.4.
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: INFO SparkContext: Submitted application: test rpc
// :: INFO SecurityManager: Changing view acls to: boco
// :: INFO SecurityManager: Changing modify acls to: boco
// :: INFO SecurityManager: Changing view acls groups to:
// :: INFO SecurityManager: Changing modify acls groups to:
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(boco); groups with view permissions: Set(); users with modify permissions: Set(boco); groups with modify permissions: Set()
// :: INFO Utils: Successfully started service 'sparkDriver' on port .
// :: INFO SparkEnv: Registering MapOutputTracker
// :: INFO SparkEnv: Registering BlockManagerMaster
// :: INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
// :: INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
// :: INFO DiskBlockManager: Created local directory at C:\Users\boco\AppData\Local\Temp\blockmgr-7128dde8-9c46--bb72-c2161ba65bf7
// :: INFO MemoryStore: MemoryStore started with capacity 901.8 MB
// :: INFO SparkEnv: Registering OutputCommitCoordinator
// :: INFO Utils: Successfully started service 'SparkUI' on port .
// :: INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-JL4FSCV:4040
// :: INFO Executor: Starting executor ID driver on host localhost
// :: INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port .
// :: INFO NettyBlockTransferService: Server created on DESKTOP-JL4FSCV:
// :: INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
// :: INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-JL4FSCV: with 901.8 MB RAM, BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO Utils: Successfully started service 'hello-rpc-service' on port .
localhost:
start hello endpoint
Start the RPC service consumer
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO SparkContext: Running Spark version 2.4.
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: INFO SparkContext: Submitted application: test rpc
// :: INFO SecurityManager: Changing view acls to: boco
// :: INFO SecurityManager: Changing modify acls to: boco
// :: INFO SecurityManager: Changing view acls groups to:
// :: INFO SecurityManager: Changing modify acls groups to:
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(boco); groups with view permissions: Set(); users with modify permissions: Set(boco); groups with modify permissions: Set()
// :: INFO Utils: Successfully started service 'sparkDriver' on port .
// :: INFO SparkEnv: Registering MapOutputTracker
// :: INFO SparkEnv: Registering BlockManagerMaster
// :: INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
// :: INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
// :: INFO DiskBlockManager: Created local directory at C:\Users\boco\AppData\Local\Temp\blockmgr-6a0b8e7f-86d2-4bb8-b45c-7c04deabcb91
// :: INFO MemoryStore: MemoryStore started with capacity 901.8 MB
// :: INFO SparkEnv: Registering OutputCommitCoordinator
// :: WARN Utils: Service 'SparkUI' could not bind on port . Attempting port .
// :: INFO Utils: Successfully started service 'SparkUI' on port .
// :: INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-JL4FSCV:4041
// :: INFO Executor: Starting executor ID driver on host localhost
// :: INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port .
// :: INFO NettyBlockTransferService: Server created on DESKTOP-JL4FSCV:
// :: INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
// :: INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-JL4FSCV: with 901.8 MB RAM, BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-JL4FSCV, , None)
// :: WARN Utils: Service 'hello-rpc-service' could not bind on port . Attempting port .
// :: INFO Utils: Successfully started service 'hello-rpc-service' on port .
// :: INFO TransportClientFactory: Successfully created connection to localhost/127.0.0.1: after ms ( ms spent in bootstraps)
bye, test askSync
Got the result = hi, neo
// :: INFO SparkUI: Stopped Spark web UI at http://DESKTOP-JL4FSCV:4041
// :: INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
// :: INFO MemoryStore: MemoryStore cleared
// :: INFO BlockManager: BlockManager stopped
// :: INFO BlockManagerMaster: BlockManagerMaster stopped
// :: INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
// :: INFO SparkContext: Successfully stopped SparkContext
// :: INFO ShutdownHookManager: Shutdown hook called
// :: INFO ShutdownHookManager: Deleting directory
At this point the RPC service provider prints the following:
receive test send
receive neo
receive test askSync
// :: WARN TransportChannelHandler: Exception in connection from /127.0.0.1:
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:)
at sun.nio.ch.IOUtil.read(IOUtil.java:)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:)
at io.netty.util.concurrent.SingleThreadEventExecutor$.run(SingleThreadEventExecutor.java:)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:)
at java.lang.Thread.run(Thread.java:)
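The IOException above is expected in this example: it is logged on the provider side when the consumer calls sparkSession.stop() and closes its connection while the provider's Netty channel is still open, and it can be ignored here.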