Akka Source Code Analysis: local DeathWatch

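Lifecycle monitoring, also known as DeathWatch, is a commonly used mechanism in Akka programming. For example, once we hold the ActorRef of some actor, we may want to receive a corresponding message after that actor dies. The watch function serves exactly this purpose.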

import akka.actor.{ Actor, Props, Terminated }

class WatchActor extends Actor {
  val child = context.actorOf(Props.empty, "child")
  context.watch(child) // <-- this is the only call needed for registration
  var lastSender = context.system.deadLetters

  def receive = {
    case "kill" ⇒
      context.stop(child); lastSender = sender()
    case Terminated(`child`) ⇒ lastSender ! "finished"
  }
}

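We start from an example in the official documentation. DeathWatch is really quite convenient to use: call context.watch, and once the watched actor stops for whatever reason, the watcher receives a Terminated message. Pattern matching on that message exposes a single field, the ActorRef of the stopped actor. It looks simple, but how is it actually implemented?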

  /**
   * Registers this actor as a Monitor for the provided ActorRef.
   * This actor will receive a Terminated(subject) message when watched
   * actor is terminated.
   *
   * `watch` is idempotent if it is not mixed with `watchWith`.
   *
   * It will fail with an [[IllegalStateException]] if the same subject was watched before using `watchWith`.
   * To clear the termination message, unwatch first.
   *
   * *Warning*: This method is not thread-safe and must not be accessed from threads other
   * than the ordinary actor message processing thread, such as [[java.util.concurrent.CompletionStage]] and [[scala.concurrent.Future]] callbacks.
   *
   * @return the provided ActorRef
   */
  def watch(subject: ActorRef): ActorRef

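Above is the official doc comment for watch in ActorContext. It is very simple: watch an actor and you will receive the corresponding Terminated message. It also notes that the method is not thread-safe.

If you have read my earlier source analysis articles, you know that context is an instance of ActorContext, and that ActorContext is one facet of ActorCell's functionality, so the concrete implementation of watch should be inside ActorCell. Since ActorCell mixes in quite a few traits, I will not trace how to find which one implements watch and will just give the answer: dungeon.DeathWatch.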

private[akka] trait DeathWatch { this: ActorCell ⇒

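First, it is a trait with a self-type annotation. I have complained about this style before, so I will not go into it here. Let's look at how watch is implemented.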

  override final def watch(subject: ActorRef): ActorRef = subject match {
    case a: InternalActorRef ⇒
      if (a != self) {
        if (!watchingContains(a))
          maintainAddressTerminatedSubscription(a) {
            a.sendSystemMessage(Watch(a, self)) // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
            updateWatching(a, None)
          }
        else
          checkWatchingSame(a, None)
      }
      a
  }

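From the source above we can pick out a few simple points: 1. an actor cannot watch itself (the call is silently ignored and the reference is returned); 2. if the subject is already being watched, checkWatchingSame is called; 3. if it is not yet watched, a Watch system message is sent to the subject; 4. the watcher's own watching record is then updated.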
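The four points above can be sketched as a small plain-Scala model. This is only an illustration under stated assumptions, not Akka's code: `Ref`, `Watcher`, `register` and `sentSystemMessages` are hypothetical names, and the conflict check merely stands in for checkWatchingSame, whose body is not shown in this article (the doc comment quoted earlier only says it fails with an IllegalStateException when watch is mixed with watchWith).

```scala
// Simplified model of the watcher-side bookkeeping; these are NOT Akka's real classes.
object WatchModel {
  final case class Ref(path: String)

  final class Watcher(val self: Ref) {
    // None = deliver the standard Terminated, Some(msg) = deliver a custom message
    private var watching: Map[Ref, Option[Any]] = Map.empty
    // Records the Watch system messages that would have been sent
    var sentSystemMessages: Vector[String] = Vector.empty

    def watch(subject: Ref): Ref = register(subject, None)
    def watchWith(subject: Ref, msg: Any): Ref = register(subject, Some(msg))

    private def register(subject: Ref, msg: Option[Any]): Ref = {
      if (subject != self) { // watching yourself is silently ignored
        if (!watching.contains(subject)) {
          sentSystemMessages :+= s"Watch(${subject.path}, ${self.path})"
          watching = watching.updated(subject, msg)
        } else if (watching(subject) != msg) { // stands in for checkWatchingSame
          throw new IllegalStateException(
            s"${subject.path} is already watched with a different termination message")
        }
      }
      subject
    }

    def watchedCount: Int = watching.size
  }
}
```

Calling `watch` twice on the same subject in this model sends only one Watch message, which mirrors the idempotence promised by the doc comment.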

  /**
   * This map holds a [[None]] for actors for which we send a [[Terminated]] notification on termination,
   * ``Some(message)`` for actors for which we send a custom termination message.
   */
  private var watching: Map[ActorRef, Option[Any]] = Map.empty

  //   when all actor references have uid, i.e. actorFor is removed
  private def watchingContains(subject: ActorRef): Boolean =
    watching.contains(subject) || (subject.path.uid != ActorCell.undefinedUid &&
      watching.contains(new UndefinedUidActorRef(subject)))

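This checks whether the subject is already being watched, and the implementation is rather interesting. watching is a Map. It first checks whether the Map contains the ActorRef; if it does not, and the ActorRef has a defined UID, it creates an UndefinedUidActorRef and checks whether watching contains that instead. Isn't that odd? If the Map does not contain the ref, how could wrapping it in an UndefinedUidActorRef make it match? It seems strange at first, but it is not. Let's look at how ActorRef defines equals.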

  /**
   * Equals takes path and the unique id of the actor cell into account.
   */
  final override def equals(that: Any): Boolean = that match {
    case other: ActorRef ⇒ path.uid == other.path.uid && path == other.path
    case _               ⇒ false
  }

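The logic above is clear: two ActorRefs are equal only if their paths are equal and their uids are equal. I will not analyze ActorPath equality; it naturally requires every level of the path to match.

So can the path be the same while the uid differs? Of course. Suppose an actor is stopped and then created again with the same actorOf arguments: the uid is different, but the path is the same.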

private[akka] class UndefinedUidActorRef(ref: ActorRef) extends MinimalActorRef {
  override val path = ref.path.withUid(ActorCell.undefinedUid)
  override def provider = throw new UnsupportedOperationException("UndefinedUidActorRef does not provide")
}

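UndefinedUidActorRef is simply a new ActorRef with the same path as the original but with ActorCell.undefinedUid as its uid.

maintainAddressTerminatedSubscription checks whether the ActorRef is local. For a local actor it simply runs the block that follows; for a remote actor it does some extra work around the block, which I will not analyze here.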
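To see why the second lookup in watchingContains can succeed when the first one fails, here is a minimal plain-Scala sketch of the same equality rule. `Path`, `Ref` and the value `0` for `UndefinedUid` are assumptions made for illustration only; they are not Akka's types or constants.

```scala
// Toy model of "equality = same path AND same uid"; NOT Akka's ActorRef/ActorPath.
object UidModel {
  val UndefinedUid = 0 // assumption: stands in for ActorCell.undefinedUid

  final case class Path(elements: String, uid: Int) {
    def withUid(newUid: Int): Path = copy(uid = newUid)
  }

  final class Ref(val path: Path) {
    // Same rule as the ActorRef.equals quoted above: uid and path must both match
    override def equals(that: Any): Boolean = that match {
      case other: Ref => path.uid == other.path.uid && path.elements == other.path.elements
      case _          => false
    }
    override def hashCode: Int = (path.elements, path.uid).hashCode
  }

  // Mirrors the shape of watchingContains: try the ref itself, then its undefined-uid twin
  def watchingContains(watching: Set[Ref], subject: Ref): Boolean =
    watching.contains(subject) ||
      (subject.path.uid != UndefinedUid &&
        watching.contains(new Ref(subject.path.withUid(UndefinedUid))))
}
```

In this model, two incarnations of `/user/child` with different uids are not equal, yet a watch that was registered through a reference without a uid still matches either incarnation, which is presumably the case the second lookup is meant to cover.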

  private def updateWatching(ref: InternalActorRef, newMessage: Option[Any]): Unit =
    watching = watching.updated(ref, newMessage)

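updateWatching is simple: it inserts the ActorRef to be watched into the watching Map. As for the value stored for this ActorRef, the comment on watching above already says it: None means the standard Terminated message, and Some(message) means a custom message, which is what watchWith registers; I will not analyze watchWith here. Next, let's see how the watched actor responds after receiving Watch.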

case Watch(watchee, watcher) ⇒ addWatcher(watchee, watcher)

It hits the branch above in ActorCell.systemInvoke.

  protected def addWatcher(watchee: ActorRef, watcher: ActorRef): Unit = {
    val watcheeSelf = watchee == self
    val watcherSelf = watcher == self

    if (watcheeSelf && !watcherSelf) {
      if (!watchedBy.contains(watcher)) maintainAddressTerminatedSubscription(watcher) {
        watchedBy += watcher
        if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), s"now watched by $watcher"))
      }
    } else if (!watcheeSelf && watcherSelf) {
      watch(watchee)
    } else {
      publish(Warning(self.path.toString, clazz(actor), "BUG: illegal Watch(%s,%s) for %s".format(watchee, watcher, self)))
    }
  }

Normally the first if branch is hit. It is fairly simple: it looks up whether watcher is already stored in watchedBy and, if not, adds it.
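The three-way branching in addWatcher can be summarised with a small plain-Scala model. The `Outcome` values and the `Cell` class are hypothetical names introduced only for this sketch; actor references are reduced to strings.

```scala
// Toy model of the watchee-side decision in addWatcher; NOT Akka's ActorCell.
object AddWatcherModel {
  sealed trait Outcome
  case object Registered extends Outcome        // we are the watchee: remember the watcher
  case object AlreadyRegistered extends Outcome // watcher was already in watchedBy
  case object ForwardedToWatch extends Outcome  // we are the watcher: fall back to watch(watchee)
  case object Illegal extends Outcome           // neither or both: reported as a bug

  final class Cell(val self: String) {
    var watchedBy: Set[String] = Set.empty

    def addWatcher(watchee: String, watcher: String): Outcome = {
      val watcheeSelf = watchee == self
      val watcherSelf = watcher == self
      if (watcheeSelf && !watcherSelf) {
        if (watchedBy.contains(watcher)) AlreadyRegistered
        else { watchedBy += watcher; Registered }
      } else if (!watcheeSelf && watcherSelf) ForwardedToWatch
      else Illegal
    }
  }
}
```

Because watchedBy is a Set and the membership check comes first, repeated Watch messages from the same watcher leave the state unchanged.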

private var watchedBy: Set[ActorRef] = ActorCell.emptyActorRefSet

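watchedBy is a Set, so the ActorRefs in it are unique. So when this actor is stopped, when are the entries in watchedBy notified? This question is actually rather complicated.

To know when watchedBy is notified, we need to understand the stop logic. So how is stop implemented in ActorCell?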

  // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
  final def stop(): Unit = try dispatcher.systemDispatch(this, Terminate()) catch handleException

stop is implemented in the Dispatch trait. It is simple: it uses the current dispatcher to send a Terminate system message to the cell itself.

case Terminate() ⇒ terminate()

On receiving Terminate, the terminate method is called.

  protected def terminate() {
    setReceiveTimeout(Duration.Undefined)
    cancelReceiveTimeout

    // prevent Deadletter(Terminated) messages
    unwatchWatchedActors(actor)

    // stop all children, which will turn childrenRefs into TerminatingChildrenContainer (if there are children)
    children foreach stop

    if (systemImpl.aborting) {
      // separate iteration because this is a very rare case that should not penalize normal operation
      children foreach {
        case ref: ActorRefScope if !ref.isLocal ⇒ self.sendSystemMessage(DeathWatchNotification(ref, true, false))
        case _                                  ⇒
      }
    }

    val wasTerminating = isTerminating

    if (setChildrenTerminationReason(ChildrenContainer.Termination)) {
      if (!wasTerminating) {
        // do not process normal messages while waiting for all children to terminate
        suspendNonRecursive()
        // do not propagate failures during shutdown to the supervisor
        setFailed(self)
        if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), "stopping"))
      }
    } else {
      setTerminated()
      finishTerminate()
    }
  }

The terminate method has clear logic: it tells the child actors to stop. So how are the children stopped?

  final def stop(actor: ActorRef): Unit = {
    if (childrenRefs.getByRef(actor).isDefined) {
      @tailrec def shallDie(ref: ActorRef): Boolean = {
        val c = childrenRefs
        swapChildrenRefs(c, c.shallDie(ref)) || shallDie(ref)
      }

      if (actor match {
        case r: RepointableRef ⇒ r.isStarted
        case _                 ⇒ true
      }) shallDie(actor)
    }
    actor.asInstanceOf[InternalActorRef].stop()
  }

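This is fairly simple: it checks whether the given actor is one of the current children; if so and it has been started, it calls the local shallDie, which replaces childrenRefs via swapChildrenRefs. Finally it calls stop() on the child, so stopping proceeds recursively.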

override def shallDie(actor: ActorRef): ChildrenContainer = TerminatingChildrenContainer(c, Set(actor), UserRequest)

shallDie simply creates a TerminatingChildrenContainer, which then replaces childrenRefs.
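The `swap(...) || retry` shape used by the local shallDie helper is a compare-and-set retry loop. Below is a minimal sketch of that pattern, assuming a `java.util.concurrent.atomic.AtomicReference` as the storage; how Akka actually stores childrenRefs is not shown in this article, and `Children`, `markShallDie` and `current` are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

// Toy compare-and-set retry loop; the real ChildrenContainer is far richer than this.
object CasModel {
  final case class Children(alive: Set[String], toDie: Set[String]) {
    // Returns a NEW immutable container with the child marked as dying
    def shallDie(name: String): Children = copy(toDie = toDie + name)
  }

  private val childrenRefs = new AtomicReference(Children(Set("a", "b"), Set.empty))

  // Succeeds only if nobody replaced the container since we read it
  private def swapChildrenRefs(oldC: Children, newC: Children): Boolean =
    childrenRefs.compareAndSet(oldC, newC)

  // Read, compute the new value, try to swap; on failure read again and retry
  @tailrec def markShallDie(name: String): Boolean = {
    val c = childrenRefs.get()
    swapChildrenRefs(c, c.shallDie(name)) || markShallDie(name)
  }

  def current: Children = childrenRefs.get()
}
```

The right-hand side of `||` is only evaluated when the swap fails, so the recursion is in tail position and `@tailrec` is accepted.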

  @tailrec final protected def setChildrenTerminationReason(reason: ChildrenContainer.SuspendReason): Boolean = {
    childrenRefs match {
      case c: ChildrenContainer.TerminatingChildrenContainer ⇒
        swapChildrenRefs(c, c.copy(reason = reason)) || setChildrenTerminationReason(reason)
      case _ ⇒ false
    }
  }

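The last if statement in terminate calls setChildrenTerminationReason. If there are children, childrenRefs is already a TerminatingChildrenContainer at this point, so it returns true and the actor waits for its children to terminate. If there are no children it returns false, and the else branch calls finishTerminate right away.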

  private def finishTerminate() {
    val a = actor
    /* The following order is crucial for things to work properly. Only change this if you're very confident and lucky.
     *
     * Please note that if a parent is also a watcher then ChildTerminated and Terminated must be processed in this
     * specific order.
     */
    try if (a ne null) a.aroundPostStop()
    catch handleNonFatalOrInterruptedException { e ⇒ publish(Error(e, self.path.toString, clazz(a), e.getMessage)) }
    finally try dispatcher.detach(this)
    finally try parent.sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))
    finally try stopFunctionRefs()
    finally try tellWatchersWeDied()
    finally try unwatchWatchedActors(a) // stay here as we expect an emergency stop from handleInvokeFailure
    finally {
      if (system.settings.DebugLifecycle)
        publish(Debug(self.path.toString, clazz(a), "stopped"))

      clearActorFields(a, recreate = false)
      clearActorCellFields(this)
      actor = null
    }
  }

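So finishTerminate is eventually called (immediately when there are no children, otherwise once the children have terminated), and finishTerminate in turn calls tellWatchersWeDied.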
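The chained `finally try ... finally try ...` in finishTerminate is worth a second look: each `finally` nests the next `try`, so every step runs even when an earlier one throws. A small self-contained sketch of that idiom follows; the step names are placeholders borrowed from the snippet above and the `step` helper is hypothetical.

```scala
// Demonstrates the nested try/finally idiom; the steps are placeholders, not Akka calls.
object FinallyChain {
  def run(): Vector[String] = {
    var log = Vector.empty[String]

    // Records the step, then optionally fails to simulate a broken cleanup action
    def step(name: String, fail: Boolean = false): Unit = {
      log :+= name
      if (fail) throw new RuntimeException(s"$name failed")
    }

    try {
      try step("detach", fail = true)
      finally try step("notifyParent")
      finally try step("tellWatchersWeDied")
      finally step("clearFields")
    } catch {
      case _: RuntimeException => () // swallowed only so the demo can return the log
    }
    log
  }
}
```

Even though the first step fails in this sketch, the later steps still execute in order, which is the property the cleanup sequence relies on.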

  protected def tellWatchersWeDied(): Unit =
    if (!watchedBy.isEmpty) {
      try {
        // Don't need to send to parent parent since it receives a DWN by default
        def sendTerminated(ifLocal: Boolean)(watcher: ActorRef): Unit =
          if (watcher.asInstanceOf[ActorRefScope].isLocal == ifLocal && watcher != parent)
            watcher.asInstanceOf[InternalActorRef].sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))

        /*
         * It is important to notify the remote watchers first, otherwise RemoteDaemon might shut down, causing
         * the remoting to shut down as well. At this point Terminated messages to remote watchers are no longer
         * deliverable.
         *
         * The problematic case is:
         *  1. Terminated is sent to RemoteDaemon
         *   1a. RemoteDaemon is fast enough to notify the terminator actor in RemoteActorRefProvider
         *   1b. The terminator is fast enough to enqueue the shutdown command in the remoting
         *  2. Only at this point is the Terminated (to be sent remotely) enqueued in the mailbox of remoting
         *
         * If the remote watchers are notified first, then the mailbox of the Remoting will guarantee the correct order.
         */
        watchedBy foreach sendTerminated(ifLocal = false)
        watchedBy foreach sendTerminated(ifLocal = true)
      } finally {
        maintainAddressTerminatedSubscription() {
          watchedBy = ActorCell.emptyActorRefSet
        }
      }
    }

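What does tellWatchersWeDied do? It sends a DeathWatchNotification system message to every ActorRef in watchedBy (except the parent, which receives one anyway), remote watchers first and local watchers second. Note that the first argument of DeathWatchNotification is self, the actor being stopped.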
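The two-pass notification can be modelled in a few lines of plain Scala. `Watcher` and `notificationOrder` are hypothetical names for this sketch; the model only records the order in which notifications would be sent.

```scala
// Toy model of "remote watchers first, then local, never the parent"; NOT Akka code.
object NotifyOrder {
  final case class Watcher(name: String, isLocal: Boolean)

  def notificationOrder(watchedBy: Set[Watcher], parent: Watcher): Vector[String] = {
    val sent = Vector.newBuilder[String]

    // Same curried shape as sendTerminated: the first argument selects which pass we are in
    def sendTerminated(ifLocal: Boolean)(w: Watcher): Unit =
      if (w.isLocal == ifLocal && w != parent) sent += w.name

    watchedBy.foreach(sendTerminated(ifLocal = false)) // remote watchers first
    watchedBy.foreach(sendTerminated(ifLocal = true))  // then local ones
    sent.result()
  }
}
```

Iterating the same set twice with a different filter keeps the code short while still guaranteeing that every remote watcher is handled before any local one.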

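And how does the watcher respond when it receives DeathWatchNotification? It hits the following branch, which calls watchedActorTerminated.

case DeathWatchNotification(a, ec, at) ⇒ watchedActorTerminated(a, ec, at)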

  /**
   * When this actor is watching the subject of [[akka.actor.Terminated]] message
   * it will be propagated to user's receive.
   */
  protected def watchedActorTerminated(actor: ActorRef, existenceConfirmed: Boolean, addressTerminated: Boolean): Unit = {
    watchingGet(actor) match {
      case None ⇒ // We're apparently no longer watching this actor.
      case Some(optionalMessage) ⇒
        maintainAddressTerminatedSubscription(actor) {
          watching = removeFromMap(actor, watching)
        }
        if (!isTerminating) {
          self.tell(optionalMessage.getOrElse(Terminated(actor)(existenceConfirmed, addressTerminated)), actor)
          terminatedQueuedFor(actor)
        }
    }
    if (childrenRefs.getByRef(actor).isDefined) handleChildTerminated(actor)
  }

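Clearly, when the current actor is in a normal state (not terminating) and is watching the given actor, watchedActorTerminated sends itself either Terminated(actor) or, if one was registered through watchWith, the custom message. This is how the watcher ends up receiving the Terminated message for the watched actor.

In short, setting aside child-state maintenance and the other complicated operations: the watcher records which actors it is watching, the watched actor records which actors are watching it, and at the very end of its stop the watched actor sends a DeathWatchNotification to each watcher, which turns it into a Terminated message. Of course, remote mode is also involved, and that is more complicated; I will analyze it later.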
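The choice between the standard and the custom message comes down to one `getOrElse` on the value stored in watching. A minimal sketch, with a simplified one-field `Terminated` and string keys as assumptions:

```scala
// Toy model of picking the termination message; Terminated here is a simplified stand-in.
object TerminationMessage {
  final case class Terminated(actor: String)

  // Outer Option: are we watching this actor at all?
  // Inner Option: was a custom message registered (watchWith) or not (watch)?
  def messageFor(watching: Map[String, Option[Any]], actor: String): Option[Any] =
    watching.get(actor).map(custom => custom.getOrElse(Terminated(actor)))
}
```

An actor that is not in the map yields no message at all, matching the `case None` branch that silently ignores notifications for actors that are no longer watched.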
