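StreamingQueryListener: the listener in Structured Streaming

Structured Streaming provides a class for monitoring a streaming query as it starts, stops, and reports status updates:

StreamingQueryListener

To instantiate a StreamingQueryListener, you need to implement three methods: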

abstract class StreamingQueryListener {

  import StreamingQueryListener._

  /**
   * Called when a query is started.
   * @note This is called synchronously with
   *       [[org.apache.spark.sql.streaming.DataStreamWriter `DataStreamWriter.start()`]],
   *       that is, `onQueryStart` will be called on all listeners before
   *       `DataStreamWriter.start()` returns the corresponding [[StreamingQuery]]. Please
   *       don't block this method as it will block your query.
   * @since 2.0.0
   */
  def onQueryStarted(event: QueryStartedEvent): Unit

  /**
   * Called when there is some status update (ingestion rate updated, etc.)
   *
   * @note This method is asynchronous. The status in [[StreamingQuery]] will always be
   *       latest no matter when this method is called. Therefore, the status of [[StreamingQuery]]
   *       may be changed before/when you process the event. E.g., you may find [[StreamingQuery]]
   *       is terminated when you are processing `QueryProgressEvent`.
   * @since 2.0.0
   */
  def onQueryProgress(event: QueryProgressEvent): Unit

  /**
   * Called when a query is stopped, with or without error.
   * @since 2.0.0
   */
  def onQueryTerminated(event: QueryTerminatedEvent): Unit
}

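onQueryStarted: called synchronously when the streaming query starts, before DataStreamWriter.start() returns, so it must not block

onQueryProgress: asynchronous callback invoked whenever the query's status is updated during execution

onQueryTerminated: asynchronous callback invoked when the query stops, with or without an error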
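To make the progress callback more concrete, here is a minimal sketch of a listener that logs a warning when a batch is processed more slowly than data arrives. `ProgressCheck` and `ProgressLogger` are hypothetical names chosen for illustration, and the "input rate above processing rate" rule is only an assumed heuristic, not something Spark prescribes. The sketch assumes a Spark version where `QueryProgressEvent` exposes `progress`, as in the example later in this post.

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Hypothetical helper: decides whether a batch is falling behind its source.
object ProgressCheck {
  // Both rates are in rows per second. A NaN rate compares as false,
  // so a batch without a measurable rate is never flagged.
  def isLagging(inputRate: Double, processedRate: Double): Boolean =
    inputRate > processedRate
}

// Hypothetical listener that only reacts to progress updates.
class ProgressLogger extends StreamingQueryListener {
  // Keep this cheap: it runs synchronously inside DataStreamWriter.start().
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    if (ProgressCheck.isLagging(p.inputRowsPerSecond, p.processedRowsPerSecond)) {
      println(s"Query ${p.id}, batch ${p.batchId}: ${p.numInputRows} rows, " +
        "processing is slower than input")
    }
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
}
```

It would be registered with `spark.streams.addListener(new ProgressLogger)`. Keeping the decision in a small pure function makes the rule easy to unit-test without starting a query.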

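What is all this useful for?
It typically comes in handy when adding alerting to a streaming job. For example, in onQueryTerminated you can check whether an alert condition is met (say, the query stopped with an error) and, if so, send an alert by email, DingTalk, and so on.
The alert can then include the full error details, taken from the event's exception field, and send them along in the email.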

@InterfaceStability.Evolving
class QueryTerminatedEvent private[sql](
    val id: UUID,
    val runId: UUID,
    val exception: Option[String]) extends Event
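As a rough sketch of the alerting idea, the listener below builds an alert message only when `exception` is defined and hands it to a callback. `TerminationAlert`, `AlertingListener`, and the `sendAlert` parameter are hypothetical names introduced for illustration. The actual email or DingTalk client is deliberately left out and would be supplied by the caller.

```scala
import java.util.UUID
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Hypothetical helper: turns a termination event's fields into an alert body.
object TerminationAlert {
  // Returns None for a clean stop and Some(message) when the query failed.
  def message(id: UUID, runId: UUID, exception: Option[String]): Option[String] =
    exception.map(err => s"Streaming query $id (run $runId) failed:\n$err")
}

// `sendAlert` is an assumed hook: plug in a mail or DingTalk client here.
class AlertingListener(sendAlert: String => Unit) extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = ()

  // Alert only when the query stopped with an error.
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
    TerminationAlert.message(event.id, event.runId, event.exception).foreach(sendAlert)
}
```

For a quick local check, something like `spark.streams.addListener(new AlertingListener(msg => println(msg)))` prints the alert instead of sending it.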

Finally, here is a small usage example:

import org.apache.spark.sql.SparkSession

/**
  * Created by angel
  */
object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("IQL")
      .master("local[4]")
      .enableHiveSupport()
      .getOrCreate()
    spark.sparkContext.setLogLevel("WARN")

    // Save the code as demo-StreamingQueryManager.scala
    // Start it using spark-shell
    // $ ./bin/spark-shell -i demo-StreamingQueryManager.scala

    // Register a StreamingQueryListener to receive notifications about state changes of streaming queries
    import org.apache.spark.sql.streaming.StreamingQueryListener
    val myQueryListener = new StreamingQueryListener {
      import org.apache.spark.sql.streaming.StreamingQueryListener._
      def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
        println(s"Query ${event.id} terminated")
      }
      def onQueryStarted(event: QueryStartedEvent): Unit = {
        println(s"Query ${event.id} started")
      }
      def onQueryProgress(event: QueryProgressEvent): Unit = {
        println(s"Query ${event.progress.name} made progress")
      }
    }
    spark.streams.addListener(myQueryListener)

    import org.apache.spark.sql.streaming._
    import scala.concurrent.duration._

    // Start streaming queries
    // Start the first query (one micro-batch every 4 seconds)
    val q4s = spark.readStream.
      format("rate").
      load.
      writeStream.
      format("console").
      trigger(Trigger.ProcessingTime(4.seconds)).
      option("truncate", false).
      start

    // Start another query that is slightly slower (one micro-batch every 10 seconds)
    val q10s = spark.readStream.
      format("rate").
      load.
      writeStream.
      format("console").
      trigger(Trigger.ProcessingTime(10.seconds)).
      option("truncate", false).
      start

    // Both queries run concurrently
    // You should see different outputs in the console
    // q4s prints out 4 rows every batch and twice as often as q10s
    // q10s prints out 10 rows every batch
    /*
    -------------------------------------------
    Batch: 7
    -------------------------------------------
    +-----------------------+-----+
    |timestamp              |value|
    +-----------------------+-----+
    |2017-10-27 13:44:07.462|21   |
    |2017-10-27 13:44:08.462|22   |
    |2017-10-27 13:44:09.462|23   |
    |2017-10-27 13:44:10.462|24   |
    +-----------------------+-----+
    -------------------------------------------
    Batch: 8
    -------------------------------------------
    +-----------------------+-----+
    |timestamp              |value|
    +-----------------------+-----+
    |2017-10-27 13:44:11.462|25   |
    |2017-10-27 13:44:12.462|26   |
    |2017-10-27 13:44:13.462|27   |
    |2017-10-27 13:44:14.462|28   |
    +-----------------------+-----+
    -------------------------------------------
    Batch: 2
    -------------------------------------------
    +-----------------------+-----+
    |timestamp              |value|
    +-----------------------+-----+
    |2017-10-27 13:44:09.847|6    |
    |2017-10-27 13:44:10.847|7    |
    |2017-10-27 13:44:11.847|8    |
    |2017-10-27 13:44:12.847|9    |
    |2017-10-27 13:44:13.847|10   |
    |2017-10-27 13:44:14.847|11   |
    |2017-10-27 13:44:15.847|12   |
    |2017-10-27 13:44:16.847|13   |
    |2017-10-27 13:44:17.847|14   |
    |2017-10-27 13:44:18.847|15   |
    +-----------------------+-----+
    */

    // Stop the queries on separate threads
    // as we're about to block the current thread awaiting query termination
    import java.util.concurrent.Executors
    import java.util.concurrent.TimeUnit.SECONDS
    def queryTerminator(query: StreamingQuery): Runnable = new Runnable {
      override def run(): Unit = {
        println(s"Stopping streaming query: ${query.id}")
        query.stop()
      }
    }

    // Stop the first query after 10 seconds (one-shot task)
    Executors.newSingleThreadScheduledExecutor().
      schedule(queryTerminator(q4s), 10, SECONDS)

    // Stop the other query after 20 seconds (one-shot task)
    Executors.newSingleThreadScheduledExecutor().
      schedule(queryTerminator(q10s), 20, SECONDS)

    // Use StreamingQueryManager to wait for any query termination (either q4s or q10s)
    // the current thread will block indefinitely until either streaming query has finished
    spark.streams.awaitAnyTermination()

    // You are here only after either streaming query has finished
    // Executing spark.streams.awaitAnyTermination again would return immediately
    // You should have received the QueryTerminatedEvent for the query termination

    // reset the last terminated streaming query
    spark.streams.resetTerminated()

    // You know at least one query has terminated
    // Wait for the other query to terminate
    spark.streams.awaitAnyTermination()

    assert(spark.streams.active.isEmpty)
    println("The demo went all fine. Exiting...")

    // leave spark-shell (this also shuts down the scheduler threads)
    System.exit(0)
  }
}
