IT数码 购物 网址 头条 软件 日历 阅读 图书馆
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
图片批量下载器
↓批量下载图片,美女图库↓
图片自动播放器
↓图片自动播放器↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁
 
   -> 大数据 -> Spark Event Log -> 正文阅读

[大数据]Spark Event Log

什么是Spark Event Log

Spark Event Log是Spark内部主要的事件日志。比如Spark Application什么时候启动,什么时候关闭,什么时候Task开始,什么时候Task结束等。

为什么需要Spark Event Log

我们都知道Spark启动后回启动Spark UI,这个Spark UI可以帮助我们监控应用程序的状态。但是如果Spark应用跑完了,Spark UI就无法查看。如果Spark在执行过程中出了问题,我们没有办法去快速查找出问题的原因。所以我们需要把Spark Event持久化到磁盘。然后通过Spark History Server去读取Spark Event Log就可以重现运行时情况。可以快速的帮助我们分析问题。

如何开启Spark Event Log

Spark Event Log默认是没有开启的,如果要开启,需要指定如下参数:

spark.eventLog.enabled=true
spark.eventLog.dir=hdfs:///user/spark/applicationHistory

:Spark Event Log分存储位置比较灵活,可以是HDFS,本地,也可以是S3

Spark Event Log张什么样子

{"Event":"SparkListenerLogStart","Spark Version":"2.4.4"}
{"Event":"SparkListenerExecutorAdded","Timestamp":1629712726157,"Executor ID":"driver","Executor Info":{"Host":"localhost","Total Cores":1,"Log Urls":{}}}
{"Event":"SparkListenerBlockManagerAdded","Block Manager ID":{"Executor ID":"driver","Host":"ip-172-29-6-153.ap-northeast-2.compute.internal","Port":55466},"Maximum Memory":2101975449,"Timestamp":1629712726182,"Maximum Onheap Memory":2101975449,"Maximum Offheap Memory":0}
{"Event":"SparkListenerEnvironmentUpdate","JVM Information":{"Java Home":"/Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home/jre","Java Version":"1.8.0_231 (Oracle Corporation)","Scala Version":"version 2.11.12"}}
{"Event":"SparkListenerApplicationStart","App Name":"TestEventLog","App ID":"local-1629712726102","Timestamp":1629712725234,"User":"darren.zhang"}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart","executionId":0,"description":"csv at TestEventLog.scala:19","details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","physicalPlanDescription":"== Parsed Logical Plan ==\nGlobalLimit 1\n+- LocalLimit 1\n   +- Filter (length(trim(value#0, None)) > 0)\n      +- Project [value#0]\n         +- Relation[value#0] text\n\n== Analyzed Logical Plan ==\nvalue: string\nGlobalLimit 1\n+- LocalLimit 1\n   +- Filter (length(trim(value#0, None)) > 0)\n      +- Project [value#0]\n         +- Relation[value#0] text\n\n== Optimized Logical Plan ==\nGlobalLimit 1\n+- LocalLimit 1\n   +- Filter (length(trim(value#0, None)) > 0)\n      +- Relation[value#0] text\n\n== Physical Plan ==\nCollectLimit 1\n+- *(1) Filter (length(trim(value#0, None)) > 0)\n   +- *(1) FileScan text [value#0] Batched: false, Format: Text, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=2/file3.c..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>","sparkPlanInfo":{"nodeName":"CollectLimit","simpleString":"CollectLimit 1","children":[{"nodeName":"WholeStageCodegen","simpleString":"WholeStageCodegen","children":[{"nodeName":"Filter","simpleString":"Filter (length(trim(value#0, None)) > 0)","children":[{"nodeName":"Scan text ","simpleString":"FileScan text [value#0] Batched: false, Format: Text, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=2/file3.c..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>","children":[],"metadata":{"Location":"InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=2/file3.csv, file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=2/file4.csv, file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=3/file6.csv, file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=3/file5.csv, file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=1/file1.csv, file:/Users/darren.zhang/project/workspace/darren-spark/test_file/key=1/file2.csv]","ReadSchema":"struct<value:string>","Format":"Text","Batched":"false","PartitionFilters":"[]","PushedFilters":"[]"},"metrics":[{"name":"number of output rows","accumulatorId":2,"metricType":"sum"},{"name":"number of files","accumulatorId":3,"metricType":"sum"},{"name":"metadata time (ms)","accumulatorId":4,"metricType":"sum"},{"name":"scan time total (min, med, max)","accumulatorId":5,"metricType":"timing"}]}],"metadata":{},"metrics":[{"name":"number of output rows","accumulatorId":1,"metricType":"sum"}]}],"metadata":{},"metrics":[{"name":"duration total (min, med, max)","accumulatorId":0,"metricType":"timing"}]}],"metadata":{},"metrics":[]},"time":1629712728178}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerDriverAccumUpdates","executionId":0,"accumUpdates":[[3,6],[4,1]]}
{"Event":"SparkListenerJobStart","Job ID":0,"Submission Time":1629712728711,"Stage Infos":[{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":3,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"4\",\"name\":\"map\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[2],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"FileScanRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":1,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":2,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"3\",\"name\":\"mapPartitionsInternal\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[1],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Accumulables":[]}],"Stage IDs":[0],"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"0","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerStageSubmitted","Stage Info":{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":3,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"4\",\"name\":\"map\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[2],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"FileScanRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":1,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":2,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"3\",\"name\":\"mapPartitionsInternal\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[1],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712728723,"Accumulables":[]},"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"0","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerTaskStart","Stage ID":0,"Stage Attempt ID":0,"Task Info":{"Task ID":0,"Index":0,"Attempt":0,"Launch Time":1629712728789,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":0,"Failed":false,"Killed":false,"Accumulables":[]}}
{"Event":"SparkListenerTaskEnd","Stage ID":0,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":0,"Index":0,"Attempt":0,"Launch Time":1629712728789,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1629712728902,"Failed":false,"Killed":false,"Accumulables":[{"ID":1,"Name":"number of output rows","Update":"1","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":2,"Name":"number of output rows","Update":"1","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":28,"Name":"internal.metrics.input.recordsRead","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":27,"Name":"internal.metrics.input.bytesRead","Update":25,"Value":25,"Internal":true,"Count Failed Values":true},{"ID":10,"Name":"internal.metrics.resultSize","Update":1275,"Value":1275,"Internal":true,"Count Failed Values":true},{"ID":9,"Name":"internal.metrics.executorCpuTime","Update":68016000,"Value":68016000,"Internal":true,"Count Failed Values":true},{"ID":8,"Name":"internal.metrics.executorRunTime","Update":72,"Value":72,"Internal":true,"Count Failed Values":true},{"ID":7,"Name":"internal.metrics.executorDeserializeCpuTime","Update":15237000,"Value":15237000,"Internal":true,"Count Failed Values":true},{"ID":6,"Name":"internal.metrics.executorDeserializeTime","Update":17,"Value":17,"Internal":true,"Count Failed Values":true}]},"Task Metrics":{"Executor Deserialize Time":17,"Executor Deserialize CPU Time":15237000,"Executor Run Time":72,"Executor CPU Time":68016000,"Result Size":1275,"JVM GC Time":0,"Result Serialization Time":0,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":0,"Local Blocks Fetched":0,"Fetch Wait Time":0,"Remote Bytes Read":0,"Remote Bytes Read To Disk":0,"Local Bytes Read":0,"Total Records Read":0},"Shuffle Write Metrics":{"Shuffle Bytes Written":0,"Shuffle Write Time":0,"Shuffle Records Written":0},"Input Metrics":{"Bytes Read":25,"Records Read":1},"Output Metrics":{"Bytes Written":0,"Records Written":0},"Updated Blocks":[]}}
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":3,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"4\",\"name\":\"map\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[2],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"FileScanRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":1,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"0\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":2,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"3\",\"name\":\"mapPartitionsInternal\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[1],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712728723,"Completion Time":1629712728910,"Accumulables":[{"ID":8,"Name":"internal.metrics.executorRunTime","Value":72,"Internal":true,"Count Failed Values":true},{"ID":2,"Name":"number of output rows","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":7,"Name":"internal.metrics.executorDeserializeCpuTime","Value":15237000,"Internal":true,"Count Failed Values":true},{"ID":1,"Name":"number of output rows","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":28,"Name":"internal.metrics.input.recordsRead","Value":1,"Internal":true,"Count Failed Values":true},{"ID":10,"Name":"internal.metrics.resultSize","Value":1275,"Internal":true,"Count Failed Values":true},{"ID":27,"Name":"internal.metrics.input.bytesRead","Value":25,"Internal":true,"Count Failed Values":true},{"ID":9,"Name":"internal.metrics.executorCpuTime","Value":68016000,"Internal":true,"Count Failed Values":true},{"ID":6,"Name":"internal.metrics.executorDeserializeTime","Value":17,"Internal":true,"Count Failed Values":true}]}}
{"Event":"SparkListenerJobEnd","Job ID":0,"Completion Time":1629712728913,"Job Result":{"Result":"JobSucceeded"}}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd","executionId":0,"time":1629712728918}
{"Event":"SparkListenerJobStart","Job ID":1,"Submission Time":1629712729028,"Stage Infos":[{"Stage ID":1,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":8,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"13\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[7],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":4,"Name":"FileScanRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":6,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"9\",\"name\":\"DeserializeToObject\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[5],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":7,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"12\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[6],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":5,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[4],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Accumulables":[]}],"Stage IDs":[1],"Properties":{"spark.rdd.scope.noOverride":"true","spark.rdd.scope":"{\"id\":\"14\",\"name\":\"aggregate\"}"}}
{"Event":"SparkListenerStageSubmitted","Stage Info":{"Stage ID":1,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":8,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"13\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[7],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":4,"Name":"FileScanRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":6,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"9\",\"name\":\"DeserializeToObject\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[5],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":7,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"12\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[6],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":5,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[4],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729029,"Accumulables":[]},"Properties":{"spark.rdd.scope.noOverride":"true","spark.rdd.scope":"{\"id\":\"14\",\"name\":\"aggregate\"}"}}
{"Event":"SparkListenerTaskStart","Stage ID":1,"Stage Attempt ID":0,"Task Info":{"Task ID":1,"Index":0,"Attempt":0,"Launch Time":1629712729035,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":0,"Failed":false,"Killed":false,"Accumulables":[]}}
{"Event":"SparkListenerTaskEnd","Stage ID":1,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":1,"Index":0,"Attempt":0,"Launch Time":1629712729035,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1629712729066,"Failed":false,"Killed":false,"Accumulables":[{"ID":31,"Name":"number of output rows","Update":"12","Value":"12","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":35,"Name":"duration total (min, med, max)","Update":"17","Value":"16","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":58,"Name":"internal.metrics.input.recordsRead","Update":12,"Value":12,"Internal":true,"Count Failed Values":true},{"ID":57,"Name":"internal.metrics.input.bytesRead","Update":133,"Value":133,"Internal":true,"Count Failed Values":true},{"ID":40,"Name":"internal.metrics.resultSize","Update":1529,"Value":1529,"Internal":true,"Count Failed Values":true},{"ID":39,"Name":"internal.metrics.executorCpuTime","Update":19235000,"Value":19235000,"Internal":true,"Count Failed Values":true},{"ID":38,"Name":"internal.metrics.executorRunTime","Update":21,"Value":21,"Internal":true,"Count Failed Values":true},{"ID":37,"Name":"internal.metrics.executorDeserializeCpuTime","Update":5552000,"Value":5552000,"Internal":true,"Count Failed Values":true},{"ID":36,"Name":"internal.metrics.executorDeserializeTime","Update":6,"Value":6,"Internal":true,"Count Failed Values":true}]},"Task Metrics":{"Executor Deserialize Time":6,"Executor Deserialize CPU Time":5552000,"Executor Run Time":21,"Executor CPU Time":19235000,"Result Size":1529,"JVM GC Time":0,"Result Serialization Time":0,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":0,"Local Blocks Fetched":0,"Fetch Wait Time":0,"Remote Bytes Read":0,"Remote Bytes Read To Disk":0,"Local Bytes Read":0,"Total Records Read":0},"Shuffle Write Metrics":{"Shuffle Bytes Written":0,"Shuffle Write Time":0,"Shuffle Records Written":0},"Input Metrics":{"Bytes Read":133,"Records Read":12},"Output Metrics":{"Bytes Written":0,"Records Written":0},"Updated Blocks":[]}}
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":1,"Stage Attempt ID":0,"Stage Name":"csv at TestEventLog.scala:19","Number of Tasks":1,"RDD Info":[{"RDD ID":8,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"13\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[7],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":4,"Name":"FileScanRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":6,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"9\",\"name\":\"DeserializeToObject\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[5],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":7,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"12\",\"name\":\"mapPartitions\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[6],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":5,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"10\",\"name\":\"WholeStageCodegen\"}","Callsite":"csv at TestEventLog.scala:19","Parent IDs":[4],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:467)\nTestEventLog$.main(TestEventLog.scala:19)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729029,"Completion Time":1629712729066,"Accumulables":[{"ID":35,"Name":"duration total (min, med, max)","Value":"16","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":38,"Name":"internal.metrics.executorRunTime","Value":21,"Internal":true,"Count Failed Values":true},{"ID":31,"Name":"number of output rows","Value":"12","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":58,"Name":"internal.metrics.input.recordsRead","Value":12,"Internal":true,"Count Failed Values":true},{"ID":40,"Name":"internal.metrics.resultSize","Value":1529,"Internal":true,"Count Failed Values":true},{"ID":37,"Name":"internal.metrics.executorDeserializeCpuTime","Value":5552000,"Internal":true,"Count Failed Values":true},{"ID":36,"Name":"internal.metrics.executorDeserializeTime","Value":6,"Internal":true,"Count Failed Values":true},{"ID":57,"Name":"internal.metrics.input.bytesRead","Value":133,"Internal":true,"Count Failed Values":true},{"ID":39,"Name":"internal.metrics.executorCpuTime","Value":19235000,"Internal":true,"Count Failed Values":true}]}}
{"Event":"SparkListenerJobEnd","Job ID":1,"Completion Time":1629712729067,"Job Result":{"Result":"JobSucceeded"}}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart","executionId":1,"description":"count at TestEventLog.scala:22","details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","physicalPlanDescription":"== Parsed Logical Plan ==\nAggregate [count(1) AS count#43L]\n+- Relation[id#10,name#11,age#12,key#13] csv\n\n== Analyzed Logical Plan ==\ncount: bigint\nAggregate [count(1) AS count#43L]\n+- Relation[id#10,name#11,age#12,key#13] csv\n\n== Optimized Logical Plan ==\nAggregate [count(1) AS count#43L]\n+- Project\n   +- InMemoryRelation [id#10, name#11, age#12, key#13], StorageLevel(disk, memory, deserialized, 1 replicas)\n         +- *(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n\n== Physical Plan ==\n*(2) HashAggregate(keys=[], functions=[count(1)], output=[count#43L])\n+- Exchange SinglePartition\n   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#66L])\n      +- InMemoryTableScan\n            +- InMemoryRelation [id#10, name#11, age#12, key#13], StorageLevel(disk, memory, deserialized, 1 replicas)\n                  +- *(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>","sparkPlanInfo":{"nodeName":"WholeStageCodegen","simpleString":"WholeStageCodegen","children":[{"nodeName":"HashAggregate","simpleString":"HashAggregate(keys=[], functions=[count(1)])","children":[{"nodeName":"InputAdapter","simpleString":"InputAdapter","children":[{"nodeName":"Exchange","simpleString":"Exchange SinglePartition","children":[{"nodeName":"WholeStageCodegen","simpleString":"WholeStageCodegen","children":[{"nodeName":"HashAggregate","simpleString":"HashAggregate(keys=[], functions=[partial_count(1)])","children":[{"nodeName":"InputAdapter","simpleString":"InputAdapter","children":[{"nodeName":"InMemoryTableScan","simpleString":"InMemoryTableScan","children":[],"metadata":{},"metrics":[{"name":"number of output rows","accumulatorId":76,"metricType":"sum"},{"name":"scan time total (min, med, max)","accumulatorId":77,"metricType":"timing"}]}],"metadata":{},"metrics":[]}],"metadata":{},"metrics":[{"name":"spill size total (min, med, max)","accumulatorId":73,"metricType":"size"},{"name":"aggregate time total (min, med, max)","accumulatorId":74,"metricType":"timing"},{"name":"peak memory total (min, med, max)","accumulatorId":72,"metricType":"size"},{"name":"number of output rows","accumulatorId":71,"metricType":"sum"},{"name":"avg hash probe (min, med, max)","accumulatorId":75,"metricType":"average"}]}],"metadata":{},"metrics":[{"name":"duration total (min, med, max)","accumulatorId":70,"metricType":"timing"}]}],"metadata":{},"metrics":[{"name":"data size total (min, med, max)","accumulatorId":63,"metricType":"size"}]}],"metadata":{},"metrics":[]}],"metadata":{},"metrics":[{"name":"spill size total (min, med, max)","accumulatorId":67,"metricType":"size"},{"name":"aggregate time total (min, med, max)","accumulatorId":68,"metricType":"timing"},{"name":"peak memory total (min, med, max)","accumulatorId":66,"metricType":"size"},{"name":"number of output rows","accumulatorId":65,"metricType":"sum"},{"name":"avg hash probe (min, med, max)","accumulatorId":69,"metricType":"average"}]}],"metadata":{},"metrics":[{"name":"duration total (min, med, max)","accumulatorId":64,"metricType":"timing"}]},"time":1629712729135}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerDriverAccumUpdates","executionId":1,"accumUpdates":[[79,6],[80,0]]}
{"Event":"SparkListenerJobStart","Job ID":2,"Submission Time":1629712729223,"Stage Infos":[{"Stage ID":2,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":16,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[15],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":13,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":14,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[13],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":15,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"24\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[14],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Accumulables":[]},{"Stage ID":3,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":19,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"30\",\"name\":\"mapPartitionsInternal\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[18],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":17,"Name":"ShuffledRowRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[16],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":18,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"20\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[17],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[2],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Accumulables":[]}],"Stage IDs":[2,3],"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.rdd.scope":"{\"id\":\"31\",\"name\":\"collect\"}","spark.rdd.scope.noOverride":"true","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"1","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerStageSubmitted","Stage Info":{"Stage ID":2,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":16,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[15],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":13,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":14,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[13],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":15,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"24\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[14],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729228,"Accumulables":[]},"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.rdd.scope":"{\"id\":\"31\",\"name\":\"collect\"}","spark.rdd.scope.noOverride":"true","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"1","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerTaskStart","Stage ID":2,"Stage Attempt ID":0,"Task Info":{"Task ID":2,"Index":0,"Attempt":0,"Launch Time":1629712729250,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":0,"Failed":false,"Killed":false,"Accumulables":[]}}
{"Event":"SparkListenerTaskEnd","Stage ID":2,"Stage Attempt ID":0,"Task Type":"ShuffleMapTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":2,"Index":0,"Attempt":0,"Launch Time":1629712729250,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1629712729392,"Failed":false,"Killed":false,"Accumulables":[{"ID":63,"Name":"data size total (min, med, max)","Update":"15","Value":"14","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":71,"Name":"number of output rows","Update":"1","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":70,"Name":"duration total (min, med, max)","Update":"14","Value":"13","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":76,"Name":"number of output rows","Update":"6","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":78,"Name":"number of output rows","Update":"6","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":82,"Name":"duration total (min, med, max)","Update":"115","Value":"114","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":105,"Name":"internal.metrics.input.recordsRead","Update":6,"Value":6,"Internal":true,"Count Failed Values":true},{"ID":104,"Name":"internal.metrics.input.bytesRead","Update":133,"Value":133,"Internal":true,"Count Failed Values":true},{"ID":103,"Name":"internal.metrics.shuffle.write.writeTime","Update":10557643,"Value":10557643,"Internal":true,"Count Failed Values":true},{"ID":102,"Name":"internal.metrics.shuffle.write.recordsWritten","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":101,"Name":"internal.metrics.shuffle.write.bytesWritten","Update":59,"Value":59,"Internal":true,"Count Failed Values":true},{"ID":89,"Name":"internal.metrics.resultSerializationTime","Update":2,"Value":2,"Internal":true,"Count Failed Values":true},{"ID":87,"Name":"internal.metrics.resultSize","Update":1952,"Value":1952,"Internal":true,"Count Failed Values":true},{"ID":86,"Name":"internal.metrics.executorCpuTime","Update":126048000,"Value":126048000,"Internal":true,"Count Failed Values":true},{"ID":85,"Name":"internal.metrics.executorRunTime","Update":130,"Value":130,"Internal":true,"Count Failed Values":true},{"ID":84,"Name":"internal.metrics.executorDeserializeCpuTime","Update":5819000,"Value":5819000,"Internal":true,"Count Failed Values":true},{"ID":83,"Name":"internal.metrics.executorDeserializeTime","Update":6,"Value":6,"Internal":true,"Count Failed Values":true}]},"Task Metrics":{"Executor Deserialize Time":6,"Executor Deserialize CPU Time":5819000,"Executor Run Time":130,"Executor CPU Time":126048000,"Result Size":1952,"JVM GC Time":0,"Result Serialization Time":2,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":0,"Local Blocks Fetched":0,"Fetch Wait Time":0,"Remote Bytes Read":0,"Remote Bytes Read To Disk":0,"Local Bytes Read":0,"Total Records Read":0},"Shuffle Write Metrics":{"Shuffle Bytes Written":59,"Shuffle Write Time":10557643,"Shuffle Records Written":1},"Input Metrics":{"Bytes Read":133,"Records Read":6},"Output Metrics":{"Bytes Written":0,"Records Written":0},"Updated Blocks":[]}}
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":2,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":16,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[15],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":13,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":14,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[13],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":15,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"24\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[14],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729228,"Completion Time":1629712729393,"Accumulables":[{"ID":83,"Name":"internal.metrics.executorDeserializeTime","Value":6,"Internal":true,"Count Failed Values":true},{"ID":101,"Name":"internal.metrics.shuffle.write.bytesWritten","Value":59,"Internal":true,"Count Failed Values":true},{"ID":104,"Name":"internal.metrics.input.bytesRead","Value":133,"Internal":true,"Count Failed Values":true},{"ID":86,"Name":"internal.metrics.executorCpuTime","Value":126048000,"Internal":true,"Count Failed Values":true},{"ID":89,"Name":"internal.metrics.resultSerializationTime","Value":2,"Internal":true,"Count Failed Values":true},{"ID":71,"Name":"number of output rows","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":82,"Name":"duration total (min, med, max)","Value":"114","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":85,"Name":"internal.metrics.executorRunTime","Value":130,"Internal":true,"Count Failed Values":true},{"ID":76,"Name":"number of output rows","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":103,"Name":"internal.metrics.shuffle.write.writeTime","Value":10557643,"Internal":true,"Count Failed Values":true},{"ID":70,"Name":"duration total (min, med, max)","Value":"13","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":87,"Name":"internal.metrics.resultSize","Value":1952,"Internal":true,"Count Failed Values":true},{"ID":105,"Name":"internal.metrics.input.recordsRead","Value":6,"Internal":true,"Count Failed Values":true},{"ID":78,"Name":"number of output rows","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":63,"Name":"data size total (min, med, max)","Value":"14","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":84,"Name":"internal.metrics.executorDeserializeCpuTime","Value":5819000,"Internal":true,"Count Failed Values":true},{"ID":102,"Name":"internal.metrics.shuffle.write.recordsWritten","Value":1,"Internal":true,"Count Failed Values":true}]}}
{"Event":"SparkListenerStageSubmitted","Stage Info":{"Stage ID":3,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":19,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"30\",\"name\":\"mapPartitionsInternal\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[18],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":17,"Name":"ShuffledRowRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[16],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":18,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"20\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[17],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[2],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729403,"Accumulables":[]},"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.rdd.scope":"{\"id\":\"31\",\"name\":\"collect\"}","spark.rdd.scope.noOverride":"true","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"1","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerTaskStart","Stage ID":3,"Stage Attempt ID":0,"Task Info":{"Task ID":3,"Index":0,"Attempt":0,"Launch Time":1629712729411,"Executor ID":"driver","Host":"localhost","Locality":"ANY","Speculative":false,"Getting Result Time":0,"Finish Time":0,"Failed":false,"Killed":false,"Accumulables":[]}}
{"Event":"SparkListenerTaskEnd","Stage ID":3,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":3,"Index":0,"Attempt":0,"Launch Time":1629712729411,"Executor ID":"driver","Host":"localhost","Locality":"ANY","Speculative":false,"Getting Result Time":0,"Finish Time":1629712729462,"Failed":false,"Killed":false,"Accumulables":[{"ID":68,"Name":"aggregate time total (min, med, max)","Update":"5","Value":"4","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":65,"Name":"number of output rows","Update":"1","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":64,"Name":"duration total (min, med, max)","Update":"5","Value":"4","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":125,"Name":"internal.metrics.shuffle.read.recordsRead","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":124,"Name":"internal.metrics.shuffle.read.fetchWaitTime","Update":0,"Value":0,"Internal":true,"Count Failed Values":true},{"ID":123,"Name":"internal.metrics.shuffle.read.localBytesRead","Update":59,"Value":59,"Internal":true,"Count Failed Values":true},{"ID":122,"Name":"internal.metrics.shuffle.read.remoteBytesReadToDisk","Update":0,"Value":0,"Internal":true,"Count Failed Values":true},{"ID":121,"Name":"internal.metrics.shuffle.read.remoteBytesRead","Update":0,"Value":0,"Internal":true,"Count Failed Values":true},{"ID":120,"Name":"internal.metrics.shuffle.read.localBlocksFetched","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":119,"Name":"internal.metrics.shuffle.read.remoteBlocksFetched","Update":0,"Value":0,"Internal":true,"Count Failed Values":true},{"ID":114,"Name":"internal.metrics.resultSerializationTime","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":112,"Name":"internal.metrics.resultSize","Update":1825,"Value":1825,"Internal":true,"Count Failed Values":true},{"ID":111,"Name":"internal.metrics.executorCpuTime","Update":31424000,"Value":31424000,"Internal":true,"Count Failed Values":true},{"ID":110,"Name":"internal.metrics.executorRunTime","Update":32,"Value":32,"Internal":true,"Count Failed Values":true},{"ID":109,"Name":"internal.metrics.executorDeserializeCpuTime","Update":1987000,"Value":1987000,"Internal":true,"Count Failed Values":true},{"ID":108,"Name":"internal.metrics.executorDeserializeTime","Update":2,"Value":2,"Internal":true,"Count Failed Values":true}]},"Task Metrics":{"Executor Deserialize Time":2,"Executor Deserialize CPU Time":1987000,"Executor Run Time":32,"Executor CPU Time":31424000,"Result Size":1825,"JVM GC Time":0,"Result Serialization Time":1,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":0,"Local Blocks Fetched":1,"Fetch Wait Time":0,"Remote Bytes Read":0,"Remote Bytes Read To Disk":0,"Local Bytes Read":59,"Total Records Read":1},"Shuffle Write Metrics":{"Shuffle Bytes Written":0,"Shuffle Write Time":0,"Shuffle Records Written":0},"Input Metrics":{"Bytes Read":0,"Records Read":0},"Output Metrics":{"Bytes Written":0,"Records Written":0},"Updated Blocks":[]}}
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":3,"Stage Attempt ID":0,"Stage Name":"count at TestEventLog.scala:22","Number of Tasks":1,"RDD Info":[{"RDD ID":19,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"30\",\"name\":\"mapPartitionsInternal\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[18],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":17,"Name":"ShuffledRowRDD","Scope":"{\"id\":\"23\",\"name\":\"Exchange\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[16],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":18,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"20\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[17],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[2],"Details":"org.apache.spark.sql.Dataset.count(Dataset.scala:2835)\nTestEventLog$.main(TestEventLog.scala:22)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729403,"Completion Time":1629712729465,"Accumulables":[{"ID":110,"Name":"internal.metrics.executorRunTime","Value":32,"Internal":true,"Count Failed Values":true},{"ID":119,"Name":"internal.metrics.shuffle.read.remoteBlocksFetched","Value":0,"Internal":true,"Count Failed Values":true},{"ID":122,"Name":"internal.metrics.shuffle.read.remoteBytesReadToDisk","Value":0,"Internal":true,"Count Failed Values":true},{"ID":68,"Name":"aggregate time total (min, med, max)","Value":"4","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":125,"Name":"internal.metrics.shuffle.read.recordsRead","Value":1,"Internal":true,"Count Failed Values":true},{"ID":65,"Name":"number of output rows","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":124,"Name":"internal.metrics.shuffle.read.fetchWaitTime","Value":0,"Internal":true,"Count Failed Values":true},{"ID":64,"Name":"duration total (min, med, max)","Value":"4","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":109,"Name":"internal.metrics.executorDeserializeCpuTime","Value":1987000,"Internal":true,"Count Failed Values":true},{"ID":112,"Name":"internal.metrics.resultSize","Value":1825,"Internal":true,"Count Failed Values":true},{"ID":121,"Name":"internal.metrics.shuffle.read.remoteBytesRead","Value":0,"Internal":true,"Count Failed Values":true},{"ID":120,"Name":"internal.metrics.shuffle.read.localBlocksFetched","Value":1,"Internal":true,"Count Failed Values":true},{"ID":114,"Name":"internal.metrics.resultSerializationTime","Value":1,"Internal":true,"Count Failed Values":true},{"ID":123,"Name":"internal.metrics.shuffle.read.localBytesRead","Value":59,"Internal":true,"Count Failed Values":true},{"ID":108,"Name":"internal.metrics.executorDeserializeTime","Value":2,"Internal":true,"Count Failed Values":true},{"ID":111,"Name":"internal.metrics.executorCpuTime","Value":31424000,"Internal":true,"Count Failed Values":true}]}}
{"Event":"SparkListenerJobEnd","Job ID":2,"Completion Time":1629712729465,"Job Result":{"Result":"JobSucceeded"}}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd","executionId":1,"time":1629712729468}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart","executionId":2,"description":"show at TestEventLog.scala:23","details":"org.apache.spark.sql.Dataset.show(Dataset.scala:719)\nTestEventLog$.main(TestEventLog.scala:23)\nTestEventLog.main(TestEventLog.scala)","physicalPlanDescription":"== Parsed Logical Plan ==\nGlobalLimit 21\n+- LocalLimit 21\n   +- Project [cast(id#10 as string) AS id#75, cast(name#11 as string) AS name#76, cast(age#12 as string) AS age#77, cast(key#13 as string) AS key#78]\n      +- Relation[id#10,name#11,age#12,key#13] csv\n\n== Analyzed Logical Plan ==\nid: string, name: string, age: string, key: string\nGlobalLimit 21\n+- LocalLimit 21\n   +- Project [cast(id#10 as string) AS id#75, cast(name#11 as string) AS name#76, cast(age#12 as string) AS age#77, cast(key#13 as string) AS key#78]\n      +- Relation[id#10,name#11,age#12,key#13] csv\n\n== Optimized Logical Plan ==\nGlobalLimit 21\n+- LocalLimit 21\n   +- Project [cast(id#10 as string) AS id#75, name#11, cast(age#12 as string) AS age#77, cast(key#13 as string) AS key#78]\n      +- InMemoryRelation [id#10, name#11, age#12, key#13], StorageLevel(disk, memory, deserialized, 1 replicas)\n            +- *(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n\n== Physical Plan ==\nCollectLimit 21\n+- *(1) Project [cast(id#10 as string) AS id#75, name#11, cast(age#12 as string) AS age#77, cast(key#13 as string) AS key#78]\n   +- InMemoryTableScan [age#12, id#10, key#13, name#11]\n         +- InMemoryRelation [id#10, name#11, age#12, key#13], StorageLevel(disk, memory, deserialized, 1 replicas)\n               +- *(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>","sparkPlanInfo":{"nodeName":"CollectLimit","simpleString":"CollectLimit 21","children":[{"nodeName":"WholeStageCodegen","simpleString":"WholeStageCodegen","children":[{"nodeName":"Project","simpleString":"Project [cast(id#10 as string) AS id#75, name#11, cast(age#12 as string) AS age#77, cast(key#13 as string) AS key#78]","children":[{"nodeName":"InputAdapter","simpleString":"InputAdapter","children":[{"nodeName":"InMemoryTableScan","simpleString":"InMemoryTableScan [age#12, id#10, key#13, name#11]","children":[],"metadata":{},"metrics":[{"name":"number of output rows","accumulatorId":134,"metricType":"sum"},{"name":"scan time total (min, med, max)","accumulatorId":135,"metricType":"timing"}]}],"metadata":{},"metrics":[]}],"metadata":{},"metrics":[]}],"metadata":{},"metrics":[{"name":"duration total (min, med, max)","accumulatorId":133,"metricType":"timing"}]}],"metadata":{},"metrics":[]},"time":1629712729493}
{"Event":"SparkListenerJobStart","Job ID":3,"Submission Time":1629712729533,"Stage Infos":[{"Stage ID":4,"Stage Attempt ID":0,"Stage Name":"show at TestEventLog.scala:23","Number of Tasks":1,"RDD Info":[{"RDD ID":25,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"47\",\"name\":\"map\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[24],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":22,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[21],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":21,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":23,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"42\",\"name\":\"WholeStageCodegen\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[22],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":24,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"46\",\"name\":\"mapPartitionsInternal\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[23],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.show(Dataset.scala:719)\nTestEventLog$.main(TestEventLog.scala:23)\nTestEventLog.main(TestEventLog.scala)","Accumulables":[]}],"Stage IDs":[4],"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"2","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerStageSubmitted","Stage Info":{"Stage ID":4,"Stage Attempt ID":0,"Stage Name":"show at TestEventLog.scala:23","Number of Tasks":1,"RDD Info":[{"RDD ID":25,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"47\",\"name\":\"map\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[24],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":22,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[21],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":21,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":23,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"42\",\"name\":\"WholeStageCodegen\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[22],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":24,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"46\",\"name\":\"mapPartitionsInternal\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[23],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.show(Dataset.scala:719)\nTestEventLog$.main(TestEventLog.scala:23)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729534,"Accumulables":[]},"Properties":{"spark.driver.host":"ip-172-29-6-153.ap-northeast-2.compute.internal","spark.eventLog.enabled":"true","spark.driver.port":"55465","spark.app.name":"TestEventLog","spark.executor.id":"driver","spark.master":"local","spark.eventLog.dir":"file:///Users/darren.zhang/test/","spark.sql.execution.id":"2","spark.app.id":"local-1629712726102"}}
{"Event":"SparkListenerTaskStart","Stage ID":4,"Stage Attempt ID":0,"Task Info":{"Task ID":4,"Index":0,"Attempt":0,"Launch Time":1629712729540,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":0,"Failed":false,"Killed":false,"Accumulables":[]}}
{"Event":"SparkListenerTaskEnd","Stage ID":4,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":4,"Index":0,"Attempt":0,"Launch Time":1629712729540,"Executor ID":"driver","Host":"localhost","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1629712729570,"Failed":false,"Killed":false,"Accumulables":[{"ID":133,"Name":"duration total (min, med, max)","Update":"2","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":134,"Name":"number of output rows","Update":"6","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":158,"Name":"internal.metrics.input.recordsRead","Update":1,"Value":1,"Internal":true,"Count Failed Values":true},{"ID":157,"Name":"internal.metrics.input.bytesRead","Update":624,"Value":624,"Internal":true,"Count Failed Values":true},{"ID":140,"Name":"internal.metrics.resultSize","Update":1655,"Value":1655,"Internal":true,"Count Failed Values":true},{"ID":139,"Name":"internal.metrics.executorCpuTime","Update":22736000,"Value":22736000,"Internal":true,"Count Failed Values":true},{"ID":138,"Name":"internal.metrics.executorRunTime","Update":24,"Value":24,"Internal":true,"Count Failed Values":true},{"ID":137,"Name":"internal.metrics.executorDeserializeCpuTime","Update":3428000,"Value":3428000,"Internal":true,"Count Failed Values":true},{"ID":136,"Name":"internal.metrics.executorDeserializeTime","Update":4,"Value":4,"Internal":true,"Count Failed Values":true}]},"Task Metrics":{"Executor Deserialize Time":4,"Executor Deserialize CPU Time":3428000,"Executor Run Time":24,"Executor CPU Time":22736000,"Result Size":1655,"JVM GC Time":0,"Result Serialization Time":0,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":0,"Local Blocks Fetched":0,"Fetch Wait Time":0,"Remote Bytes Read":0,"Remote Bytes Read To Disk":0,"Local Bytes Read":0,"Total Records Read":0},"Shuffle Write Metrics":{"Shuffle Bytes Written":0,"Shuffle Write Time":0,"Shuffle Records Written":0},"Input Metrics":{"Bytes Read":624,"Records Read":1},"Output Metrics":{"Bytes Written":0,"Records Written":0},"Updated Blocks":[]}}
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":4,"Stage Attempt ID":0,"Stage Name":"show at TestEventLog.scala:23","Number of Tasks":1,"RDD Info":[{"RDD ID":25,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"47\",\"name\":\"map\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[24],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":10,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[9],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":11,"Name":"*(1) FileScan csv [id#10,name#11,age#12,key#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/darren.zhang/project/workspace/darren-spark/test_file], PartitionCount: 3, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int>\n","Scope":"{\"id\":\"27\",\"name\":\"InMemoryTableScan\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[10],"Storage Level":{"Use Disk":true,"Use Memory":true,"Deserialized":true,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":9,"Name":"FileScanRDD","Scope":"{\"id\":\"28\",\"name\":\"WholeStageCodegen\"}","Callsite":"count at TestEventLog.scala:22","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":22,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[21],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":21,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"45\",\"name\":\"InMemoryTableScan\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[11],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":23,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"42\",\"name\":\"WholeStageCodegen\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[22],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":24,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"46\",\"name\":\"mapPartitionsInternal\"}","Callsite":"show at TestEventLog.scala:23","Parent IDs":[23],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":1,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.sql.Dataset.show(Dataset.scala:719)\nTestEventLog$.main(TestEventLog.scala:23)\nTestEventLog.main(TestEventLog.scala)","Submission Time":1629712729534,"Completion Time":1629712729571,"Accumulables":[{"ID":137,"Name":"internal.metrics.executorDeserializeCpuTime","Value":3428000,"Internal":true,"Count Failed Values":true},{"ID":158,"Name":"internal.metrics.input.recordsRead","Value":1,"Internal":true,"Count Failed Values":true},{"ID":140,"Name":"internal.metrics.resultSize","Value":1655,"Internal":true,"Count Failed Values":true},{"ID":134,"Name":"number of output rows","Value":"6","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":133,"Name":"duration total (min, med, max)","Value":"1","Internal":true,"Count Failed Values":true,"Metadata":"sql"},{"ID":136,"Name":"internal.metrics.executorDeserializeTime","Value":4,"Internal":true,"Count Failed Values":true},{"ID":157,"Name":"internal.metrics.input.bytesRead","Value":624,"Internal":true,"Count Failed Values":true},{"ID":139,"Name":"internal.metrics.executorCpuTime","Value":22736000,"Internal":true,"Count Failed Values":true},{"ID":138,"Name":"internal.metrics.executorRunTime","Value":24,"Internal":true,"Count Failed Values":true}]}}
{"Event":"SparkListenerJobEnd","Job ID":3,"Completion Time":1629712729571,"Job Result":{"Result":"JobSucceeded"}}
{"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd","executionId":2,"time":1629712729573}
{"Event":"SparkListenerApplicationEnd","Timestamp":1629712729577}

注意:第一个Event是SparkListenerLogStart,最后一个Event是SparkListenerApplicationEnd

简单分析Spark Event Log

可以看到Spark Event Log是JSON文件,我们可以使用Spark读取一下,简单分析一下Spark Event Log。

import org.apache.spark.sql.SparkSession

object ReadEventLog {

    def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local").appName("readEventLog").getOrCreate()

        val df = spark.read.json("file:///Users/darren.zhang/test/local-1629712726102")
        df.printSchema()

        df.select("Event").groupBy("Event").count.show(20, false)

        spark.stop()
    }
}
root
 |-- App ID: string (nullable = true)
 |-- App Name: string (nullable = true)
 |-- Block Manager ID: struct (nullable = true)
 |    |-- Executor ID: string (nullable = true)
 |    |-- Host: string (nullable = true)
 |    |-- Port: long (nullable = true)
 |-- Classpath Entries: struct (nullable = true)
 |    |-- /Users/darren.zhang/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.core/jackson-core/2.7.9/9b530cec4fd2eb841ab8e79f19fc7cf0ec487b2/jackson-core-2.7.9.jar: string (nullable = true)
 |-- Completion Time: long (nullable = true)
 |-- Event: string (nullable = true)
 |-- Executor ID: string (nullable = true)
 |-- Executor Info: struct (nullable = true)
 |    |-- Host: string (nullable = true)
 |    |-- Total Cores: long (nullable = true)
 |-- JVM Information: struct (nullable = true)
 |    |-- Java Home: string (nullable = true)
 |    |-- Java Version: string (nullable = true)
 |    |-- Scala Version: string (nullable = true)
 |-- Job ID: long (nullable = true)
 |-- Job Result: struct (nullable = true)
 |    |-- Result: string (nullable = true)
 |-- Maximum Memory: long (nullable = true)
 |-- Maximum Offheap Memory: long (nullable = true)
 |-- Maximum Onheap Memory: long (nullable = true)
 |-- Properties: struct (nullable = true)
 |    |-- spark.app.id: string (nullable = true)
 |    |-- spark.app.name: string (nullable = true)
 |-- Spark Properties: struct (nullable = true)
 |    |-- spark.app.id: string (nullable = true)
 |    |-- spark.app.name: string (nullable = true)
 |-- Spark Version: string (nullable = true)
 |-- Stage Attempt ID: long (nullable = true)
 |-- Stage ID: long (nullable = true)
 |-- Stage IDs: array (nullable = true)
 |    |-- element: long (containsNull = true)
 |-- Stage Info: struct (nullable = true)
 |    |-- Accumulables: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- Count Failed Values: boolean (nullable = true)
 |    |    |    |-- ID: long (nullable = true)
 |    |    |    |-- Internal: boolean (nullable = true)
 |    |    |    |-- Metadata: string (nullable = true)
 |    |    |    |-- Name: string (nullable = true)
 |    |    |    |-- Value: string (nullable = true)
 |    |-- Completion Time: long (nullable = true)
 |    |-- Details: string (nullable = true)
 |    |-- Number of Tasks: long (nullable = true)
 |    |-- Parent IDs: array (nullable = true)
 |    |    |-- element: long (containsNull = true)
 |    |-- RDD Info: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- Callsite: string (nullable = true)
 |    |    |    |-- Disk Size: long (nullable = true)
 |    |    |    |-- Memory Size: long (nullable = true)
 |    |    |    |-- Name: string (nullable = true)
 |    |    |    |-- Number of Cached Partitions: long (nullable = true)
 |    |    |    |-- Number of Partitions: long (nullable = true)
 |    |    |    |-- Parent IDs: array (nullable = true)
 |    |    |    |    |-- element: long (containsNull = true)
 |    |    |    |-- RDD ID: long (nullable = true)
 |    |    |    |-- Scope: string (nullable = true)
 |    |    |    |-- Storage Level: struct (nullable = true)
 |    |    |    |    |-- Deserialized: boolean (nullable = true)
 |    |    |    |    |-- Replication: long (nullable = true)
 |    |    |    |    |-- Use Disk: boolean (nullable = true)
 |    |    |    |    |-- Use Memory: boolean (nullable = true)
 |    |-- Stage Attempt ID: long (nullable = true)
 |    |-- Stage ID: long (nullable = true)
 |    |-- Stage Name: string (nullable = true)
 |    |-- Submission Time: long (nullable = true)
 |-- Stage Infos: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- Accumulables: array (nullable = true)
 |    |    |    |-- element: string (containsNull = true)
 |    |    |-- Details: string (nullable = true)
 |    |    |-- Number of Tasks: long (nullable = true)
 |    |    |-- Parent IDs: array (nullable = true)
 |    |    |    |-- element: long (containsNull = true)
 |    |    |-- RDD Info: array (nullable = true)
 |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |-- Callsite: string (nullable = true)
 |    |    |    |    |-- Disk Size: long (nullable = true)
 |    |    |    |    |-- Memory Size: long (nullable = true)
 |    |    |    |    |-- Name: string (nullable = true)
 |    |    |    |    |-- Number of Cached Partitions: long (nullable = true)
 |    |    |    |    |-- Number of Partitions: long (nullable = true)
 |    |    |    |    |-- Parent IDs: array (nullable = true)
 |    |    |    |    |    |-- element: long (containsNull = true)
 |    |    |    |    |-- RDD ID: long (nullable = true)
 |    |    |    |    |-- Scope: string (nullable = true)
 |    |    |    |    |-- Storage Level: struct (nullable = true)
 |    |    |    |    |    |-- Deserialized: boolean (nullable = true)
 |    |    |    |    |    |-- Replication: long (nullable = true)
 |    |    |    |    |    |-- Use Disk: boolean (nullable = true)
 |    |    |    |    |    |-- Use Memory: boolean (nullable = true)
 |    |    |-- Stage Attempt ID: long (nullable = true)
 |    |    |-- Stage ID: long (nullable = true)
 |    |    |-- Stage Name: string (nullable = true)
 |-- Submission Time: long (nullable = true)
 |-- System Properties: struct (nullable = true)
 |    |-- awt.toolkit: string (nullable = true)
 |-- Task End Reason: struct (nullable = true)
 |    |-- Reason: string (nullable = true)
 |-- Task Info: struct (nullable = true)
 |    |-- Accumulables: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- Count Failed Values: boolean (nullable = true)
 |    |    |    |-- ID: long (nullable = true)
 |    |    |    |-- Internal: boolean (nullable = true)
 |    |    |    |-- Metadata: string (nullable = true)
 |    |    |    |-- Name: string (nullable = true)
 |    |    |    |-- Update: string (nullable = true)
 |    |    |    |-- Value: string (nullable = true)
 |    |-- Attempt: long (nullable = true)
 |    |-- Executor ID: string (nullable = true)
 |    |-- Failed: boolean (nullable = true)
 |    |-- Finish Time: long (nullable = true)
 |    |-- Getting Result Time: long (nullable = true)
 |    |-- Host: string (nullable = true)
 |    |-- Index: long (nullable = true)
 |    |-- Killed: boolean (nullable = true)
 |    |-- Launch Time: long (nullable = true)
 |    |-- Locality: string (nullable = true)
 |    |-- Speculative: boolean (nullable = true)
 |    |-- Task ID: long (nullable = true)
 |-- Task Metrics: struct (nullable = true)
 |    |-- Disk Bytes Spilled: long (nullable = true)
 |    |-- Executor CPU Time: long (nullable = true)
 |    |-- Executor Deserialize CPU Time: long (nullable = true)
 |    |-- Executor Deserialize Time: long (nullable = true)
 |    |-- Executor Run Time: long (nullable = true)
 |    |-- Input Metrics: struct (nullable = true)
 |    |    |-- Bytes Read: long (nullable = true)
 |    |    |-- Records Read: long (nullable = true)
 |    |-- JVM GC Time: long (nullable = true)
 |    |-- Memory Bytes Spilled: long (nullable = true)
 |    |-- Output Metrics: struct (nullable = true)
 |    |    |-- Bytes Written: long (nullable = true)
 |    |    |-- Records Written: long (nullable = true)
 |    |-- Result Serialization Time: long (nullable = true)
 |    |-- Result Size: long (nullable = true)
 |    |-- Shuffle Read Metrics: struct (nullable = true)
 |    |    |-- Fetch Wait Time: long (nullable = true)
 |    |    |-- Local Blocks Fetched: long (nullable = true)
 |    |    |-- Local Bytes Read: long (nullable = true)
 |    |    |-- Remote Blocks Fetched: long (nullable = true)
 |    |    |-- Remote Bytes Read: long (nullable = true)
 |    |    |-- Remote Bytes Read To Disk: long (nullable = true)
 |    |    |-- Total Records Read: long (nullable = true)
 |    |-- Shuffle Write Metrics: struct (nullable = true)
 |    |    |-- Shuffle Bytes Written: long (nullable = true)
 |    |    |-- Shuffle Records Written: long (nullable = true)
 |    |    |-- Shuffle Write Time: long (nullable = true)
 |    |-- Updated Blocks: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |-- Task Type: string (nullable = true)
 |-- Timestamp: long (nullable = true)
 |-- User: string (nullable = true)
 |-- accumUpdates: array (nullable = true)
 |    |-- element: array (containsNull = true)
 |    |    |-- element: long (containsNull = true)
 |-- description: string (nullable = true)
 |-- details: string (nullable = true)
 |-- executionId: long (nullable = true)
 |-- physicalPlanDescription: string (nullable = true)
 |-- sparkPlanInfo: struct (nullable = true)
 |    |-- children: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |-- metrics: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- accumulatorId: long (nullable = true)
 |    |    |    |-- metricType: string (nullable = true)
 |    |    |    |-- name: string (nullable = true)
 |    |-- nodeName: string (nullable = true)
 |    |-- simpleString: string (nullable = true)
 |-- time: long (nullable = true)

Schema太长,删掉了一部分。

+-----------------------------------------------------------------+-----+
|Event                                                            |count|
+-----------------------------------------------------------------+-----+
|org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd   |3    |
|SparkListenerTaskStart                                           |5    |
|SparkListenerBlockManagerAdded                                   |1    |
|SparkListenerJobStart                                            |4    |
|SparkListenerStageCompleted                                      |5    |
|SparkListenerJobEnd                                              |4    |
|SparkListenerApplicationEnd                                      |1    |
|org.apache.spark.sql.execution.ui.SparkListenerDriverAccumUpdates|2    |
|SparkListenerLogStart                                            |1    |
|SparkListenerExecutorAdded                                       |1    |
|org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart |3    |
|SparkListenerEnvironmentUpdate                                   |1    |
|SparkListenerStageSubmitted                                      |5    |
|SparkListenerTaskEnd                                             |5    |
|SparkListenerApplicationStart                                    |1    |
+-----------------------------------------------------------------+-----+

Spark Event Log在History Server显示是什么样的

在这里插入图片描述
在这里插入图片描述

是谁在写Event Log

org.apache.spark.scheduler.EventLoggingListener,我们来看一下写日志的方法:

  /** Log the event as JSON. */
  private def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false) {
    val eventJson = JsonProtocol.sparkEventToJson(event)
    // scalastyle:off println
    // 把JSON格式的日志写到OutPutStream中
    writer.foreach(_.println(compact(render(eventJson))))
    // scalastyle:on println
    // 把OutPutStream中的日志写到磁盘文件
    if (flushLogger) {
      writer.foreach(_.flush())
      hadoopDataStream.foreach(_.hflush())
    }
    if (testing) {
      loggedEvents += eventJson
    }
  }

下一节来看EventLoggingListener具体是怎么工作的。

参考

  1. https://github.com/LucaCanali/Miscellaneous/blob/master/Spark_Notes/Spark_EventLog.md
  2. https://blog.csdn.net/littlePYR/article/details/104621255
  大数据 最新文章
实现Kafka至少消费一次
亚马逊云科技:还在苦于ETL?Zero ETL的时代
初探MapReduce
【SpringBoot框架篇】32.基于注解+redis实现
Elasticsearch:如何减少 Elasticsearch 集
Go redis操作
Redis面试题
专题五 Redis高并发场景
基于GBase8s和Calcite的多数据源查询
Redis——底层数据结构原理
上一篇文章      下一篇文章      查看所有文章
加:2021-08-24 15:37:49  更:2021-08-24 15:38:55 
 
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁

360图书馆 购物 三丰科技 阅读网 日历 万年历 2024年11日历 -2024/11/23 13:13:37-

图片自动播放器
↓图片自动播放器↓
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
图片批量下载器
↓批量下载图片,美女图库↓
  网站联系: qq:121756557 email:121756557@qq.com  IT数码