2016-11-04 48 views

Spark hangs after writing JSON from Cassandra

I have a driver program that uses Spark to read data from Cassandra, perform some operations, and then write JSON out to S3. The program ran fine with Spark 1.6.1 and spark-cassandra-connector 1.6.0-M1. However, when I try to upgrade to Spark 2.0.1 (Hadoop 2.7.1) and spark-cassandra-connector 2.0.0-M3, the job completes in the sense that all expected files are written to S3, but the program never terminates.

I call sc.stop() at the end of the program. I am also using Mesos 1.0.1. In both cases I am using the default output committer.

Edit: looking at the thread dumps below, it now seems the driver may be waiting on: org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner

Code snippet:

// get MongoDB oplog operations 
val operations = sc.cassandraTable[JsonOperation](keyspace, namespace) 
    .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp) 

// replay oplog operations into documents 
val documents = operations 
    .spanBy(op => op.id) 
    .map { case (id, ops) => (id, apply(ops)) } 
    .collect { case (id, document: Document) => 
      MergedDocument(id = id, document = document) 
    } 

// write documents to json on s3 
documents 
    .map(document => document.toJson) 
    .coalesce(partitions) 
    .saveAsTextFile(path, classOf[GzipCodec]) 
sc.stop() 
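
A common workaround when a driver finishes all its work but the JVM never exits (a lingering non-daemon thread, such as the Hadoop FileSystem statistics cleaner noted above, can keep it alive) is to stop the context in a finally block and then force the JVM down. A minimal sketch, assuming a standalone main method; the object name and exit code are illustrative, not from the original program:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ExportJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cassandra-to-s3"))
    try {
      // ... read from Cassandra, transform, saveAsTextFile to S3 ...
    } finally {
      sc.stop()   // release Spark resources
      sys.exit(0) // force the JVM down even if a non-daemon thread lingers
    }
  }
}
```

sys.exit is a blunt instrument, but it guarantees termination while the underlying thread-leak is being tracked down.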

Thread dump from the driver:

60 context-cleaner-periodic-gc TIMED_WAITING 
46 dag-scheduler-event-loop WAITING 
4389 DestroyJavaVM RUNNABLE 
12 dispatcher-event-loop-0 WAITING 
13 dispatcher-event-loop-1 WAITING 
14 dispatcher-event-loop-2 WAITING 
15 dispatcher-event-loop-3 WAITING 
47 driver-revive-thread TIMED_WAITING 
3 Finalizer WAITING 
82 ForkJoinPool-1-worker-17 WAITING 
43 heartbeat-receiver-event-loop-thread TIMED_WAITING 
93 java-sdk-http-connection-reaper TIMED_WAITING 
4387 java-sdk-progress-listener-callback-thread WAITING 
25 map-output-dispatcher-0 WAITING 
26 map-output-dispatcher-1 WAITING 
27 map-output-dispatcher-2 WAITING 
28 map-output-dispatcher-3 WAITING 
29 map-output-dispatcher-4 WAITING 
30 map-output-dispatcher-5 WAITING 
31 map-output-dispatcher-6 WAITING 
32 map-output-dispatcher-7 WAITING 
48 MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE 
44 netty-rpc-env-timeout TIMED_WAITING 
92 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING 
62 pool-19-thread-1 TIMED_WAITING 
2 Reference Handler WAITING 
61 Scheduler-1112394071 TIMED_WAITING 
20 shuffle-server-0 RUNNABLE 
55 shuffle-server-0 RUNNABLE 
21 shuffle-server-1 RUNNABLE 
56 shuffle-server-1 RUNNABLE 
22 shuffle-server-2 RUNNABLE 
57 shuffle-server-2 RUNNABLE 
23 shuffle-server-3 RUNNABLE 
58 shuffle-server-3 RUNNABLE 
4 Signal Dispatcher RUNNABLE 
59 Spark Context Cleaner TIMED_WAITING 
9 SparkListenerBus WAITING 
35 SparkUI-35 RUNNABLE 
36 SparkUI-36 ServerConnector@3b5eaf92{HTTP/1.1}{0.0.0.0:4040} RUNNABLE 
37 SparkUI-37 RUNNABLE 
38 SparkUI-38 TIMED_WAITING 
39 SparkUI-39 TIMED_WAITING 
40 SparkUI-40 TIMED_WAITING 
41 SparkUI-41 RUNNABLE 
42 SparkUI-42 TIMED_WAITING 
438 task-result-getter-0 WAITING 
450 task-result-getter-1 WAITING 
489 task-result-getter-2 WAITING 
492 task-result-getter-3 WAITING 
75 threadDeathWatcher-2-1 TIMED_WAITING 
45 Timer-0 WAITING 

Thread dump from the executors (it is the same for all of them):

24 dispatcher-event-loop-0 WAITING 
25 dispatcher-event-loop-1 WAITING 
26 dispatcher-event-loop-2 RUNNABLE 
27 dispatcher-event-loop-3 WAITING 
39 driver-heartbeater TIMED_WAITING 
3 Finalizer WAITING 
58 java-sdk-http-connection-reaper TIMED_WAITING 
75 java-sdk-progress-listener-callback-thread WAITING 
1 main TIMED_WAITING 
33 netty-rpc-env-timeout TIMED_WAITING 
55 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING 
59 pool-17-thread-1 TIMED_WAITING 
2 Reference Handler WAITING 
28 shuffle-client-0 RUNNABLE 
35 shuffle-client-0 RUNNABLE 
41 shuffle-client-0 RUNNABLE 
37 shuffle-server-0 RUNNABLE 
5 Signal Dispatcher RUNNABLE 
23 threadDeathWatcher-2-1 TIMED_WAITING 

Can you show us your code? –


You can use "Thread dump" on the Spark UI executors page and check what work was being done at that moment. – maasg

Answer


I resolved this by updating the following packages in my program jar:

  • spark 2.0.0 to 2.0.1
  • json4s 3.2.11 to 3.5.0
  • scallop 2.0.1 to 2.0.5
  • nscala-time 1.8.0 to 2.14.0
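
In sbt terms, the fix amounts to bumping these versions in build.sbt. A sketch, assuming the usual artifact coordinates for these libraries; the "provided" scope on spark-core is illustrative:

```scala
libraryDependencies ++= Seq(
  "org.apache.spark"       %% "spark-core"     % "2.0.1" % "provided",
  "org.json4s"             %% "json4s-jackson" % "3.5.0",
  "org.rogach"             %% "scallop"        % "2.0.5",
  "com.github.nscala-time" %% "nscala-time"    % "2.14.0"
)
```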