2015-12-02 113 views

ClassNotFoundException when recovering from a streaming Kafka checkpoint

I use spark-streaming-kafka checkpointing to store processed Kafka offsets in a folder on HDFS. After restarting the application (with spark-submit), checkpoint recovery throws a ClassNotFoundException for a class that belongs to the spark-streaming-kafka module and is packaged into my application's uber jar. It looks as if this class is not being looked up in my application jar.

Using Spark 1.5.1.
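For context, a minimal sketch of the checkpoint-recovery pattern being used here, written against the Spark 1.5.x direct-stream API. The class name, broker list, topic, and checkpoint path are placeholders, not taken from the original post:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object CheckpointedApp {
  // Placeholder checkpoint location on HDFS
  val checkpointDir = "hdfs:///user/checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpointed-kafka-app")
    val ssc = new StreamingContext(conf, Seconds(30))
    ssc.checkpoint(checkpointDir)

    // Placeholder broker and topic names
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("my-topic"))

    stream.foreachRDD(rdd => println(s"batch size: ${rdd.count()}"))
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart, getOrCreate deserializes the checkpoint first; that is the
    // step where the ClassNotFoundException for OffsetRange is thrown.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The failure happens inside `StreamingContext.getOrCreate`, before any user code from `createContext` runs, which is consistent with the stack trace below ending at `StreamingContext$.getOrCreate`.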

15/12/02 15:42:30 INFO streaming.CheckpointReader: Attempting to load checkpoint from file hdfs://ip-xxx-xx-xx-xx:8020/user/checkpoint-1449064500000 
15/12/02 15:42:30 WARN streaming.CheckpointReader: Error reading checkpoint from file hdfs://ip-xxx-xx-xx-xx:8020/user/checkpoint-1449064500000 
java.io.IOException: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.OffsetRange 
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163) 
    at org.apache.spark.streaming.DStreamGraph.readObject(DStreamGraph.scala:188) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) 
    at org.apache.spark.streaming.Checkpoint$$anonfun$deserialize$2.apply(Checkpoint.scala:151) 
    at org.apache.spark.streaming.Checkpoint$$anonfun$deserialize$2.apply(Checkpoint.scala:141) 
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1206) 
    at org.apache.spark.streaming.Checkpoint$.deserialize(Checkpoint.scala:154) 
    at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:329) 
    at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:325) 
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) 
    at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35) 
    at org.apache.spark.streaming.CheckpointReader$.read(Checkpoint.scala:325) 
    at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:852) 
... 

Answer


Update: found an open bug for this — SPARK-5569 (https://github.com/apache/spark/pull/8955).

After applying the code changes from the proposed commit and rebuilding spark-assembly, it now works.
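If rebuilding spark-assembly is not an option, a workaround often reported for this class of failure is to put the jar containing the missing class on the driver's classpath explicitly, since the checkpoint is deserialized in the driver before the application jar's classes are visible. A sketch, with placeholder class name and jar paths (this is a config fragment, not a tested command):

```shell
# Placeholder paths and class name; adjust to your build.
spark-submit \
  --class com.example.CheckpointedApp \
  --master yarn-cluster \
  --conf spark.driver.extraClassPath=/path/to/app-uber.jar \
  /path/to/app-uber.jar
```

`spark.driver.extraClassPath` prepends the jar to the driver's classpath at JVM startup, so classes such as `org.apache.spark.streaming.kafka.OffsetRange` are resolvable during checkpoint deserialization.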


Could you please share which code changes you made to recover from this? Many thanks – nish
