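Problem:
I have an RDD of the form Array[Array[String]], and I need the combinations of the strings inside each inner array. However, when I apply the map function I get the following error: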
java.io.NotSerializableException: scala.collection.TraversableOnce$FlattenOps$$anon$1
Serialization stack:
- object not serializable (class: scala.collection.TraversableOnce$FlattenOps$$anon$1, value: non-empty iterator)
- element of array (index: 0)
- array (class [Lscala.collection.Iterator;, size 10)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:324)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Background:
Initially I had the following:
Array[org.apache.spark.sql.Row] = Array([cyber crimes ;; cyber security ;; review ;; india ;; instances ;; state ;; issue], [civil rights ;; case ;; instances ;; frequency])
I then cleaned it with the following code:
words.map(r => r(0).asInstanceOf[String].split("\\;;").map(_.trim))
The result is as follows:
Array[Array[String]] = Array(Array(cyber crimes, cyber security, review, india, instances, state, issue), Array(civil rights, case, instances, frequency))
Now I need all possible combinations of the strings within each array, like this:
Array[Array[String]] = Array(Array((cyber crimes, cyber security), (review, india), (instances, state), (issue,cyber crimes))....etc)
When I apply map to it, I get the error shown above:
val combinations = cleanwords.map(r => r(0).asInstanceOf[String].combinations(2))
Can anyone help me get the desired result?
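One possible direction, offered as a sketch rather than a confirmed fix: in Scala, combinations(2) returns a lazy Iterator, and iterators are not serializable, which matches the "non-empty iterator" value in the stack trace. The example below uses plain Scala collections instead of an RDD so that it runs on its own. The object name CombinationsSketch, the helper pairCombinations, and the sample data are hypothetical and not from the original code.

```scala
object CombinationsSketch {
  // Hypothetical helper: build every 2-element combination of one inner array.
  // combinations(2) yields an Iterator[Array[String]]; each pair is turned into
  // a tuple and the iterator is materialized with toArray, so the result holds
  // concrete values rather than a lazy iterator.
  def pairCombinations(words: Array[String]): Array[(String, String)] =
    words.combinations(2).map { case Array(a, b) => (a, b) }.toArray

  def main(args: Array[String]): Unit = {
    // Sample data in the shape of the cleaned result (shortened for illustration).
    val cleaned: Array[Array[String]] = Array(
      Array("cyber crimes", "cyber security", "review"),
      Array("civil rights", "case")
    )

    // One array of pairs per inner array.
    val combos: Array[Array[(String, String)]] = cleaned.map(pairCombinations)
    combos.foreach(row => println(row.mkString(", ")))
  }
}
```

Assuming cleanwords holds the Array[String] rows shown above, the same helper could presumably be passed to map on it, for example cleanwords.map(CombinationsSketch.pairCombinations). The key point is the trailing toArray, which forces the iterator before Spark tries to serialize the result.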