
PostgreSQL uuid[] to Cassandra: conversion error. It gives me java.lang.ClassCastException: [Ljava.util.UUID; cannot be cast to [Ljava.lang.String;

My job reads data from a PostgreSQL table that contains a column of type user_ids uuid[], and I get the error above when I try to save the data to Cassandra.

However, creating the same table directly in Cassandra works fine: user_ids list<text>

I cannot change the column type on the source table, because I am reading data from a legacy system.
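One workaround that avoids touching the source table is to push the cast into the JDBC query itself, so PostgreSQL returns text[] instead of uuid[] and Spark's JDBC reader only ever sees strings. A sketch under that assumption (the table name, column names, and connection URL below are placeholders, not from the original post):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("pg-to-cassandra").getOrCreate()

// Wrap the source table in a subquery that casts uuid[] -> text[].
// "legacy_users" and the column names are illustrative placeholders.
val query = "(SELECT id, user_ids::text[] AS user_ids FROM legacy_users) AS t"

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://localhost:5432/mydb") // placeholder URL
  .option("dbtable", query)
  .option("driver", "org.postgresql.Driver")
  .load()
```

With the cast done server-side, the user_ids column arrives as an array of strings, so the asInstanceOf[Array[java.lang.String]] inside JdbcUtils no longer fails.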

Searching through the logs, I tracked the failure down to this point in class org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils.scala:

```scala
case StringType =>
  (array: Object) =>
    array.asInstanceOf[Array[java.lang.String]]
      .map(UTF8String.fromString)
```

Debug view


Stack trace:

```
Caused by: java.lang.ClassCastException: [Ljava.util.UUID; cannot be cast to [Ljava.lang.String;
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$14.apply(JdbcUtils.scala:443)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$14.apply(JdbcUtils.scala:442)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$13$$anonfun$18.apply(JdbcUtils.scala:472)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$13$$anonfun$18.apply(JdbcUtils.scala:472)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$nullSafeConvert(JdbcUtils.scala:482)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$13.apply(JdbcUtils.scala:470)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$13.apply(JdbcUtils.scala:469)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:330)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:312)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:133)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:760)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
```

You could try casting the array, i.e. array.asInstanceOf[Array[UUID]], and then converting that new array to strings: newArray.map(_.toString) –
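The comment's suggestion can be sketched in plain Scala. The sample UUID value below is illustrative; the point is that the object handed back by the JDBC layer for a PostgreSQL uuid[] column has runtime type Array[java.util.UUID], so it must be cast to that type before converting each element to text:

```scala
import java.util.UUID

// What the JDBC layer actually returns for a PostgreSQL uuid[] column:
// an Object whose runtime type is Array[java.util.UUID].
val array: Object = Array(
  UUID.fromString("123e4567-e89b-12d3-a456-426614174000")
)

// Cast to the real runtime type first, then map each element to its text form.
val newArray = array.asInstanceOf[Array[UUID]]
val strings  = newArray.map(_.toString)
```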

Answers


Please see the CQL data type support here.

You should have created list<uuid> instead of list<text> in your table schema. The Java driver cannot handle this conversion automatically.
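For example, declaring the list with the uuid element type lets the driver bind java.util.UUID values directly (the table and column names below are illustrative):

```
CREATE TABLE users_by_source (
    id       uuid PRIMARY KEY,
    user_ids list<uuid>   -- matches java.util.UUID on the client side
);
```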

If you want to keep text, convert the values to strings before sending them to the driver.


Thanks! I would do that, but I am trying to handle this conversion with a single generic job, and the type was created automatically by the driver. It's a shame, but I'll have to write a dedicated job for this table. Thanks again. –


The user_id values you store in the database are of type UUID; the equivalent type in Java is java.util.UUID.

So instead of java.lang.String, you should use an array or list of java.util.UUID, and then call uuid_obj.toString() on each element when storing into Cassandra.
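A minimal sketch of that approach, assuming the uuid[] column arrives as an Array[java.util.UUID] (the sample values are placeholders):

```scala
import java.util.UUID

// Hypothetical values as read from the PostgreSQL uuid[] column.
val userIds: Array[UUID] = Array(
  UUID.fromString("00000000-0000-0000-0000-000000000001"),
  UUID.fromString("00000000-0000-0000-0000-000000000002")
)

// Call toString on each UUID so the result can be bound to a
// Cassandra list<text> column.
val userIdStrings: List[String] = userIds.map(_.toString).toList
```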