Scala/Spark: flatten a DataFrame using only RDD functions

I have the DataFrame below and want to flatten it using only RDD operations. Can anyone help?
Input DataFrame:
+---------+-------------+-----------------+-----+----------------+------------------------------------------------------+
|TPNB     |unitOfMeasure|locationReference|types|types           |effectiveDateTime                                     |
+---------+-------------+-----------------+-----+----------------+------------------------------------------------------+
|079562193|EA           |0810             |STORE|[SELLABLE, HELD]|[2015-10-09T00:55:23.6345Z, 2015-10-09T00:55:23.6345Z]|
+---------+-------------+-----------------+-----+----------------+------------------------------------------------------+
Output:
TPNB       unitOfMeasure  locationReference  types  types     effectiveDateTime
079562193  EA             0810               STORE  SELLABLE  2015-10-09T00:55:23.6345Z
079562193  EA             0810               STORE  HELD      2015-10-09T00:55:23.6345Z
I was thinking of something like this, but it doesn't seem to work.
final_output
  .map(value => ((value(0), value(1), value(2), value(3)), value(5), value(6)))
  .map { case (key, value) => value.map(records => (key, records)) }
`final_output.rdd` should give you the data as an RDD. Have you tried that?
Yes, I tried it. It didn't work.
What problem do you run into when you use `.rdd`?
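
For what it's worth, here is a minimal sketch of one RDD-only way to do this, not necessarily the only one. Two things stand out in the attempt above: the first `map` builds a three-element tuple that is then matched against a two-element `case (key, value)` pattern, and the columns are indexed as 5 and 6 although a six-column row only has positions 0 to 5. The sketch assumes every column holds strings (the two array columns being arrays of strings), that the two arrays in a row have the same length, and that Spark 2.x or later is available. The names `FlattenWithRdd`, `flattenRow`, `locationType` and `stockTypes` are hypothetical; the two `types` columns are renamed only in the sample data to avoid any ambiguity, and the code reads columns by position, so the real names do not matter.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object FlattenWithRdd {
  // Pair the two array columns element by element and emit one tuple per pair,
  // repeating the four scalar columns on every output record.
  def flattenRow(row: Row): Seq[(String, String, String, String, String, String)] = {
    val stockTypes = row.getSeq[String](4) // assumed: array of strings
    val dateTimes  = row.getSeq[String](5) // assumed: array of strings, same length
    stockTypes.zip(dateTimes).map { case (stockType, dateTime) =>
      (row.getString(0), row.getString(1), row.getString(2), row.getString(3),
        stockType, dateTime)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-with-rdd")
      .getOrCreate()
    import spark.implicits._

    // Sample data shaped like the question's input (column names are illustrative).
    val final_output = Seq(
      ("079562193", "EA", "0810", "STORE",
        Seq("SELLABLE", "HELD"),
        Seq("2015-10-09T00:55:23.6345Z", "2015-10-09T00:55:23.6345Z"))
    ).toDF("TPNB", "unitOfMeasure", "locationReference",
      "locationType", "stockTypes", "effectiveDateTime")

    // Drop to the RDD API, then flatMap so each input row yields one record per array element.
    val flattened = final_output.rdd.flatMap(flattenRow)

    flattened.collect().foreach(println)
    spark.stop()
  }
}
```

The key choice is `flatMap` instead of `map`: a `map` that returns a collection per row gives an RDD of collections, whereas `flatMap` unrolls them into individual records. If the arrays can differ in length, note that `zip` silently truncates to the shorter one.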