I am trying to get the cross join of two DataFrames. I am using Spark 2.0. How can I implement a cross join with two DataFrames?
Edit:
val df=df.join(df_t1, df("Col1")===df_t1("col")).join(df2,joinType=="cross join").where(df("col2")===df2("col2"))
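Call join with the other DataFrame without specifying a join condition.
Have a look at the following example. Given a first DataFrame of people: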
+---+------+-------+------+
| id| name| mail|idArea|
+---+------+-------+------+
| 1| Jack|[email protected]| 1|
| 2|Valery|[email protected]| 1|
| 3| Karl|[email protected]| 2|
| 4| Nick|[email protected]| 2|
| 5| Luke|[email protected]| 3|
| 6| Marek|[email protected]| 3|
+---+------+-------+------+
and a second DataFrame of areas:
+------+--------------+
|idArea| areaName|
+------+--------------+
| 1|Amministration|
| 2| Public|
| 3| Store|
+------+--------------+
the cross join is simply given by:
val cross = people.join(area)
+---+------+-------+------+------+--------------+
| id| name| mail|idArea|idArea| areaName|
+---+------+-------+------+------+--------------+
| 1| Jack|[email protected]| 1| 1|Amministration|
| 1| Jack|[email protected]| 1| 3| Store|
| 1| Jack|[email protected]| 1| 2| Public|
| 2|Valery|[email protected]| 1| 1|Amministration|
| 2|Valery|[email protected]| 1| 3| Store|
| 2|Valery|[email protected]| 1| 2| Public|
| 3| Karl|[email protected]| 2| 1|Amministration|
| 3| Karl|[email protected]| 2| 2| Public|
| 3| Karl|[email protected]| 2| 3| Store|
| 4| Nick|[email protected]| 2| 3| Store|
| 4| Nick|[email protected]| 2| 2| Public|
| 4| Nick|[email protected]| 2| 1|Amministration|
| 5| Luke|[email protected]| 3| 2| Public|
| 5| Luke|[email protected]| 3| 3| Store|
| 5| Luke|[email protected]| 3| 1|Amministration|
| 6| Marek|[email protected]| 3| 1|Amministration|
| 6| Marek|[email protected]| 3| 2| Public|
| 6| Marek|[email protected]| 3| 3| Store|
+---+------+-------+------+------+--------------+
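For anyone who wants to reproduce the example above end to end, here is a minimal sketch. The object name `CrossJoinSketch`, the helper `crossOf`, and the `@example.com` addresses are hypothetical placeholders (the real addresses are masked in the tables above). Depending on the Spark version, a join without a condition may be rejected unless `spark.sql.crossJoin.enabled` is set to `true`, so the sketch sets that option as a precaution; treat that as an assumption about your setup rather than a requirement.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CrossJoinSketch {
  // Rebuilds the two example DataFrames and returns their Cartesian product.
  // The mail values are hypothetical placeholders, not the original data.
  def crossOf(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val people = Seq(
      (1, "Jack",   "jack@example.com",   1),
      (2, "Valery", "valery@example.com", 1),
      (3, "Karl",   "karl@example.com",   2),
      (4, "Nick",   "nick@example.com",   2),
      (5, "Luke",   "luke@example.com",   3),
      (6, "Marek",  "marek@example.com",  3)
    ).toDF("id", "name", "mail", "idArea")

    val area = Seq(
      (1, "Amministration"),
      (2, "Public"),
      (3, "Store")
    ).toDF("idArea", "areaName")

    // join with no condition: every row of people paired with every row of area
    people.join(area)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cross-join-sketch")
      .master("local[*]")
      // Precaution: some Spark versions refuse implicit Cartesian products otherwise
      .config("spark.sql.crossJoin.enabled", "true")
      .getOrCreate()

    val cross = crossOf(spark)
    cross.show()
    println(cross.count()) // 6 people x 3 areas = 18 rows

    spark.stop()
  }
}
```

Note that the result keeps both `idArea` columns, exactly as in the output above, so rename or drop one of them if you need to refer to it by name afterwards.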
Upgrade to a recent version of spark-sql_2.11 (2.1.0) and use the Dataset function .crossJoin.
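A rough sketch of what that could look like, assuming Spark 2.1.0 or later, where `Dataset.crossJoin` is available. The object name `ExplicitCrossJoinSketch`, the helpers `inputs` and `crossThenFilter`, and the sample rows are all hypothetical; `df` and `df2` only stand in for the DataFrames from the question, which are not shown there.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ExplicitCrossJoinSketch {
  // Hypothetical stand-ins for the question's df and df2; both have a column "col2".
  def inputs(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._
    val df  = Seq((1, "a"), (2, "b")).toDF("col1", "col2")
    val df2 = Seq(("a", 10), ("c", 30)).toDF("col2", "value")
    (df, df2)
  }

  // Explicit Cartesian product, then a filter on the shared column,
  // mirroring the where(...) clause from the question.
  def crossThenFilter(df: DataFrame, df2: DataFrame): DataFrame =
    df.crossJoin(df2).where(df("col2") === df2("col2"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("explicit-cross-join-sketch")
      .master("local[*]")
      .getOrCreate()

    val (df, df2) = inputs(spark)

    df.crossJoin(df2).show()        // 2 x 2 = 4 rows
    crossThenFilter(df, df2).show() // only the pair where col2 matches ("a")

    spark.stop()
  }
}
```

A cross join followed by an equality filter is logically the same as an inner join on that column, so if the filter is always present, `df.join(df2, df("col2") === df2("col2"))` expresses the intent more directly.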
Show us what you have tried. ... –
val df = df.join(df_t1, df("Col1") === df_t1("col")).join(df2, joinType == "cross join").where(df("col2") === df2("col2")) – Miruthan