2017-01-27 81 views

My inner join on Spark DataFrames does not return the expected columns. I want the DataFrame equivalent of this SQL query:

SELECT DISTINCT a.aid, a.DId, a.BM, a.BY, b.TO FROM GetRaw a 
INNER JOIN DF_SD b ON a.aid = b.aid AND a.DId = b.DId AND a.BM = b.BM AND a.BY = b.BY

I converted it to:

val Pr = DF_SD.select("aid","DId","BM","BY","TO").distinct() 
.join(GetRaw, GetRaw("aid") <=> DF_SD("aid") 
&& GetRaw("DId") <=> DF_SD("DId") 
&& GetRaw("BM") <=> DF_SD("BM") 
&& GetRaw("BY") <=> DF_SD("BY")) 

My output table contains these columns:

"aid","DId","BM","BY","TO","aid","DId","BM","BY" 

Can anyone point out what I am doing wrong?


@安吉 You should correct the code snippet in your question. – FaigB

Answer


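Just apply the select and distinct after the join. Joining on a Seq of column names keeps a single copy of each join column, so the result has no duplicated columns: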

val Pr = DF_SD.join(GetRaw,Seq("aid","DId","BM","BY")) 
.select("aid","DId","BM","BY","TO").distinct
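To see the difference between the two join styles, here is a minimal self-contained sketch. The sample rows, the local SparkSession settings, and the object name `JoinColumnsSketch` are assumptions made up for illustration; only the DataFrame and column names come from the question. It joins once with a column expression and once with a Seq of column names, then prints the resulting column lists.

```scala
import org.apache.spark.sql.SparkSession

object JoinColumnsSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the real tables are not shown in the question.
    val GetRaw = Seq((1, 10, 1, 2017), (2, 20, 2, 2016))
      .toDF("aid", "DId", "BM", "BY")
    val DF_SD = Seq(
      (1, 10, 1, 2017, "x"),
      (1, 10, 1, 2017, "x"), // duplicate row, removed later by distinct
      (3, 30, 3, 2015, "y")
    ).toDF("aid", "DId", "BM", "BY", "TO")

    val keys = Seq("aid", "DId", "BM", "BY")

    // Expression join: the key columns of BOTH sides are kept in the result.
    val joinedByExpr = DF_SD.join(
      GetRaw,
      keys.map(k => DF_SD(k) === GetRaw(k)).reduce(_ && _)
    )

    // Name-based join: each key column appears once, so select is unambiguous.
    val Pr = DF_SD.join(GetRaw, keys)
      .select("aid", "DId", "BM", "BY", "TO")
      .distinct()

    println(joinedByExpr.columns.mkString(", ")) // nine columns, keys repeated
    println(Pr.columns.mkString(", "))           // aid, DId, BM, BY, TO
    Pr.show()

    spark.stop()
  }
}
```

Note that a join on a Seq of column names compares keys with plain equality, like `=` in the SQL query, whereas `<=>` in the question is the null-safe variant. If null keys must match each other, one option is to keep the expression join and select the columns through one DataFrame explicitly, for example `DF_SD("aid")`.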