2017-01-27 36 views
0

我不能夠正確地加入,並得到結果列,需要加入星火內部聯接,並得到MIN()

SELECT 
t.ad, 
t.DId, 
t.BY, 
t.BM, 
t.cid, 
MIN(p.PS) AS PS 
FROM 
    Tempity t inner join ples p 
    on t.cid = p.cid 
    and p.PType = t.TeO 
    AND p.pto = 'cccc' 
    AND p.cid = 2 
    GROUP BY t.aid 
    ,t.DId 
    ,t.BYear 
    ,t.BM 
    ,t.cid; 
I am converting above sql query as 
     val RS = Tempity.join(DF_LES,Tempity("cid") <=> DF_PLES("cid")&& DF_PLES("clientid") <=> 2 && Tempity("TO") <=> DF_LES("PType") && DF_LES("pto") <=> "cccc" ,"inner").select("aid","DId","BM","BY","TO","cid").groupBy(aid","DId","BM","BY")select("aid","DId","BM","BY","TO","cid").show 

不能找到出去之後獲得列的MIN(),我做錯了 錯誤

org.apache.spark.sql.AnalysisException: Reference 'cid' is ambiguous, could be: cid#4058, cid#13063L.; 

回答

0

使用Tempity("cid")而不是cid,因爲它是不明確的

import org.apache.spark.sql.functions._ //for min() 

val RS = Tempity.join(DF_LES, 
      Tempity("cid") <=> DF_PLES("cid") && 
      DF_PLES("clientid") <=> 2 && 
      Tempity("TO") <=> DF_PLES("PType") && 
      DF_PLES("pto") <=> "cccc", 
     "inner" 
    ) 
    .groupBy("a​id","DId","BM","BY", Tempity("cid"))‌​ 
    .agg(min(DF_PLES("PS"))) 

RS.show() 

另一種方法是你可以使用相同的SQL上SparkSession

val spark: SparkSession = SparkSession.builder.master("local").getOrCreate; 

//create tables from DataFrames 
Tempity.createOrReplaceTempView("Tempity") 
DF_PLES.createOrReplaceTempView("ples") 

import spark.sql 

//Now run the same SQL 

sql(""" 
    SELECT t.ad, t.DId, t.BY, t.BM, t.cid, MIN(p.PS) AS PS 
     FROM Tempity t 
    INNER JOIN ples p 
     ON t.cid = p.cid AND p.PType = t.TeO AND p.pto = 'cccc' AND p.cid = 2 
    GROUP BY t.ad, t.DId, t.BY, t.BM, t.cid 
    """) 
+0

':49:錯誤:價值選擇是不是org.apache.spark.sql.RelationalGroupedDataset' – Anji

+0

成員哪一個拋出的錯誤? 。請檢查多行字符串中刪除的額外報價。 – mrsrinivas

+0

誤差是被拋出'VAL RS = Tempity.join(DF_LES, Tempity( 「CID」)<=> DF_PLES( 「CID」)&& DF_PLES( 「客戶端ID」)<=> 2 && Tempity( 「TO」)<=> DF_PLES (「ptype」)&& DF_PLES(「pto」)<=>「cccc」, 「inner」 ) .groupBy(「a id」,「DId」,「BM」,「BY」,Tempity(「cid 「)) .agg(min(DF_PLES(」PS「))) '我想在DF上執行操作而不是使用sql(」「) – Anji