2017-08-25

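Select data from Hive where a column value is in a list

I want to fetch data from Hive: specifically, select the rows whose value in one column appears in a given list.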

The sample data in the Hive table is:

Col1   | Col2 | Col3
-------+------+--------
Joe    | 32   | Place-1
Nancy  | 28   | Place-2
Shalyn | 35   | Place-1
Andy   | 20   | Place-3

This is how I query the Hive table:

val name = List("Sherley","Joe","Shalyan","Dan") 
var dataFromHive = sqlCon.sql("select Col1,Col2,Col3 from default.NameInfo where Col1 in (${name})") 

I know my query is wrong because it throws an error, but I cannot work out the correct replacement for where Col1 in (${name}).
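For reference, here is a minimal plain-Scala sketch of why a query string built this way is rejected. The object name `InterpolationSketch` is hypothetical and no Spark is needed to run it. Without the `s` prefix the literal is sent as-is, and with the prefix the list is rendered through `toString`, which is still not a valid SQL list.

```scala
object InterpolationSketch {
  val name = List("Sherley", "Joe", "Shalyan", "Dan")

  // No `s` prefix: an ordinary string literal, so the text ${name} reaches the SQL parser unchanged
  val plain: String = "where Col1 in (${name})"

  // With the `s` prefix the list is interpolated via List#toString: no quotes, and a List(...) wrapper
  val interpolated: String = s"where Col1 in (${name})"

  def main(args: Array[String]): Unit = {
    println(plain)        // where Col1 in (${name})
    println(interpolated) // where Col1 in (List(Sherley, Joe, Shalyan, Dan))
  }
}
```

Either way the parser never sees a quoted, comma-separated list such as ('Sherley','Joe'), which is what an IN clause on a string column needs.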

See https://stackoverflow.com/questions/40218473/spark-sql-in-clause/40218776#40218776

Answers

0

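A better idea is to convert name into a DataFrame and join it with the Hive table to produce dataFromHive. An inner join keeps only the rows that match on both sides, which has the same effect as the filter.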

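import sqlCon.implicits._ // provides toDF on local collections (assumes sqlCon is a val holding your SQLContext/SparkSession)
val nameDf = List("Sherley","Joe","Shalyan","Dan").toDF("Col1")
// Inner join on Col1 keeps only the Hive rows whose Col1 appears in nameDf
var dataFromHive = sqlCon.table("default.NameInfo").join(nameDf, "Col1").select("Col1", "Col2", "Col3")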

Try using the DataFrame API; it makes the code easier to read.
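Staying with the DataFrame API, the membership test can also be written with `Column.isin` instead of a join. The sketch below is only an illustration: it assumes Spark 2.x or later on the classpath, builds a local stand-in for default.NameInfo from the sample rows in the question rather than reading Hive, and the names `IsinSketch` and `filterByNames` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object IsinSketch {
  // Keeps only the rows whose Col1 value appears in `names`
  def filterByNames(df: DataFrame, names: Seq[String]): DataFrame =
    df.filter(col("Col1").isin(names: _*))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would read the Hive table instead
    val spark = SparkSession.builder().master("local[*]").appName("isin-sketch").getOrCreate()
    import spark.implicits._

    // Stand-in for default.NameInfo, copied from the sample data in the question
    val nameInfo = Seq(
      ("Joe", 32, "Place-1"),
      ("Nancy", 28, "Place-2"),
      ("Shalyn", 35, "Place-1"),
      ("Andy", 20, "Place-3")
    ).toDF("Col1", "Col2", "Col3")

    val names = List("Sherley", "Joe", "Shalyan", "Dan")
    filterByNames(nameInfo, names).select("Col1", "Col2", "Col3").show()

    spark.stop()
  }
}
```

With the sample rows, only the Joe row passes the filter, because the list contains "Shalyan" while the table holds "Shalyn". For a short list `isin` avoids the join entirely; for a very large list the join shown above may be the better fit.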

0

As an alternative to the DataFrame API, convert your list into a string formatted for use in the Hive query:

val name = List("Sherley","Joe","Shalyan","Dan") 
val name_string = name.mkString("('","','", "')") 
//name_string: String = ('Sherley','Joe','Shalyan','Dan') 

var dataFromHive = sqlCon.sql("select Col1,Col2,Col3 from default.NameInfo where Col1 in " + name_string)
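One caveat with building the list by plain concatenation is that a value containing a single quote would break the query. The sketch below shows one possible guard, not part of the original answer: `toInList` is a hypothetical helper, and it assumes the query goes through Spark SQL's default parser, where a backslash escapes the next character inside a string literal. It is plain Scala and runs without Spark.

```scala
object InClauseSketch {
  // Hypothetical helper: quotes each value and backslash-escapes embedded backslashes and single quotes.
  // Assumption: the SQL parser treats backslash as the escape character inside string literals.
  def toInList(values: Seq[String]): String = {
    require(values.nonEmpty, "an IN list needs at least one value")
    values
      .map(v => "'" + v.replace("\\", "\\\\").replace("'", "\\'") + "'")
      .mkString("(", ",", ")")
  }

  def main(args: Array[String]): Unit = {
    val name = List("Sherley", "Joe", "Shalyan", "Dan")
    val query = "select Col1,Col2,Col3 from default.NameInfo where Col1 in " + toInList(name)
    println(query)
    // select Col1,Col2,Col3 from default.NameInfo where Col1 in ('Sherley','Joe','Shalyan','Dan')
  }
}
```

The resulting string can be passed to sqlCon.sql exactly as in the answer above. If the values come from untrusted input, filtering through the DataFrame API is the safer route, since the values never become part of the SQL text.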