2014-11-03 41 views

How can I group rows by similar column values in Scala? For example, given this data:

sunny,hot,high,FALSE,no 
sunny,hot,high,TRUE,no 
overcast,hot,high,FALSE,yes 
rainy,mild,high,FALSE,yes 
rainy,cool,normal,FALSE,yes 
overcast,cool,normal,TRUE,yes 

The result I want is:

For the 1st column:

1st group

sunny,hot,high,FALSE,no 
sunny,hot,high,TRUE,no 

2nd group

overcast,hot,high,FALSE,yes 
overcast,cool,normal,TRUE,yes 

3rd group

rainy,mild,high,FALSE,yes 
rainy,cool,normal,FALSE,yes 

For the 2nd column:

1st group

hot,high,FALSE,no 
hot,high,TRUE,no 
hot,high,FALSE,yes 

2nd group

cool,normal,FALSE,yes 
cool,normal,TRUE,yes 

3rd group

mild,high,FALSE,yes 

And likewise for every column up to the second-to-last one.

Answers

Use the Seq.groupBy method.

val data = Seq(("sunny", "hot", "high", "FALSE", "no"), 
    ("sunny", "hot", "high", "TRUE", "no"), 
    ("overcast", "hot", "high", "FALSE", "yes"), 
    ("rainy", "mild", "high", "FALSE", "yes"), 
    ("rainy", "cool", "normal", "FALSE", "yes"), 
    ("overcast", "cool", "normal", "TRUE", "yes")) 

val byFirst = data.groupBy(_._1) 

Result:

Map(
    overcast -> List((overcast,hot,high,FALSE,yes), (overcast,cool,normal,TRUE,yes)), 
    rainy -> List((rainy,mild,high,FALSE,yes), (rainy,cool,normal,FALSE,yes)), 
    sunny -> List((sunny,hot,high,FALSE,no), (sunny,hot,high,TRUE,no))) 
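
The Map returned by groupBy makes no promise about key order, whereas the question lists the groups in a particular order. As a minimal sketch (REPL or script style; the names keysInOrder and report are illustrative, not part of the answer), the groups could be printed in the order their keys first appear in the input like this:

```scala
val data = Seq(
  ("sunny", "hot", "high", "FALSE", "no"),
  ("sunny", "hot", "high", "TRUE", "no"),
  ("overcast", "hot", "high", "FALSE", "yes"),
  ("rainy", "mild", "high", "FALSE", "yes"),
  ("rainy", "cool", "normal", "FALSE", "yes"),
  ("overcast", "cool", "normal", "TRUE", "yes"))

val byFirst = data.groupBy(_._1)

// Keys in the order they first appear in the input (hypothetical helper value).
val keysInOrder = data.map(_._1).distinct

// One header line per group, followed by that group's rows as comma-separated text.
val report: Seq[String] = keysInOrder.flatMap { key =>
  s"Group $key" +: byFirst(key).map(_.productIterator.mkString(","))
}

report.foreach(println)
```

Within each group, groupBy keeps the rows in their original relative order, so only the key order needs separate handling.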

I think the OP wants to group by every column, from the second up to the last. – mohit 2014-11-03 06:42:02

Then call groupBy again with _._2, _._3, or whichever other column you need. – 2014-11-03 07:51:33
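
Repeating groupBy by hand for each tuple field works, but the question asks for the same grouping on every column up to the second-to-last, with the earlier columns dropped from each row. One possible sketch, assuming the rows are held as Vector[String] parsed from the comma-separated lines rather than as tuples (groupByColumn and allGroupings are hypothetical names introduced here for illustration):

```scala
// Rows parsed from the comma-separated lines in the question.
val rows: Seq[Vector[String]] = Seq(
  "sunny,hot,high,FALSE,no",
  "sunny,hot,high,TRUE,no",
  "overcast,hot,high,FALSE,yes",
  "rainy,mild,high,FALSE,yes",
  "rainy,cool,normal,FALSE,yes",
  "overcast,cool,normal,TRUE,yes"
).map(_.split(",").toVector)

// Hypothetical helper: group by the 0-based column i and keep only
// columns i onward in each row, as in the question's desired output.
def groupByColumn(rows: Seq[Vector[String]], i: Int): Map[String, Seq[Vector[String]]] =
  rows.groupBy(_(i)).map { case (key, group) => key -> group.map(_.drop(i)) }

// One grouping per column, excluding the last column.
val numCols = rows.head.length
val allGroupings: Seq[Map[String, Seq[Vector[String]]]] =
  (0 until numCols - 1).map(i => groupByColumn(rows, i))
```

Here allGroupings(0) corresponds to the grouping on the 1st column, and allGroupings(1)("hot") holds the three rows that start with hot in the question's 2nd-column example. Indexing by position is what makes the loop possible; tuple accessors such as _._2 cannot be selected by a runtime index.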
