2016-09-23 58 views
1

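Spark filter function with a Map

I'm a beginner trying to filter a map, and I've run into a problem. I'm trying to remove the header from a .csv file and filter out certain records, but for some reason my filter condition doesn't work.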

val dataWithHeader = sc.textFile("/user/skv/airlines.csv") 
val headerAndRows = dataWithHeader.map(x => x.split(",").map(_.trim))
val Header = headerAndRows.first  
val data = headerAndRows.filter(_(0) != Header(0)) 

val maps = data.map(x => Header.zip(x).toMap)  
// The result looks like:
// res0: Array[scala.collection.immutable.Map[String,String]] =
// Array(Map(Code -> "19031", Description -> "Mackey International Inc.: MAC"),
//  Map(Code -> "19032", Description -> "Munz Northern Airlines Inc.: XY"),
// When I try to filter the maps with the condition below, the filter does not work:

val result = maps.filter(x => x("Code") != "19031") 

airlines.csv looks like this:

Code,Description 
"19031","Mackey International Inc.: MAC" 
"19032","Munz Northern Airlines Inc.: XY" 
"19033","Cochise Airlines Inc.: COC" 
"19034","Golden Gate Airlines Inc.: GSA" 
"19035","Aeromech Inc.: RZZ" 
"19036","Golden West Airlines Co.: GLW" 
"19037","Puerto Rico Intl Airlines: PRN" 
"19038","Air America Inc.: STZ" 
"19039","Swift Aire Lines Inc.: SWT" 

Answers

3

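You seem to have one pair of double quotes too many: the quotes in the CSV are read in as part of each value, so "Code" holds "19031" with the quote characters included.

Try stripping them while splitting:

val headerAndRows = dataWithHeader.map(x => x.split(",").map(_.trim.replace("\"", "")))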
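To see why the original comparison never matches, here is a minimal sketch of the same pipeline. It assumes a plain Scala `Seq` standing in for the RDD so that it runs without a Spark cluster; the sample lines are taken from the airlines.csv excerpt in the question, and `map` and `filter` are used the same way as on the RDD.

```scala
// Assumption: a local Seq stands in for sc.textFile(...), so no Spark is needed.
val lines = Seq(
  "Code,Description",
  "\"19031\",\"Mackey International Inc.: MAC\"",
  "\"19032\",\"Munz Northern Airlines Inc.: XY\"",
  "\"19033\",\"Cochise Airlines Inc.: COC\""
)

// Without stripping, the first field keeps its quotes: 7 characters, not 5.
// Comparing it with "19031" is therefore never equal.
val rawCode = lines(1).split(",").map(_.trim).head

// Strip the quotes while splitting, as suggested in this answer.
val headerAndRows = lines.map(line => line.split(",").map(_.trim.replace("\"", "")))
val header = headerAndRows.head
val data = headerAndRows.filter(row => row(0) != header(0))
val maps = data.map(row => header.zip(row).toMap)

// The comparison now matches, so the 19031 record is dropped.
val result = maps.filter(m => m("Code") != "19031")
```

With the quotes removed, `result` contains only the records for codes 19032 and 19033.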
+0

Thanks Raphael... I used replace to remove the double quotes.

0

Look at this line:

val headerAndRows = dataWithHeader.map(x => x.split(",").map(_.trim))

Your data contains double quotes, and this line keeps them in the values. You can make the filter work in two ways:

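  1. Strip the double quotes from the data with replace (as in Raphael Roth's answer).

  2. Leave the data as it is and compare against a value that includes the double quotes, like this:

    val result = maps.filter(x => {
      x("Code") != "\"19031\""
    })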
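As a rough sketch of how the two options relate, the snippet below applies both to a single record. The `unquote` helper is hypothetical (it is not part of Spark or of the original answers) and removes only one surrounding pair of quotes, which is a slightly narrower variant of the global replace in option 1.

```scala
// Hypothetical helper: drop one pair of surrounding double quotes, if present.
def unquote(field: String): String = {
  val t = field.trim
  if (t.length >= 2 && t.startsWith("\"") && t.endsWith("\"")) t.substring(1, t.length - 1)
  else t
}

// One record as it looks when the quotes are NOT stripped during parsing.
val row = Map(
  "Code" -> "\"19031\"",
  "Description" -> "\"Mackey International Inc.: MAC\""
)

// Option 2: compare against a literal that includes the quotes.
val keptByQuotedLiteral = row("Code") != "\"19031\""   // false: the row is filtered out

// Variant of option 1: normalise the value at comparison time instead.
val keptByUnquote = unquote(row("Code")) != "19031"     // false as well
```

Either way the 19031 record is excluded. Stripping the quotes once during parsing keeps later filter conditions simpler, while comparing against the quoted literal leaves the parsed data untouched.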
+0

Thanks p2... it helped me solve it.

+0

@satish_venu Glad to help, and welcome to Stack Overflow. If this answer or any other one solved your problem, please mark it as accepted.