Spark + Python + filter issue

I would appreciate it if someone could shed some light on the problem with the code snippet below.

lineStr = sc.textFile("/input/words.txt")
print(lineStr.collect())
# ['this file is created to count the no of texts', 'other wise i am just doing fine', 'lets see the output is there']

wc = lineStr.flatMap(lambda l: l.split(" ")).map(lambda x: (x, 1)).reduceByKey(lambda w, c: w + c)
print(wc.glom().collect())
# [[('this', 1), ('there', 1), ('i', 1), ('texts', 1), ('just', 1), ('fine', 1), ('is', 2), ('other', 1), ('created', 1), ('count', 1), ('of', 1), ('am', 1), ('no', 1), ('output', 1)], [('lets', 1), ('see', 1), ('the', 2), ('file', 1), ('doing', 1), ('wise', 1), ('to', 1)]]

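When I try to filter the data above to keep only the entries whose count is greater than 1, using the code below, I get an error: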

s = wc.filter(lambda a,b:b>1) 
print (s.collect())

error : vs = list(itertools.islice(iterator, batch))

TypeError: <lambda>() missing 1 required positional argument: 'b'

2017-10-28 Suraj

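You can't unpack a tuple in a lambda's parameter list. lambda a, b: defines a function that takes two arguments, not a function that takes a single tuple as its argument. filter passes each element, here a (word, count) tuple, as one argument, so b is never supplied.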
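The failure can be reproduced without Spark, which may make the cause easier to see. The sketch below assumes only that filter hands each element to the function as a single argument, as the error in the question suggests. The sample pairs are copied from the question's output, and two_arg is a name chosen purely for illustration; Python's built-in filter stands in for the RDD method.

```python
# Sample (word, count) pairs taken from the question's output.
pairs = [('is', 2), ('other', 1), ('the', 2)]

# Same shape as the lambda in the question: two positional parameters.
two_arg = lambda a, b: b > 1

try:
    # filter() passes each tuple as ONE argument, so 'b' is never supplied.
    list(filter(two_arg, pairs))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message should match the TypeError reported in the question, since the function is called with one tuple where it expects two separate arguments.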

A simple fix is to capture the element with a single parameter and then use indexing to access the second element of the tuple:

wc.filter(lambda t: t[1] > 1).collect() 
# [('is', 2), ('the', 2)]
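If indexing with t[1] feels opaque, one alternative is a named function that unpacks the pair inside its body. This is only a sketch, assuming each element is a (word, count) tuple as in the question; count_above_one is a hypothetical helper name, and the built-in filter stands in for wc.filter so the snippet runs without a Spark context.

```python
def count_above_one(pair):
    """Return True when the count in a (word, count) pair exceeds 1."""
    word, count = pair  # unpack inside the body, not in the parameter list
    return count > 1

# Sample pairs copied from the question's output.
pairs = [('this', 1), ('is', 2), ('lets', 1), ('the', 2)]

# With an RDD this would be: wc.filter(count_above_one).collect()
kept = list(filter(count_above_one, pairs))
print(kept)
```

The named function still takes exactly one argument, so it is compatible with how filter calls it, while the unpacking keeps the meaning of each tuple field visible.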

2017-10-28 15:59:33 Psidom

Thanks for the explanation as well as the logic behind it. It works well. – Suraj

@Suraj, if you think it answers your question, please consider [accepting](http://meta.stackexchange.com/a/5235) the answer – MaxU
