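Creating a list by applying a function to a list in a pandas DataFrame

I want to create a new pandas column by running a word-stemming function over the list of words in another column. I can tokenize a single string using apply and a lambda, but I can't figure out how to extend this to a function that runs over a list of words.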

import nltk
import pandas as pd

test = {'Statement': ['congratulations on the future', 'call the mechanic', 'more text'], 'Other': [2, 3, 4]}
df = pd.DataFrame(test)
df['tokenized'] = df.apply(lambda row: nltk.word_tokenize(row['Statement']), axis=1)

I know I can solve this with a nested loop, but that seems inefficient and it triggers a SettingWithCopyWarning:

df['stems'] = ''
for x in range(len(df)):
    print(len(df['tokenized'][x]))
    df['stems'][x] = row_stems = []
    for y in range(len(df['tokenized'][x])):
        print(df['tokenized'][x][y])
        row_stems.append(stemmer.stem(df['tokenized'][x][y]))

Is there a better way to do this?

Edit:

Here is an example of what the result should look like:

   Other  Statement                      tokenized                           stems
0  2      congratulations on the future  [congratulations, on, the, future]  [congratul, on, the, futur]
1  3      call the mechanic              [call, the, mechanic]               [call, the, mechan]
2  4      more text                      [more, text]                        [more, text]


2017-02-25 jss367

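Could you edit in an example of what your result should look like?

There is indeed no need to run a loop, at least not an explicit one. A list comprehension works fine.

Assuming you use a Porter stemmer named ps: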

df['stems'] = df['tokenized'].apply(lambda words: 
            [ps.stem(word) for word in words])
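
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It makes two assumptions that are not part of the answer above: `ps` is an instance of NLTK's `PorterStemmer`, and plain whitespace splitting stands in for `nltk.word_tokenize` so that no tokenizer data has to be downloaded.

```python
import pandas as pd
from nltk.stem import PorterStemmer

# Assumption: ps is NLTK's Porter stemmer, as the answer suggests.
ps = PorterStemmer()

df = pd.DataFrame({
    'Statement': ['congratulations on the future', 'call the mechanic', 'more text'],
    'Other': [2, 3, 4],
})

# Assumption: whitespace splitting replaces nltk.word_tokenize here;
# the two agree on this punctuation-free sample but not in general.
df['tokenized'] = df['Statement'].str.split()

# One apply over the column; the list comprehension stems every word in each row's list.
df['stems'] = df['tokenized'].apply(lambda words: [ps.stem(word) for word in words])

print(df[['tokenized', 'stems']])
```

Because the new column is assigned in one step with `df['stems'] = ...`, there is no chained indexing like `df['stems'][x] = ...`, which is what triggered the SettingWithCopyWarning in the loop version.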


2017-02-25 05:27:32 DyZ
