How to return a single row for rows with duplicate values in Pandas

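I want to do this quickly rather than row by row, because it is a fairly large file. I couldn't find anything in pandas that does it directly, although pivot_table seems fairly close... This is what I have: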

A B 
0 Tree 
0 Leaves 
0 Buds 
1 Ocean 
1 Blue

And I want:

A B 
0 Tree ; Leaves ; Buds 
1 Ocean ; Blue


2014-10-02 Sara

In Python, you can join things with some_delimiter.join(things_you_want_to_join), e.g. ','.join("abc") == 'a,b,c'. We can apply that to column B after grouping by A:

>>> df.groupby("A")["B"].apply(' ; '.join)
A
0    Tree ; Leaves ; Buds
1            Ocean ; Blue
Name: B, dtype: object

and then, to get B back as a named column:

>>> df.groupby("A")["B"].apply(' ; '.join).reset_index()
   A                     B
0  0  Tree ; Leaves ; Buds
1  1          Ocean ; Blue
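For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question (the real data comes from a file, so the construction below is an assumption), and the names `joined` and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the real data would be read from a file.
df = pd.DataFrame({
    "A": [0, 0, 0, 1, 1],
    "B": ["Tree", "Leaves", "Buds", "Ocean", "Blue"],
})

# Group by A, then join the B strings of each group with " ; ".
joined = df.groupby("A")["B"].apply(" ; ".join)

# reset_index() turns the group keys back into an ordinary column A.
result = joined.reset_index()
print(result)
```

Note that str.join only accepts strings, so if B may hold numbers or missing values, something like `lambda s: " ; ".join(map(str, s))` would probably be needed in place of the bare `" ; ".join`.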


2014-10-02 15:37:48 DSM

Damn, your code is cleaner http://www.youtube.com/watch?v=XvuM3DjvYf0 +1 ;) – EdChum 2014-10-02 15:39:41

Plus your code is faster: 647us vs 1.24ms :( – EdChum 2014-10-02 15:43:05

Works great, thanks! – Sara 2014-10-03 14:23:27

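We can perform a groupby on 'A' and then apply a function (a lambda in this case) that joins the B values of each group, collected with a list comprehension, using the desired separator ' ; '.

If you want the result back as ordinary columns, you can call reset_index() (the joined column is then labeled 0 rather than B):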

In [238]:

gp = df.groupby('A')
gp.apply(lambda x: ' ; '.join([t for t in list(x['B'])])).reset_index()
Out[238]:
   A                     0
0  0  Tree ; Leaves ; Buds
1  1          Ocean ; Blue
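A runnable variant of the same idea might look like the sketch below. It differs from the snippet above in two ways, both of which are my assumptions rather than part of the original answer: column B is selected with `[["B"]]` before `apply` (some recent pandas versions warn when a frame-level `apply` also operates on the grouping column), and `reset_index(name="B")` is used so the joined column is called B instead of 0. The sample frame is rebuilt by hand from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [0, 0, 0, 1, 1],
    "B": ["Tree", "Leaves", "Buds", "Ocean", "Blue"],
})

# Each group g is a small DataFrame; join its B values into one string.
# Selecting [["B"]] keeps this a frame-level apply without the key column.
by_frame = df.groupby("A")[["B"]].apply(lambda g: " ; ".join(g["B"]))

# The result is an unnamed Series indexed by A, so name the value column here.
result = by_frame.reset_index(name="B")
print(result)
```

This should give the same two rows as the column-level `df.groupby("A")["B"].apply(' ; '.join)` version, which the comments above report as the faster of the two.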


2014-10-02 15:33:43 EdChum
