How to get distinct values when using apply.join in a Pandas DataFrame

I am joining values from a dataset into a single column of a Pandas DataFrame, but the result contains a lot of duplicates. How can I get rid of them without dropping any rows? For example:

newCol 
------ 
123,456,129,123,123 
237,438,365,432,438

Using df.newCol.drop_duplicates() removes entire rows, but the result I want is:

newCol 
------ 
123,456,129 
237,438,365,432

...

Thanks


2017-02-22 faranak777

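You need to split first, then apply set, and then join. Note that set is unordered, so the values may come out in a different order than they went in: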

df.newCol = df.newCol.apply(lambda x: ','.join(set(str(x).split(','))))
print (df)
            newCol
0      129,123,456
1  432,365,438,237
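
Because set is unordered, the output above does not match the order asked for in the question. The following is a minimal sketch of an order-preserving variant, assuming Python 3.7+ (where dict keeps insertion order). The helper name dedupe_keep_order is made up for illustration and is not part of pandas:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"newCol": ["123,456,129,123,123", "237,438,365,432,438"]})

def dedupe_keep_order(value):
    """Drop repeated comma-separated items, keeping the first occurrence of each."""
    parts = str(value).split(",")
    # dict.fromkeys keeps one key per distinct item, in first-seen order.
    return ",".join(dict.fromkeys(parts))

df["newCol"] = df["newCol"].apply(dedupe_keep_order)
print(df)
```

No rows are dropped; only the repeated items inside each cell are removed.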

Alternatively, if the values are still in separate columns, you can apply set before the join:

print (df)
     0    1    2    3    4
0  123  456  129  123  123
1  237  438  365  432  438

df = df.apply(lambda x: ','.join(set(x.astype(str))), axis=1)
print (df)
0        129,123,456
1    432,365,438,237
dtype: object

Or use unique, which keeps the values in order of first appearance:

# start again from the original multi-column df, not the joined Series
df = df.apply(lambda x: ','.join(x.astype(str).unique()), axis=1)
print (df)
0        123,456,129
1    237,438,365,432
dtype: object


2017-02-22 06:24:54 jezrael

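Thanks for your solution. I forgot to mention that the data is a mix of integers and strings, and when I use your solution I get this error: "AttributeError: 'float' object has no attribute 'split'" ... Please excuse the basic question, I am new to Python – faranak777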

Please check the edited answer. – jezrael

Now I get this error: AttributeError: 'str' object has no attribute 'astype' – faranak777
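
The two errors in these comments are consistent with (though the thread does not confirm) two common causes: a missing value (NaN, which is a float) reaching .split, and the row-wise lambda being applied to a single column, where each x is a plain string rather than a Series. The sketch below handles a column of mixed strings, integers and missing values cell by cell. The sample data and the helper name dedupe_cell are assumptions for illustration only:

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing strings, an integer and a missing value.
df = pd.DataFrame({"newCol": ["123,abc,123", np.nan, 456, "x,y,x"]})

def dedupe_cell(value):
    """Deduplicate comma-separated items in one cell; leave missing values alone."""
    if pd.isna(value):
        return value  # keep NaN as NaN instead of turning it into the text "nan"
    # str() makes integers and other scalars splittable.
    parts = [p.strip() for p in str(value).split(",")]
    return ",".join(dict.fromkeys(parts))

# Series.apply passes each cell as a scalar, so .astype is not available here.
df["newCol"] = df["newCol"].apply(dedupe_cell)
print(df)
```

Use x.astype(str) only with df.apply(..., axis=1), where x is a whole row; with df.newCol.apply(...), convert each cell with str(x) instead.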
