How to include the count of each character while removing duplicates with itertools.groupby

df = pd.DataFrame(data=all_r_1.to_dataframe().groupby(['user_id'])['type'].sum()).reset_index()

userid | type 
20  | aab 
21  | ababb

To remove consecutive duplicate characters from the strings in the type column, I have this code:

df['type'] = df['type'].apply(lambda x: ''.join(ch for ch, _ in itertools.groupby(x)))

which produces this:

userid | type 
20  | ab 
21  | abab
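
For reference, this works because itertools.groupby groups runs of consecutive equal items, so keeping only each key collapses every run to a single character. A minimal sketch in plain Python, outside pandas (the function name collapse_runs is made up for illustration):

```python
import itertools

def collapse_runs(s):
    # groupby yields one (key, group) pair per run of consecutive equal
    # characters; keeping only the key drops the repeats inside each run.
    return ''.join(ch for ch, _ in itertools.groupby(s))

print(collapse_runs('aab'))    # ab
print(collapse_runs('ababb'))  # abab
```

Only adjacent repeats are merged, which is why 'ababb' becomes 'abab' rather than 'ab'.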

This is the input df:

id | userid | type 
1 | 20  | a 
2 | 20  | a 
3 | 20  | b 
4 | 21  | a 
5 | 21  | b 
6 | 21  | a 
7 | 21  | b 
8 | 21  | b

However, what I want is to also include the count of each character while removing the duplicates:

userid | type 
20  | a2b 
21  | abab2

Any ideas how I can modify the itertools.groupby code to also include the count?

2017-03-07 renakre

Try this: 'df['type'] = df['type'].apply(lambda x: ''.join(ch + str(len(list(group))) for ch, group in itertools.groupby(x)))'

@Chris_Rands Thanks! I got this error: 'TypeError: object of type 'itertools._grouper' has no len()' – renakre

Did you use 'len(list(group))'? (I edited this into my original comment)

itertools.groupby also yields the actual group for each key, so you can access it like this:

df['type'] = df['type'].apply(lambda x: ''.join('{}{}'.format(ch,len(list(group))) for ch, group in itertools.groupby(x)))
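
Note that the line above appends a count after every character, including runs of length 1, so 'aab' would come out as 'a2b1'. If the count should only appear for runs longer than one, as in the desired output in the question, something along these lines might work. This is only a sketch, and run_length_encode is a hypothetical helper name, not part of itertools or pandas:

```python
import itertools

def run_length_encode(s):
    parts = []
    for ch, group in itertools.groupby(s):
        # The group is an iterator with no len(), so count its items directly.
        n = sum(1 for _ in group)
        # Append the count only when the character repeats.
        parts.append(ch if n == 1 else '{}{}'.format(ch, n))
    return ''.join(parts)

print(run_length_encode('aab'))    # a2b
print(run_length_encode('ababb'))  # abab2
```

It could then be applied to the column with df['type'] = df['type'].apply(run_length_encode).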

2017-03-07 09:17:53
