Pandas rolling sum on a string column
I am using Python 3 with pandas version '0.19.2'.
I have a pandas DataFrame as follows:
chat_id line
1 'Hi.'
1 'Hi, how are you?.'
1 'I'm well, thanks.'
2 'Is it going to rain?.'
2 'No, I don't think so.'
I would like to group by 'chat_id', then do something like a rolling sum on 'line' to get the following:
chat_id line conversation
1 'Hi.' 'Hi.'
1 'Hi, how are you?.' 'Hi. Hi, how are you?.'
1 'I'm well, thanks.' 'Hi. Hi, how are you?. I'm well, thanks.'
2 'Is it going to rain?.' 'Is it going to rain?.'
2 'No, I don't think so.' 'Is it going to rain?. No, I don't think so.'
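I believe df.groupby('chat_id')['line'].cumsum() only works on numeric columns.
I also tried df.groupby(by=['chat_id'], as_index=False)['line'].apply(list) to get a list of all the lines in the complete conversation, but then I could not figure out how to unpack that list to create the 'rolling sum' style conversation column.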
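For illustration, here is a minimal sketch of the result I am after. It sidesteps cumsum entirely and accumulates the strings per group with itertools.accumulate. The helper name running_join and the single-space separator are assumptions made for this example, not part of pandas.

```python
from itertools import accumulate

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "chat_id": [1, 1, 1, 2, 2],
    "line": [
        "Hi.",
        "Hi, how are you?.",
        "I'm well, thanks.",
        "Is it going to rain?.",
        "No, I don't think so.",
    ],
})


def running_join(lines, sep=" "):
    """Return the cumulative concatenation of `lines` (hypothetical helper)."""
    return list(accumulate(lines, lambda so_far, nxt: so_far + sep + nxt))


# Build the running conversation one chat at a time, keeping each group's
# original index so the result aligns with df on assignment.
parts = [
    pd.Series(running_join(group["line"]), index=group.index)
    for _, group in df.groupby("chat_id")
]
df["conversation"] = pd.concat(parts)
```

Because each piece keeps its group's index, the assignment lines up with the original rows even if the chats are interleaved in the frame.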
Interesting. It works if you call 'cumsum' on a Series, but it raises an error when called on a groupby object. – ayhan