pandas dataframe aggregate - why does it return the column names?

Here is a simple dataframe:
   Acid  Balance_1  CustID  Balance_2
0     1   0.082627       1        NaN
1     2   0.397579       1   0.459942
2     3   0.201596       2   0.596573
3     4   0.616448       3   0.705697
4     5   0.844865       3   0.483279
5     6        NaN       4   0.360260
I have been experimenting with the aggregate function after grouping by customer ID.
groupby_obj = time_series.groupby(["CustID"])
df = groupby_obj.agg(set)
This returns:
                                             Acid \
CustID
1       set([Balance_1, Balance_2, Acid, CustID])
2       set([Balance_1, Balance_2, Acid, CustID])
3       set([Balance_1, Balance_2, Acid, CustID])
4       set([Balance_1, Balance_2, Acid, CustID])
                                        Balance_1 \
CustID
1       set([Balance_1, Balance_2, Acid, CustID])
2       set([Balance_1, Balance_2, Acid, CustID])
3       set([Balance_1, Balance_2, Acid, CustID])
4       set([Balance_1, Balance_2, Acid, CustID])
                                        Balance_2
CustID
1       set([Balance_1, Balance_2, Acid, CustID])
2       set([Balance_1, Balance_2, Acid, CustID])
3       set([Balance_1, Balance_2, Acid, CustID])
4       set([Balance_1, Balance_2, Acid, CustID])
instead of what I thought it might do:
        Acid         Balance_1                    Balance_2
CustID
1       set([1,2])   set([0.082627, 0.397579])    set([NaN, 0.459942])
etc for the other CustIDs...
Why does it fill the whole dataframe with sets of all the column headers?
Thanks, Anne
Thanks Jeff! I'm not really trying to accomplish anything in particular, just working on improving my understanding of how pandas works... – Anne
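
For anyone trying to reproduce this, here is a minimal sketch, not a definitive answer. It rebuilds the frame from the question and shows two things: iterating over a DataFrame yields its column labels (so set() applied to a whole frame collects the headers, which is consistent with the output above), and applying set to one column at a time gives sets of values. The names headers, value_columns and per_customer are made up for illustration, and what agg(set) itself returns may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
time_series = pd.DataFrame({
    "Acid": [1, 2, 3, 4, 5, 6],
    "Balance_1": [0.082627, 0.397579, 0.201596, 0.616448, 0.844865, np.nan],
    "CustID": [1, 1, 2, 3, 3, 4],
    "Balance_2": [np.nan, 0.459942, 0.596573, 0.705697, 0.483279, 0.360260],
})

# Iterating over a DataFrame yields column labels, not rows or values,
# so set() on a whole frame collects the headers.
headers = set(time_series)

# Selecting one column from the groupby hands set() a Series of values.
groupby_obj = time_series.groupby("CustID")
value_columns = ["Acid", "Balance_1", "Balance_2"]  # hypothetical helper list
per_customer = pd.DataFrame(
    {col: groupby_obj[col].apply(set) for col in value_columns}
)

print(headers)
print(per_customer)
```

Selecting each column explicitly avoids relying on how agg dispatches a plain Python callable, at the cost of a little extra typing.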