Suppose you start with this data:
df = pd.DataFrame({'ID': ('STRSUB BOTDWG'.split())*4,
'Days Late': [60, 60, 50, 50, 20, 20, 10, 10],
'quantity': [56, 20, 60, 67, 74, 87, 40, 34]})
#        ID  Days Late  quantity
# 0  STRSUB         60        56
# 1  BOTDWG         60        20
# 2  STRSUB         50        60
# 3  BOTDWG         50        67
# 4  STRSUB         20        74
# 5  BOTDWG         20        87
# 6  STRSUB         10        40
# 7  BOTDWG         10        34
Then you can use pd.cut to find the status categories. Note that by default, pd.cut splits the Series df['Days Late'] into categories that are half-open intervals, (-1, 14], (14, 35], (35, 56], (56, 365]:
df['status'] = pd.cut(df['Days Late'], bins=[-1, 14, 35, 56, 365], labels=False)
labels = np.array('White Yellow Amber Red'.split())
df['status'] = labels[df['status']]
del df['Days Late']
print(df)
# ID quantity status
# 0 STRSUB 56 Red
# 1 BOTDWG 20 Red
# 2 STRSUB 60 Amber
# 3 BOTDWG 67 Amber
# 4 STRSUB 74 Yellow
# 5 BOTDWG 87 Yellow
# 6 STRSUB 40 White
# 7 BOTDWG 34 White
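As a side note on the binning step above, pd.cut also accepts the label names directly through its labels parameter, which avoids the separate NumPy lookup. The following is only a minimal sketch; the days_late values are made up here to probe the bin edges and are not part of the original data:

```python
import pandas as pd

# Hypothetical values chosen to sit on and just past each bin edge.
days_late = pd.Series([0, 14, 15, 35, 36, 56, 57, 365])

# Passing the label names directly replaces the labels=False + NumPy indexing step.
status = pd.cut(days_late,
                bins=[-1, 14, 35, 56, 365],
                labels=['White', 'Yellow', 'Amber', 'Red'])

# Each bin is closed on the right, so 14 is 'White' and 15 is 'Yellow'.
result = status.astype(str).tolist()
print(result)
```

With this variant the column is a categorical rather than plain strings, which may or may not be what you want downstream.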
Now use pivot to get the DataFrame into the desired form:
df = df.pivot(index='ID', columns='status', values='quantity')
and use reindex to obtain the desired row and column order:
df = df.reindex(columns=labels[::-1], index=df.index[::-1])
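One thing worth knowing about the reindex step: if no row happens to fall into one of the status categories, pivot will not produce that column, and reindex will create it filled with NaN. A small sketch of that behavior, using a made-up wide frame (the name wide and its values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical pivoted frame in which only 'Red' and 'White' occurred.
wide = pd.DataFrame({'Red': [20, 56], 'White': [34, 40]},
                    index=pd.Index(['BOTDWG', 'STRSUB'], name='ID'))

order = ['Red', 'Amber', 'Yellow', 'White']

# 'Amber' and 'Yellow' are absent, so reindex adds them as all-NaN columns.
filled_nan = wide.reindex(columns=order)

# fill_value substitutes a default (here 0) for the missing entries instead.
filled_zero = wide.reindex(columns=order, fill_value=0)
print(filled_zero)
```

Whether NaN or 0 is the right filler depends on whether "no rows in this category" should read as missing or as zero quantity.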
Thus,
import numpy as np
import pandas as pd
df = pd.DataFrame({'ID': ('STRSUB BOTDWG'.split())*4,
'Days Late': [60, 60, 50, 50, 20, 20, 10, 10],
'quantity': [56, 20, 60, 67, 74, 87, 40, 34]})
df['status'] = pd.cut(df['Days Late'], bins=[-1, 14, 35, 56, 365], labels=False)
labels = np.array('White Yellow Amber Red'.split())
df['status'] = labels[df['status']]
del df['Days Late']
df = df.pivot(index='ID', columns='status', values='quantity')
df = df.reindex(columns=labels[::-1], index=df.index[::-1])
print(df)
yields
        Red  Amber  Yellow  White
ID
STRSUB   56     60      74     40
BOTDWG   20     67      87     34
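A caveat about the pivot step: it works here because each (ID, status) pair occurs exactly once. If a pair can repeat, pivot raises a ValueError, and pivot_table with an aggregation function is one possible alternative. This is a sketch on made-up data (the frame dup is hypothetical), assuming that summing the quantities is the aggregation you want:

```python
import pandas as pd

# Hypothetical data where ('STRSUB', 'Red') appears twice.
dup = pd.DataFrame({'ID': ['STRSUB', 'STRSUB', 'BOTDWG'],
                    'status': ['Red', 'Red', 'Red'],
                    'quantity': [56, 4, 20]})

# pivot cannot reshape when an index/column pair is duplicated.
try:
    dup.pivot(index='ID', columns='status', values='quantity')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (here by summing them).
totals = dup.pivot_table(index='ID', columns='status',
                         values='quantity', aggfunc='sum')
print(totals)
```

The same reindex call shown earlier can then be applied to totals to fix the row and column order.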
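Thank you very much for your help; I think this will help me accomplish a lot with pandas in my day-to-day work. Thanks also to mtadd, I noticed you updated your answer as well (it works nicely too). – PrestonDocks 2013-05-03 18:57:50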