2016-03-04 15 views
1

Suppose I am trying to map total_cases_age onto cases_by_age, where the dataframes are:

results_grouped_age = results_grouped[['Make', 'age', 'Test Result', 'Number of Cases']].copy()
cases_by_age = results_grouped_age[['Make', 'age', 'Test Result', 'Number of Cases']].groupby(['Make', 'age', 'Test Result']).sum().reset_index()
total_cases_age = cases_by_age.groupby(['Make', 'age'])['Number of Cases'].sum()

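However, while I would usually do:

cases_by_age['Total Cases'] = cases_by_age['age'].map(total_cases_age)

the index of total_cases_age is actually the combination of 'Make' and 'age', which is in fact the grouping I want. To make my problem easier to understand, suppose I have the table cases_by_age: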

         Make  age  Test Result  Number of Cases
0  ALFA ROMEO  0-3          ABA                1
1  ALFA ROMEO  0-3          ABR              NaN
2  ALFA ROMEO  0-3            F               45
3  ALFA ROMEO  0-3            P              268
4  ALFA ROMEO  0-3          PRS               21
5  ALFA ROMEO  3-5          ABA              NaN
6  ALFA ROMEO  3-5          ABR              NaN
7  ALFA ROMEO  3-5            F              159
8  ALFA ROMEO  3-5            P              720

The end result should look like this, with one total for each make and age:

         Make  age  Test Result  Number of Cases  Total Cases by Age
0  ALFA ROMEO  0-3          ABA                1                 335
1  ALFA ROMEO  0-3          ABR              NaN                 335
2  ALFA ROMEO  0-3            F               45                 335
3  ALFA ROMEO  0-3            P              268                 335
4  ALFA ROMEO  0-3          PRS               21                 335
5  ALFA ROMEO  3-5          ABA              NaN                 879
6  ALFA ROMEO  3-5          ABR              NaN                 879
7  ALFA ROMEO  3-5            F              159                 879
8  ALFA ROMEO  3-5            P              720                 879

And so on. Any help would be greatly appreciated.

Answers

1

You can do a groupby-sum, followed by a left merge:

pd.merge(
    df,
    df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
        columns={'Number of Cases': 'Total Cases by Age'}),
    how='left')

Suppose you start with:

import pandas as pd

df = pd.DataFrame({
    'Make': ['ALPHA ROMEO'] * 3,
    'age': ['0-3', '0-3', '3-5'],
    'Number of Cases': [1, 10, 2],
})

>>> df
          Make  Number of Cases  age
0  ALPHA ROMEO                1  0-3
1  ALPHA ROMEO               10  0-3
2  ALPHA ROMEO                2  3-5

Then the groupby-sum gives:

>>> df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...     columns={'Number of Cases': 'Total Cases by Age'})
   age  Total Cases by Age
0  0-3                  11
1  3-5                   2

And the merge gives:

>>> pd.merge(
...     df,
...     df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...         columns={'Number of Cases': 'Total Cases by Age'}),
...     how='left')
          Make  Number of Cases  age  Total Cases by Age
0  ALPHA ROMEO                1  0-3                  11
1  ALPHA ROMEO               10  0-3                  11
2  ALPHA ROMEO                2  3-5                   2
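
As a side note, not part of the original answer: the same per-row total can probably be obtained without a merge by using groupby(...).transform('sum'), which returns one value per original row. A minimal sketch, assuming the same three-row example frame as above:

```python
import pandas as pd

# Same example frame as in the answer above.
df = pd.DataFrame({
    'Make': ['ALPHA ROMEO'] * 3,
    'age': ['0-3', '0-3', '3-5'],
    'Number of Cases': [1, 10, 2],
})

# transform('sum') computes the sum per 'age' group and broadcasts it
# back onto every row of that group, aligned with df's original index.
df['Total Cases by Age'] = df.groupby('age')['Number of Cases'].transform('sum')
```

This avoids building and renaming an intermediate frame, at the cost of not having the per-group totals available as a separate table.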
+0

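Thank you for your answer, but the idea here is to total by make and age. So in your example the groupby-sum would have to find the sum over all vehicles with a specific make and age, and map that value next to every row with that make and age (regardless of the test result).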

+0

Actually, never mind. I tried grouping by both and it turned out to work fine. Thanks!
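
For readers who want the two-key variant spelled out, here is a minimal sketch of what grouping by both 'Make' and 'age' might look like. It uses the sample rows from the question; the variable names totals, merged and joined are illustrative and do not come from the thread. It shows the answer's merge approach with both keys, and a join that reuses the question's total_cases_age Series directly.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's cases_by_age table.
cases_by_age = pd.DataFrame({
    'Make': ['ALFA ROMEO'] * 9,
    'age': ['0-3'] * 5 + ['3-5'] * 4,
    'Test Result': ['ABA', 'ABR', 'F', 'P', 'PRS', 'ABA', 'ABR', 'F', 'P'],
    'Number of Cases': [1, np.nan, 45, 268, 21, np.nan, np.nan, 159, 720],
})

# Option 1: groupby-sum on both keys, then a left merge on both keys.
totals = (
    cases_by_age.groupby(['Make', 'age'], as_index=False)['Number of Cases']
    .sum()
    .rename(columns={'Number of Cases': 'Total Cases by Age'})
)
merged = cases_by_age.merge(totals, on=['Make', 'age'], how='left')

# Option 2: keep the question's total_cases_age (a Series indexed by
# Make and age) and join it onto the matching pair of columns.
total_cases_age = cases_by_age.groupby(['Make', 'age'])['Number of Cases'].sum()
joined = cases_by_age.join(
    total_cases_age.rename('Total Cases by Age'), on=['Make', 'age'])
```

Both options skip the NaN entries when summing, so every row in a (Make, age) group receives the same total whatever its test result.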