2017-05-29 33 views
1

How to divide one group's value by the rest with dplyr

I have a data frame like the one below.

fund_name  Industry  quantity  month 
ABC    IT   20   201704 
ABC    IT   20   201704 
ABC    Industrials 30   201704 
ABC    Auto   40   201704 
ABC    Pharma  50   201704 
DEF    IT   20   201704 
DEF    Auto   35   201704 
DEF    Auto   35   201704 
DEF    Pharma  40   201704 

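I want to calculate each industry's share of the quantity within a fund. For example, for fund ABC, the IT industry contributes 40 / (40 + 30 + 40 + 50) = 0.25, i.e. 25% for month 201704.

The desired data frame should look like this.

fund_name  Industry  quantity     month 
ABC    IT   40/(40+30+40+50)   201704 
ABC    Industrials 30/(40+30+40+50)   201704 
ABC    Auto   40/(40+30+40+50)   201704 
ABC    Pharma  50/(40+30+40+50)   201704 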
DEF    IT   20/(20+70+40)    201704 
DEF    Auto   70/(20+70+40)    201704 
DEF    Pharma  40/(20+70+40)    201704 

I tried the code below, but it only gives the sum of the quantities.

final_MF %>% 
    group_by(fund_names,Month,Industry) %>% 
    summarise(total_quant = sum(Quantity)) %>% 
    as.data.frame() 

How can I do this in dplyr?

+1

Do you mean group by 'fund_name'? –

+1

group by fund_name,Industry and month – Neil

+0

If you group by fund_name, Industry and month as in the example you showed, it just gives 1. I don't follow the denominator – akrun

Answers

1

One of several ways:

df <- read.table(header=TRUE, text="fund_name  Industry  quantity  month 
ABC    IT   20   201704 
ABC    Industrials 30   201704 
ABC    Auto   40   201704 
ABC    Pharma  50   201704 
DEF    IT   20   201704 
DEF    Auto   35   201704 
DEF    Pharma  40   201704") 
df 

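library(dplyr) 
# Per-fund totals are joined back onto df, each quantity is divided by 
# its fund's total, and the helper column quantity_sum is dropped. 
want<-select(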
    mutate(
    left_join(df, 
      df %>% 
        group_by(fund_name) %>% 
        summarize(quantity_sum=sum(quantity)), 
       by="fund_name"), 
    quantity=quantity/quantity_sum), 
    -quantity_sum) 
want 
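
The same result can probably be reached without the join, because `mutate()` on a grouped data frame evaluates `sum()` within each group. A minimal sketch, rebuilding the same `df` so that it runs on its own; `want2` is a hypothetical name, not part of the original answer:

```r
library(dplyr)

# Same sample data as in the answer above, rebuilt so this block is self-contained
df <- read.table(header = TRUE, text = "fund_name Industry quantity month
ABC IT 20 201704
ABC Industrials 30 201704
ABC Auto 40 201704
ABC Pharma 50 201704
DEF IT 20 201704
DEF Auto 35 201704
DEF Pharma 40 201704")

# want2 is a hypothetical name used only for this sketch
want2 <- df %>%
  group_by(fund_name) %>%                          # sum() below is computed per fund
  mutate(quantity = quantity / sum(quantity)) %>%  # each row's share of its fund total
  ungroup() %>%
  as.data.frame()

want2
```

Like the join version, this assumes one row per fund and industry. The question's data has repeated rows (ABC/IT and DEF/Auto appear twice), so those would need to be summed first.
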
0

The following R code gives me what I was looking for:

industry_composition <- final_reliance_MF %>% 
    group_by(fund_names,Industry,Month) %>% 
    summarise(total_quant = sum(Quantity)) %>% 
    group_by(fund_names,Month) %>% 
    mutate(perc = (total_quant/sum(total_quant))*100) %>% 
    as.data.frame()
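
To check this approach against the sample data from the question, here is a self-contained sketch. It assumes the question's lower-case column names (`fund_name`, `quantity`, `month`) rather than those of the real `final_reliance_MF` data, and `sample_df` and `composition` are hypothetical names:

```r
library(dplyr)

# Sample data from the question, including the repeated rows
sample_df <- read.table(header = TRUE, text = "fund_name Industry quantity month
ABC IT 20 201704
ABC IT 20 201704
ABC Industrials 30 201704
ABC Auto 40 201704
ABC Pharma 50 201704
DEF IT 20 201704
DEF Auto 35 201704
DEF Auto 35 201704
DEF Pharma 40 201704")

composition <- sample_df %>%
  group_by(fund_name, month, Industry) %>%
  summarise(total_quant = sum(quantity)) %>%           # collapse repeated rows first
  group_by(fund_name, month) %>%                       # regroup without Industry
  mutate(perc = total_quant / sum(total_quant) * 100) %>%  # share of the fund-month total
  ungroup() %>%
  as.data.frame()

composition
```

The second `group_by()` is what makes the denominator the fund's total for the month rather than the industry's own sum. With this data the ABC totals are 40, 30, 40 and 50, so IT comes out at 40/160 = 25%, and the percentages within each fund add up to 100.
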