2017-02-02 33 views
# Generate some data
set.seed(1234)
rows <- 100
created_data <- data.frame(index = 1:rows,
                           catsA = sample(letters[1:5], rows, replace = TRUE),
                           valueA = round(rnorm(rows), 3))

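Collapse rows of a dplyr tibble based on cumulative frequency

Using dplyr, I count the rows in each category and sort the categories by that count: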

library(dplyr) 

count_of_cat <- created_data %>%
    group_by(catsA) %>%
    summarise(rowcount = n()) %>%      # rows per category
    ungroup() %>%
    arrange(desc(rowcount)) %>%        # most frequent category first
    mutate(rel.freq = round(rowcount / sum(rowcount), 3)) %>%
    mutate(cum.freq = cumsum(rel.freq))  # running total of rel.freq

Output

  catsA rowcount rel.freq cum.freq
1     b       26     0.26     0.26
2     a       25     0.25     0.51
3     c       17     0.17     0.68
4     d       17     0.17     0.85
5     e       15     0.15     1.00

Is there a good way to aggregate all the rows that come after the point where, say, cum.freq > 0.50?

Desired output

  catsA rowcount rel.freq cum.freq
1     b       26     0.26     0.26
2     a       25     0.25     0.51
3   new       49     0.49     1.00

Answer
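One possible approach, offered as a minimal sketch rather than a definitive solution: flag every row whose preceding cumulative frequency already exceeds the cutoff, keep the unflagged rows as they are, and sum the flagged rows into a single "new" row. The names `threshold`, `in_tail`, `head_rows`, `tail_row` and `collapsed` are introduced here for illustration only. The frequency table is hard-coded from the output shown in the question, so the sketch does not depend on which R version's random number generator produced the sample.

```r
library(dplyr)

# Frequency table copied from the question's output (assumption: hard-coded
# here instead of being recomputed from created_data).
count_of_cat <- data.frame(
  catsA    = c("b", "a", "c", "d", "e"),
  rowcount = c(26L, 25L, 17L, 17L, 15L),
  stringsAsFactors = FALSE
) %>%
  mutate(rel.freq = round(rowcount / sum(rowcount), 3),
         cum.freq = cumsum(rel.freq))

threshold <- 0.50  # cutoff taken from the question

# lag() gives the cumulative frequency reached *before* the current row.
# If that already exceeds the threshold, the row belongs to the tail.
flagged <- count_of_cat %>%
  mutate(in_tail = lag(cum.freq, default = 0) > threshold)

# Rows up to and including the one that crosses the threshold stay as they are.
head_rows <- flagged %>%
  filter(!in_tail) %>%
  select(-in_tail)

# All remaining rows are summed into one row labelled "new".
tail_row <- flagged %>%
  filter(in_tail) %>%
  summarise(catsA = "new",
            rowcount = sum(rowcount),
            rel.freq = sum(rel.freq))

# Stack the two parts and recompute the cumulative frequency.
collapsed <- bind_rows(head_rows, tail_row) %>%
  mutate(cum.freq = cumsum(rel.freq))

print(collapsed)
```

For the table above this keeps b and a and collapses c, d and e into a "new" row with rowcount 49. Splitting into two parts and using bind_rows() preserves the original ordering, which a plain group_by() on a relabelled catsA would not, since grouping sorts the keys. Note that if no row falls past the threshold, summarise() still returns a "new" row with zero counts, so that case may need a guard.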