2016-06-21 13 views
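R - Tabulate character vectors - custom output

I want to tabulate the values of some small character vectors and append the tabulation results to a string. For the reproducible example below, my desired output would look like this: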

  states              responsible
1     KS          Joe(2);Suzie(3)
2     MO                   Bob(4)
3     CO Suzie(1);Bob(2);Ralph(3)
4     NE                   Joe(1)
5     MT        Suzie(3);Ralph(1)

Here is the sample data:

states <- c("KS", "MO", "CO", "NE", "MT") 
responsible <- list(c("Joe", "Joe", "Suzie", "Suzie", "Suzie"), c("Bob", "Bob", "Bob", "Bob"), c("Suzie", "Bob", "Ralph", "Ralph", "Bob", "Ralph"), "Joe", c("Suzie", "Ralph", "Suzie", "Suzie")) 

df <- as.data.frame(cbind(states, responsible)) 

#Tabulating using table() 
resp.tab <- lapply(responsible, table) 

#Is there a way I can do tabulation without converting to factors? 
# OR 
#Is there a way to access the factor label and value, then paste them together? 
'table' also works with 'character' vectors. – akrun

Answer

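We can use data.table. Create a data.table by replicating 'states' by the 'lengths' of 'responsible' and unlisting 'responsible', which gives one row per (state, person) occurrence.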

library(data.table) 
dt1 <- data.table(states= rep(states, lengths(responsible)), 
       responsible=unlist(responsible)) 

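Grouped by 'states' and 'responsible', we get the frequency (.N). Then, grouped by 'states', we paste the 'responsible' and 'N' columns together and collapse the rows that belong to the same 'states' into a single string.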

dt1[, .N, .(states, responsible) 
    ][, .(responsible = paste(paste0(responsible, 
        "(", N, ")"), collapse=";")) ,.(states)] 
# states    responsible 
#1:  KS   Joe(2);Suzie(3) 
#2:  MO     Bob(4) 
#3:  CO Suzie(1);Bob(2);Ralph(3) 
#4:  NE     Joe(1) 
#5:  MT  Suzie(3);Ralph(1) 
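
The chained call above does two things at once, so it may help to look at the steps separately. The following is a minimal sketch that repeats the sample data from the question and stores the intermediate result; the names `counts` and `result` are introduced here only for illustration.

```r
library(data.table)

# Sample data from the question
states <- c("KS", "MO", "CO", "NE", "MT")
responsible <- list(c("Joe", "Joe", "Suzie", "Suzie", "Suzie"),
                    c("Bob", "Bob", "Bob", "Bob"),
                    c("Suzie", "Bob", "Ralph", "Ralph", "Bob", "Ralph"),
                    "Joe",
                    c("Suzie", "Ralph", "Suzie", "Suzie"))

# Long format: one row per (state, person) occurrence
dt1 <- data.table(states = rep(states, lengths(responsible)),
                  responsible = unlist(responsible))

# Step 1: count rows per (states, responsible) pair; .N is the group size
counts <- dt1[, .N, by = .(states, responsible)]
print(counts)

# Step 2: build "name(count)" labels and collapse them within each state
result <- counts[, .(responsible = paste0(responsible, "(", N, ")",
                                          collapse = ";")),
                 by = states]
print(result)
```

Grouping with `by` keeps groups in order of first appearance, which is why the states and names come out in the same order as in the input.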

Or a similar option with dplyr/tidyr:

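library(dplyr) 
library(tidyr) 
# as_tibble() replaces tbl_df(), which is deprecated in recent dplyr versions
as_tibble(dt1) %>% 
    # count occurrences of each (states, responsible) pair; the count column is 'n'
    group_by(states, responsible) %>% 
    tally() %>% 
    # join name and count as "name(count"; the closing ")" is added below
    unite(responsible, responsible, n, sep="(") %>% 
    group_by(states) %>% 
    summarise(responsible = paste(paste0(responsible, ")"), collapse=";"))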
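
As the comment under the question notes, 'table' accepts character vectors directly, so a base R version is also possible. The sketch below is one way to do it, not part of the original answer; `format_counts` is a hypothetical helper name, and indexing the table by `unique(x)` is an assumption made here to keep names in order of first appearance (a plain `table(x)` sorts them alphabetically).

```r
# Sample data from the question
states <- c("KS", "MO", "CO", "NE", "MT")
responsible <- list(c("Joe", "Joe", "Suzie", "Suzie", "Suzie"),
                    c("Bob", "Bob", "Bob", "Bob"),
                    c("Suzie", "Bob", "Ralph", "Ralph", "Bob", "Ralph"),
                    "Joe",
                    c("Suzie", "Ralph", "Suzie", "Suzie"))

# Hypothetical helper: tabulate one character vector and
# format the counts as "name(count)" pairs joined by ";"
format_counts <- function(x) {
  # Reorder the counts by first appearance instead of alphabetically
  tab <- table(x)[unique(x)]
  # names(tab) holds the labels, as.vector(tab) the counts
  paste0(names(tab), "(", as.vector(tab), ")", collapse = ";")
}

result <- data.frame(
  states = states,
  responsible = vapply(responsible, format_counts, character(1)),
  stringsAsFactors = FALSE
)
print(result)
```

Here `names()` and `as.vector()` give access to the labels and counts of the table, which is what the second question in the original post asks for.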