2017-09-19

Generating a frequency table for unordered data

Suppose a data frame as shown below:

A<-c("John","John","James","Brad") 
B<-c("Deb","Deb","Henry","Suzie") 
C<-c("Barry","Beth","Deb","Louise") 
D<-c("Ben","Dory","John","Simon") 
df<-data.frame(A,B,C,D) 
df 
      A     B      C     D
1  John   Deb  Barry   Ben
2  John   Deb   Beth  Dory
3 James Henry    Deb  John
4  Brad Suzie Louise Simon

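How would one generate a frequency table giving the total number of times each combination of values from columns A and B is found together in the same row? The two values may sit in any columns of that row: John and Deb, for example, share rows 1, 2 and 3 (in row 3 they appear in columns D and C). The output would look as follows.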

      A     B n
1  Brad Suzie 1
2 James Henry 1
3  John   Deb 3

I know how to build a simple frequency table with dplyr, but I can't get it to work in this case.

Do you mean `library(dplyr); count(rbind(df[1:2], intersect(df[1:2], setNames(df[4:3], names(df)[1:2]))), A, B)`? Related: [this](https://stackoverflow.com/questions/9809166/count-number-of-rows-within-each-group) – akrun

Shouldn't the John Deb combination be 2 rather than 3, since you only need 2 columns?

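I want to summarize the whole table based on the information in the first two columns. So the John and Deb combination is present in three rows. – Morts81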

Answers

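df <- data.frame(A = c("John","John","James","Brad"),
                 B = c("Deb","Deb","Henry","Suzie"),
                 C = c("Barry","Beth","Deb","Louise"),
                 D = c("Ben","Dory","John","Simon"), stringsAsFactors = FALSE)

# Collapse each row into one comma-separated string so it can be searched
df$seq <- paste(df$A, df$B, df$C, df$D, sep = ",")

# Every unordered pair of names that occur in columns A or B
# (called `people` to avoid masking base::names)
people <- unique(c(df$A, df$B))
pairs <- combn(people, 2)
finaldf <- data.frame(name1 = NULL, name2 = NULL, count = NULL)

for (i in seq_len(ncol(pairs))) {
    name1 <- pairs[1, i]
    name2 <- pairs[2, i]
    # Count the rows whose string contains both names.
    # fixed = TRUE matches literally instead of as a regex; it is still a
    # substring match, so "John" would also match "Johnson".
    count <- sum(grepl(name1, df$seq, fixed = TRUE) & grepl(name2, df$seq, fixed = TRUE))

    finaldf <- rbind(finaldf, data.frame(name1 = name1, name2 = name2, count = count))
}

finaldf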

> finaldf
   name1 name2 count
1   John James     1
2   John  Brad     0
3   John   Deb     3
4   John Henry     1
5   John Suzie     0
6  James  Brad     0
7  James   Deb     1
8  James Henry     1
9  James Suzie     0
10  Brad   Deb     0
11  Brad Henry     0
12  Brad Suzie     1
13   Deb Henry     1
14   Deb Suzie     0
15 Henry Suzie     0
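
The table above lists every possible pair of names, including pairs that never occur in columns A and B. If only the pairs actually present in A and B are wanted, as in the output requested in the question, one possible base R sketch is shown below. It is not taken from the original answer: the helper name `count_rows_with_both` and the result name `ab` are made up for illustration, and it assumes that "found in the same row" means both names appear in any columns of that row, matched as whole values rather than substrings.

```r
df <- data.frame(A = c("John", "John", "James", "Brad"),
                 B = c("Deb", "Deb", "Henry", "Suzie"),
                 C = c("Barry", "Beth", "Deb", "Louise"),
                 D = c("Ben", "Dory", "John", "Simon"),
                 stringsAsFactors = FALSE)

# Distinct (A, B) pairs that actually occur, sorted by A then B
ab <- unique(df[c("A", "B")])
ab <- ab[order(ab$A, ab$B), ]
rownames(ab) <- NULL

# Hypothetical helper: number of rows of df in which both names
# appear, in any column (whole-value match via %in%)
count_rows_with_both <- function(a, b) {
  sum(apply(df, 1, function(row) a %in% row && b %in% row))
}

ab$n <- mapply(count_rows_with_both, ab$A, ab$B, USE.NAMES = FALSE)
ab
```

On the example data this should give one row per pair (Brad/Suzie, James/Henry, John/Deb) with John/Deb counted three times, because row 3 contains both names in columns C and D.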