2017-08-29

Building a contingency table

I have a table like this:

df <- data.frame(P1 = c(1,0,0,0,0,0,"A"), 
        P2 = c(0,-2,1,2,1,0,"A"), 
        P3 = c(-1,2,0,2,1,0,"B"), 
        P4 = c(2,0,-1,0,-1,0,"B"), 
        Names = c("G1","G2","G3","G1","G2","G3","Group"), 
        stringsAsFactors = FALSE) 

which looks like this:

Names P1 P2 P3 P4 
G1  1 0  -1 2 
G2  0 -2 2 0 
G3  0 1  0 -1 
G1  0 2  2 0 
G2  0 1  1 -1 
G3  0 0  0 0 
Group A A  B B 

Here, A and B are the grouping variables for P1, P2, P3, P4.

I want to build a contingency table of Ids (G1, G2, ...), Group (A, B) and Var (-2, -1, 0, 1, 2), for example:

Id Group Var Count 
G1 A  -2  0 
G1 A  -1  0 
G1 A  0  1 
G1 A  1  1 
G1 A  2  0 
G1 B  -2  0 
G1 B  -1  1 
G1 B  0  0 
G1 B  1  0 
G1 B  2  1 
G2 A  -2  1 
G2 A  -1  0 
G2 A  0  1 
... 

Is there a way to do this in R without using lots of loops?

+3

How to make a great R reproducible example? (https://stackoverflow.com/questions/5963269) – Sotos

+0

Thanks @Sotos, I added the df – Sosi

+2

I think your output is inconsistent with your 'df': shouldn't 'Group' be a variable? It appears as a row... – mdag02

Answers

1

Assuming you want to group columns P1 & P2 as A and columns P3 & P4 as B, you can solve this with the data.table package as follows:

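library(data.table)

# df is the data frame listed under "Data used" below (columns Id, P1..P4).
# Inner melt: stack P1/P2 into column "A" and P3/P4 into column "B".
DT <- melt(melt(setDT(df),
       measure.vars = list(c(2,3),c(4,5)),
       value.name = c("A","B")),
      # Outer melt: stack "A" and "B" into one 'value' column labelled by 'group'.
      id.vars = 1, measure.vars = 3:4, variable.name = 'group'
      )[order(Id,group)][, val2 := value]

# Cross join every (Id, group, value) combination onto DT, then count matches;
# val2 is NA for combinations that never occur, so those get a count of 0.
DT[CJ(Id = Id, group = group, value = value, unique = TRUE)
    , on = .(Id, group, value)
    ][, .(counts = sum(!is.na(val2))), by = .(Id, group, value)]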

which gives:

Id group value counts 
1: G1  A -2  0 
2: G1  A -1  0 
3: G1  A  0  2 
4: G1  A  1  1 
5: G1  A  2  1 
6: G1  B -2  0 
7: G1  B -1  1 
8: G1  B  0  1 
9: G1  B  1  0 
10: G1  B  2  2 
11: G2  A -2  1 
12: G2  A -1  0 
13: G2  A  0  2 
14: G2  A  1  1 
15: G2  A  2  0 
16: G2  B -2  0 
17: G2  B -1  1 
18: G2  B  0  1 
19: G2  B  1  1 
20: G2  B  2  1 
21: G3  A -2  0 
22: G3  A -1  0 
23: G3  A  0  3 
24: G3  A  1  1 
25: G3  A  2  0 
26: G3  B -2  0 
27: G3  B -1  1 
28: G3  B  0  3 
29: G3  B  1  0 
30: G3  B  2  0 
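
The same idea can be written in separate steps, which may be easier to follow. This is a sketch rather than the answerer's code: it melts once, labels the groups explicitly, counts with .N, and then uses a cross join to add the zero rows. The column name `col`, the objects `counts` and `full`, and the fixed value range -2:2 are assumptions made for illustration.

```r
library(data.table)

df <- read.table(text = "Id P1 P2 P3 P4
G1 1 0 -1 2
G2 0 -2 2 0
G3 0 1 0 -1
G1 0 2 2 0
G2 0 1 1 -1
G3 0 0 0 0", header = TRUE, stringsAsFactors = FALSE)

# Long form: one row per (Id, original column) pair.
DT <- melt(as.data.table(df), id.vars = "Id",
           variable.name = "col", value.name = "value")

# Assumed mapping: P1/P2 belong to group A, P3/P4 to group B.
DT[, group := ifelse(col %in% c("P1", "P2"), "A", "B")]

# Count only the combinations that actually occur.
counts <- DT[, .N, by = .(Id, group, value)]

# Join onto every possible combination; unmatched ones come back with N = NA.
full <- counts[CJ(Id = unique(DT$Id), group = c("A", "B"), value = -2:2),
               on = .(Id, group, value)]
full[is.na(N), N := 0L]
```

Here `full` has one row per Id, group and value, with `N` holding the count.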

Data used:

df <- read.table(text="Id  P1 P2 P3 P4 
G1  1 0 -1 2 
G2  0 -2 2  0 
G3  0 1 0  -1 
G1  0 2 2  0 
G2  0 1 1  -1 
G3  0 0 0  0", header=TRUE, stringsAsFactors = FALSE) 

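Note that I omitted the 'Group' row because, according to your comments, it only serves to indicate which group each of the P1 - P4 columns belongs to.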
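
If the mapping should come from the 'Group' row instead of being hard-coded, a base R sketch could start from the question's original df. The names `p_cols`, `group_of`, `obs`, `long` and `res` are hypothetical, and the Var levels are assumed to be -2:2 as in the question.

```r
# Data as posted in the question, including the trailing "Group" row.
df <- data.frame(P1 = c(1, 0, 0, 0, 0, 0, "A"),
                 P2 = c(0, -2, 1, 2, 1, 0, "A"),
                 P3 = c(-1, 2, 0, 2, 1, 0, "B"),
                 P4 = c(2, 0, -1, 0, -1, 0, "B"),
                 Names = c("G1", "G2", "G3", "G1", "G2", "G3", "Group"),
                 stringsAsFactors = FALSE)

p_cols <- c("P1", "P2", "P3", "P4")

# Read the column-to-group mapping from the "Group" row, then drop that row.
group_of <- unlist(df[df$Names == "Group", p_cols])
obs <- df[df$Names != "Group", ]

# Stack the P columns into long form (column by column).
long <- data.frame(
  Id    = rep(obs$Names, times = length(p_cols)),
  Group = rep(unname(group_of), each = nrow(obs)),
  Var   = as.numeric(unlist(obs[p_cols]))
)

# Fixing the factor levels makes table() report zero counts too.
long$Var <- factor(long$Var, levels = -2:2)
res <- as.data.frame(table(long), responseName = "Count")
res <- res[order(res$Id, res$Group, res$Var), ]
```

Because `table()` tabulates every combination of factor levels, no explicit loop or join is needed to get the zero rows.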

+0

Indeed, thank you very much! – Sosi

1

With:

library(tidyverse) 

df <- read.table(text="Id  P1 P2 P3 P4 
G1  1 0 -1 2 
G2  0 -2 2  0 
G3  0 1 0  -1 
G1  0 2 2  0 
G2  0 1 1  -1 
G3  0 0 0  0", header=TRUE, stringsAsFactors = FALSE) 

We reshape the table and recode the P* variables into group. Then we count and complete the missing cases:

df %>% 
    gather(P1, P2, P3, P4, key = "p", value = "v") %>% 
    mutate(group = ifelse(p %in% c("P1", "P2"), "A", "B")) %>% 
    group_by(Id, group, v) %>% 
    summarise(Count = n()) %>% 
    ungroup() %>% 
    complete(Id, group, v, fill = list("Count" = 0)) 

If you don't need all the combinations in the output, just use:

df %>% 
    gather(P1, P2, P3, P4, key = "p", value = "v") %>% 
    mutate(group = ifelse(p %in% c("P1", "P2"), "A", "B")) %>% 
    group_by(Id, group, v) %>% 
    summarise(Count = n()) 

# A tibble: 17 x 4 
# Groups: Id, group [?] 
     Id group v  Count 
     <chr> <chr> <int> <int> 
1 G1  A  0  2 
2 G1  A  1  1 
3 G1  A  2  1 
4 G1  B -1  1 
5 G1  B  0  1 
6 G1  B  2  2 
7 G2  A -2  1 
8 G2  A  0  2 
9 G2  A  1  1 
10 G2  B -1  1 
11 G2  B  0  1 
12 G2  B  1  1 
13 G2  B  2  1 
14 G3  A  0  3 
15 G3  A  1  1 
16 G3  B -1  1 
17 G3  B  0  3
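
In current tidyr, gather() is superseded by pivot_longer(), so the pipeline with complete() could also be sketched as follows. This assumes tidyr 1.0.0 or later and a recent dplyr; the object name `res` and the explicit value range -2:2 are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

df <- read.table(text = "Id P1 P2 P3 P4
G1 1 0 -1 2
G2 0 -2 2 0
G3 0 1 0 -1
G1 0 2 2 0
G2 0 1 1 -1
G3 0 0 0 0", header = TRUE, stringsAsFactors = FALSE)

res <- df %>%
  # Long form: one row per (Id, original column) pair.
  pivot_longer(P1:P4, names_to = "p", values_to = "v") %>%
  # Assumed mapping: P1/P2 belong to group A, P3/P4 to group B.
  mutate(group = ifelse(p %in% c("P1", "P2"), "A", "B")) %>%
  # count() replaces group_by() + summarise(n()) + ungroup().
  count(Id, group, v, name = "Count") %>%
  # Add the combinations that never occur, with a count of 0.
  complete(Id, group, v = -2:2, fill = list(Count = 0L))
```

Passing `v = -2:2` to complete() fills in every value in that range even if one never appears in the data.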