R - Trying to group a data frame by one factor, but capture the other factors

I have been at this data manipulation for over a day, and I know it should be simple to do.

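I have a data frame with 4 variables. There is a 1:many relationship between reportID and TestResult, and a 1:1 relationship between reportID and all the other variables. I think it makes sense to recast reportID as a factor, but I am not sure.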

reportID <- c(1000, 1000, 1000, 1001, 1002, 1002)
TestResult <- c("aa", "bb", "cc", "dd", "aa", "ee")
dateSent <- as.Date(c("2017-08-01", "2017-08-01", "2017-08-01", "2017-08-04", "2017-08-05", "2017-08-05"))
otherVar1 <- c(11, 11, 11, 12, 13, 13)
df <- data.frame(reportID, TestResult, dateSent, otherVar1)

I think dplyr is the right tool here...

What I want is a data frame like this:

reportID Results dateSent  otherVar1 
1000  3  2017-08-01   11 
1001  1  2017-08-04   12 
1002  2  2017-08-05   13

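Specifically, one row per result is too much information. I want to count the number of results recorded for each report ID, and carry the other information along in the data frame.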

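Edit / additional note: In the example data I should have reflected the fact that some column names in my data frame contain spaces. In my real-world problem the data looks like this: `report id` <- c(1000, 1000, 1000, 1001, 1002, 1002)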

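The spaces in the column names caused problems that were hard to debug. I ended up using the answer suggested below, but with backticks (slanted single quotes) around the column names.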

Plot1Data <- VariantReport %>% 
group_by(`report id`,`date sent`,`other variable1`) %>% 
summarise(numresults=n())
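
As a minimal, self-contained sketch of the backtick approach: the VariantReport data frame here is made up to mirror the example data (the real one is not shown in the question), and `.groups = "drop"` assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for the real VariantReport.
# check.names = FALSE keeps the spaces; by default data.frame() would
# rewrite the names as report.id, date.sent, other.variable1.
VariantReport <- data.frame(
  `report id`       = c(1000, 1000, 1000, 1001, 1002, 1002),
  `date sent`       = as.Date(c("2017-08-01", "2017-08-01", "2017-08-01",
                                "2017-08-04", "2017-08-05", "2017-08-05")),
  `other variable1` = c(11, 11, 11, 12, 13, 13),
  check.names = FALSE
)

# Backticks let dplyr refer to column names that are not syntactic R names
Plot1Data <- VariantReport %>%
  group_by(`report id`, `date sent`, `other variable1`) %>%
  summarise(numresults = n(), .groups = "drop")

names(Plot1Data)
```

Calling names(VariantReport) first is a quick way to see whether the spaces survived the import or were replaced with dots.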

Source

2017-09-13 A. Mandel

I'm not 100% sure I'm interpreting your request correctly, but I think this will work:

df %>% group_by(reportID,dateSent,otherVar1) %>% summarise(numresults=n())
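
For reference, a self-contained sketch of this approach run against the question's sample data. The count column is named Results here (rather than numresults) only to match the desired output in the question, and `.groups = "drop"` assumes dplyr 1.0 or later.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  reportID   = c(1000, 1000, 1000, 1001, 1002, 1002),
  TestResult = c("aa", "bb", "cc", "dd", "aa", "ee"),
  dateSent   = as.Date(c("2017-08-01", "2017-08-01", "2017-08-01",
                         "2017-08-04", "2017-08-05", "2017-08-05")),
  otherVar1  = c(11, 11, 11, 12, 13, 13)
)

# One row per report; n() counts the result rows in each group
result <- df %>%
  group_by(reportID, dateSent, otherVar1) %>%
  summarise(Results = n(), .groups = "drop")

result
```

The shortcut count(df, reportID, dateSent, otherVar1, name = "Results") should give the same counts in a single call. Note that reportID does not need to be converted to a factor for group_by to work.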

Source

2017-09-13 17:33:38 simitpatel

I think that's it! So I can put any number of variables in the group_by function, as long as they are uniform within the same report ID?

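If all the other variables are 1:1 with the main variable you want to learn about, then this should be fine. If any of them are 1:many, though, it will not produce the expected result. – simitpatel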
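
To illustrate the 1:many caveat, here is a small sketch using a made-up variant of the sample data in which otherVar1 takes two different values within report 1000. The data frame df_many and its value 99 are hypothetical.

```r
library(dplyr)

# Hypothetical data: otherVar1 is no longer constant within report 1000
df_many <- data.frame(
  reportID  = c(1000, 1000, 1000, 1001, 1002, 1002),
  otherVar1 = c(11, 11, 99, 12, 13, 13)
)

# Report 1000 is now split across two rows instead of one
split_counts <- df_many %>%
  group_by(reportID, otherVar1) %>%
  summarise(numresults = n(), .groups = "drop")

# A check to run before grouping: distinct otherVar1 values per report.
# Any value above 1 means the column is not 1:1 with reportID.
check <- df_many %>%
  group_by(reportID) %>%
  summarise(n_otherVar1 = n_distinct(otherVar1), .groups = "drop")

split_counts
check
```

Running a check like this on each extra grouping column is one way to confirm the 1:1 assumption before relying on the grouped counts.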

If you have more than one otherVar column, you may prefer group_by_at, which makes it very easy to specify all of the otherVar columns.

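library(dplyr)
library(magrittr)

# df2 is not defined in the original answer; assumed here to be the question's
# df plus a second, hypothetical otherVar column
df2 <- df %>% mutate(otherVar2 = otherVar1 + 100)

# if you know the string pattern of the column names
df2 %>%
  group_by_at(.vars = vars(reportID, dateSent, matches("otherVar"))) %>%
  summarize(Results = n())

# or, if you prefer a range of variables from:to
df2 %>%
  group_by_at(.vars = vars(reportID, dateSent, otherVar1:otherVar2)) %>%
  summarize(Results = n())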
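
In dplyr 1.0 and later, group_by_at is superseded by across(). A rough equivalent of the two calls above might look like the following sketch; the df2 built here, including its otherVar2 column, is hypothetical and exists only to make the example runnable.

```r
library(dplyr)

# Hypothetical data: the question's columns plus a made-up otherVar2
df2 <- data.frame(
  reportID  = c(1000, 1000, 1000, 1001, 1002, 1002),
  dateSent  = as.Date(c("2017-08-01", "2017-08-01", "2017-08-01",
                        "2017-08-04", "2017-08-05", "2017-08-05")),
  otherVar1 = c(11, 11, 11, 12, 13, 13),
  otherVar2 = c("x", "x", "x", "y", "z", "z")
)

# Select grouping columns by name pattern
by_pattern <- df2 %>%
  group_by(across(c(reportID, dateSent, matches("otherVar")))) %>%
  summarise(Results = n(), .groups = "drop")

# Select grouping columns by a from:to range
by_range <- df2 %>%
  group_by(across(c(reportID, dateSent, otherVar1:otherVar2))) %>%
  summarise(Results = n(), .groups = "drop")

by_pattern
```

Both forms select the same four grouping columns here, so they should return the same three rows.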

Source

2017-09-13 17:59:05 Gonzo
