Extracting duplicate rows from a data frame

I have a large data frame that I'm working with; the first few rows are as follows:

  Assay Genotype Sample Result
1   001        G      1      0
2   001        A      2      1
3   001        G      3      0
4   001       NA      1     NA
5   002        T      1      0
6   002        G      2      1
7   002        T      2      0
8   002        T      4      0
9   003       NA      1     NA

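In total I will have 2000 samples and 168 assays for each sample.

I would like to extract the rows where I have multiple entries with the same Assay and Sample. The result should be a data frame containing all of the duplicated entries, sorted so that rows sharing the same Assay and Sample sit next to each other. For the example above, the result would look like this: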

  Assay Genotype Sample Result
1   001        G      1      0
4   001       NA      1     NA
6   002        G      2      1
7   002        T      2      0

Source

2011-10-19 Sam Globus

Demo data, for easy loading:

df <- structure(list(Assay = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L), Genotype = structure(c(2L, 1L, 2L, NA, 3L, 2L, 3L, 3L, NA), .Label = c("A", "G", "T"), class = "factor"), Sample = c(1L, 2L, 3L, 1L, 1L, 2L, 2L, 4L, 1L), Result = c(0L, 1L, 0L, NA, 0L, 1L, 0L, 0L, NA)), .Names = c("Assay", "Genotype", "Sample", "Result"), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))

You can easily get the duplicated Assay/Sample pairs with duplicated:

vars <- c('Assay', 'Sample')
# duplicated() flags the second and later occurrences of each Assay/Sample pair
dup <- df[duplicated(df[, vars]), vars]

Which yields:

> dup
  Assay Sample
4     1      1
7     2      2

A simple merge then gives the desired result:

> merge(dup, df)
  Assay Sample Genotype Result
1     1      1     <NA>     NA
2     1      1        G      0
3     2      2        G      1
4     2      2        T      0

Source

2011-10-19 19:19:41 daroczig
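
For comparison, here is a minimal sketch of an alternative that is not part of the answer above. It assumes base R only and rebuilds the demo data with data.frame() so that it runs on its own. The names is_dup and dups are made up for illustration. The idea is to call duplicated() twice, once in each direction, so that every row of a repeated Assay/Sample pair is flagged, and then to sort the flagged rows.

```r
# Demo data with the same values as the example above
df <- data.frame(
  Assay    = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L),
  Genotype = c("G", "A", "G", NA, "T", "G", "T", "T", NA),
  Sample   = c(1L, 2L, 3L, 1L, 1L, 2L, 2L, 4L, 1L),
  Result   = c(0L, 1L, 0L, NA, 0L, 1L, 0L, 0L, NA)
)

vars <- c("Assay", "Sample")

# duplicated() marks the second and later occurrences of a pair;
# with fromLast = TRUE it marks the first and earlier ones instead.
# Combining both flags every row whose pair occurs more than once.
is_dup <- duplicated(df[, vars]) | duplicated(df[, vars], fromLast = TRUE)

# Keep the flagged rows and sort them so matching pairs are adjacent
dups <- df[is_dup, ]
dups <- dups[order(dups$Assay, dups$Sample), ]
dups
```

On the demo data this keeps rows 1, 4, 6 and 7 with their original row names and column order. It may also be worth knowing that if a pair occurs three or more times, dup in the merge approach contains that pair more than once, so merge(dup, df) repeats the matching rows unless dup is first passed through unique().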
