2017-10-11

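I have a set of 200 mouse IDs and a list of gene expression values, but each mouse has multiple entries for the same gene. I want each gene to be listed only once per mouse, with a value equal to the sum of all of its original values. In other words, I want to collapse the repeated factor levels in the ID columns while combining the values.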

For example, data like this:

   mouse_number   value   gene
1            64 2.00000 Lypla1
2            65 1.00000 Lypla1
3            64 7.00000 Lypla1
4            65 3.00000 Lypla1
7            64 4.00000   Pck1
8            65 2.00000   Pck1
9            64 1.00000   Pck1
10           65 5.00000   Pck1

should become:

   mouse_number   value   gene
1            64 9.00000 Lypla1
2            65 4.00000 Lypla1
3            64 5.00000   Pck1
4            65 7.00000   Pck1

Please help, thank you!

Answers

You can use aggregate() with a formula that sums value within each combination of mouse_number and gene:

# Recreate the example data from the question
df <- data.frame(
  mouse_number = c(64, 65, 64, 65, 64, 65, 64, 65),
  value = c(2.0, 1.0, 7.0, 3.0, 4.0, 2.0, 1.0, 5.0),
  gene = c("Lypla1", "Lypla1", "Lypla1", "Lypla1", "Pck1", "Pck1", "Pck1", "Pck1")
)

# Sum value within each mouse_number/gene combination
df.collapsed <- aggregate(value ~ mouse_number + gene, FUN = sum, data = df)

df.collapsed
#   mouse_number   gene value
# 1           64 Lypla1     9
# 2           65 Lypla1     4
# 3           64   Pck1     5
# 4           65   Pck1     7
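
Since the question mentions factors in the ID columns, here is a minimal sketch of the same aggregate() call on factor columns, followed by two sanity checks. Storing mouse_number and gene as factors is an assumption about the real data, and the name collapsed is only illustrative.

```r
# Same toy data as in the question, but with the ID columns stored as factors
# (an assumption; the question does not show how the real columns are typed).
df <- data.frame(
  mouse_number = factor(c(64, 65, 64, 65, 64, 65, 64, 65)),
  value = c(2, 1, 7, 3, 4, 2, 1, 5),
  gene = factor(c("Lypla1", "Lypla1", "Lypla1", "Lypla1",
                  "Pck1", "Pck1", "Pck1", "Pck1"))
)

# Sum value within every mouse_number/gene combination.
collapsed <- aggregate(value ~ mouse_number + gene, data = df, FUN = sum)

# Put the columns back in the order used in the question.
collapsed <- collapsed[, c("mouse_number", "value", "gene")]

# Sanity checks: one row per mouse/gene pair, and no value lost in the collapse.
stopifnot(nrow(collapsed) == 4)
stopifnot(sum(collapsed$value) == sum(df$value))

print(collapsed)
```

Note that the formula interface of aggregate() uses na.action = na.omit by default, so rows with a missing value or a missing ID are dropped before summing. It is worth checking for NAs first if the real data may contain them.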