Data reshaping and logical indexing in R

I have the following (dummy) data:

d <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 5L, 
5L, 5L, 5L, 5L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 
2L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("apple", "grapefruit", 
"orange", "peach", "pear"), class = "factor"), type = structure(c(2L, 
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 
1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("large", 
"small"), class = "factor"), location = structure(c(1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("P1", 
"P2", "P3"), class = "factor"), diameter = c(17.2, 19.1, 18.5, 
23.3, 22.9, 19.4, 11.1, 11.8, 6.8, 3.2, 7.9, 5.6, 8.4, 9.2, 9.7, 
17.1, 19.4, 18.9, 11.8, 10.6, 10.1, 18.8, 17.9, 13.2, 8.5, 8.9, 
7.2, 10.1, 8.7, 6.6)), .Names = c("group", "type", "location", 
"diameter"), class = "data.frame", row.names = c(NA, -30L))

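From this I would like to create a new data frame of ratios derived from the `diameter` variable, one set for each combination of levels of the three factors `location`, `type` and `group`. For example, for the pear group: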

P3.P1.L <- with(d, diameter[group=="pear" & type=="large" & location=="P3"]/diameter[group=="pear" & type=="large" & location=="P1"]) 
P2.P1.L <- with(d, diameter[group=="pear" & type=="large" & location=="P2"]/diameter[group=="pear" & type=="large" & location=="P1"]) 
P3.P1.S <- with(d, diameter[group=="pear" & type=="small" & location=="P3"]/diameter[group=="pear" & type=="small" & location=="P1"]) 
P2.P1.S <- with(d, diameter[group=="pear" & type=="small" & location=="P2"]/diameter[group=="pear" & type=="small" & location=="P1"])

The final data.frame would look like this:

group, type, P2.P1, P3.P1
pear, large, 2.469, 1.75
pear, small, 1.063, 0.613
apple, large, ..., ...
apple, small, ..., ...

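Obviously I can do this as shown above, logically indexing the right levels of the three factors for each case. The problem is that my real data has about 40 levels in the `group` factor (though still only 2 levels in `type`). I would like a solution that lets me use logical indexing on `location`, and perhaps `type`, and then iterate over all levels of `group`. For example, something like: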

with(d, by(d, group, function(x) diameter[type=="large" & location=="P3"]/diameter[type=="large" & location=="P1"]))

But this does not quite do what I want (and indexing with `group == x` does not work either).

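A solution that keeps track of which `group` and `type` levels each ratio belongs to, and then puts the results into a new data frame like the desired output above, would be amazing. Any suggestions on how to approach this would be greatly appreciated.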
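A note on the `by()` attempt above: the anonymous function never uses its argument `x`, so `diameter`, `type` and `location` are looked up in the full data frame supplied by the outer `with(d, ...)`, and every group returns the same vector. A minimal sketch of one way to do the indexing per group follows. It assumes exactly one row per `group`/`type`/`location` combination, rebuilds the example data compactly with `rep()` for self-containment, and uses a made-up name, `p3_p1_large`.

```r
# Compact rebuild of the example data (same values as the dput() output above).
d <- data.frame(
  group    = factor(rep(c("apple", "pear", "orange", "grapefruit", "peach"), each = 6)),
  type     = factor(rep(rep(c("small", "large"), each = 3), times = 5)),
  location = factor(rep(c("P1", "P2", "P3"), times = 10)),
  diameter = c(17.2, 19.1, 18.5, 23.3, 22.9, 19.4, 11.1, 11.8, 6.8, 3.2,
               7.9, 5.6, 8.4, 9.2, 9.7, 17.1, 19.4, 18.9, 11.8, 10.6,
               10.1, 18.8, 17.9, 13.2, 8.5, 8.9, 7.2, 10.1, 8.7, 6.6)
)

# Split by group, then evaluate the logical indexing inside each subset `x`
# rather than in the full data frame.
p3_p1_large <- sapply(split(d, d$group), function(x) {
  with(x, diameter[type == "large" & location == "P3"] /
          diameter[type == "large" & location == "P1"])
})

p3_p1_large  # named numeric vector: one P3/P1 ratio per group level
```

The same `with(x, ...)` body could be passed to `by(d, d$group, ...)`. Either way this yields only one ratio for one `type` at a time, so the group and type labels still have to be assembled into a data frame by hand.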

2012-02-16 Steve

You can use `dcast` to convert the data to a wider format, with one column per location.

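library(reshape2)
# One row per group/type pair; P1, P2 and P3 become columns holding diameter.
d <- dcast(d, group + type ~ location, value.var = "diameter")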

It is then simple to compute the required ratios, for example:

transform(d, P2.P1=P2/P1, P3.P1=P3/P1)
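If reshape2 is not available, the same idea can be sketched in base R with `reshape()`. This is an alternative to the answer's method, not a restatement of it. It assumes one row per `group`/`type`/`location` combination. The names `d_long`, `wide` and `result` are made up for this example, and `d_long` is a compact rebuild of the original long-format data.

```r
# Compact rebuild of the original long-format example data.
d_long <- data.frame(
  group    = factor(rep(c("apple", "pear", "orange", "grapefruit", "peach"), each = 6)),
  type     = factor(rep(rep(c("small", "large"), each = 3), times = 5)),
  location = factor(rep(c("P1", "P2", "P3"), times = 10)),
  diameter = c(17.2, 19.1, 18.5, 23.3, 22.9, 19.4, 11.1, 11.8, 6.8, 3.2,
               7.9, 5.6, 8.4, 9.2, 9.7, 17.1, 19.4, 18.9, 11.8, 10.6,
               10.1, 18.8, 17.9, 13.2, 8.5, 8.9, 7.2, 10.1, 8.7, 6.6)
)

# Long to wide: one row per group/type pair. reshape() names the new
# columns diameter.P1, diameter.P2 and diameter.P3.
wide <- reshape(d_long, idvar = c("group", "type"),
                timevar = "location", direction = "wide")

# Ratios relative to P1, keeping the group and type labels alongside them.
result <- transform(wide,
                    P2.P1 = diameter.P2 / diameter.P1,
                    P3.P1 = diameter.P3 / diameter.P1)
result <- result[, c("group", "type", "P2.P1", "P3.P1")]
result
```

Because the ratios are computed column-wise on the wide table, the number of `group` levels does not matter, which is the point of reshaping first.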

2012-02-16 03:47:28

That's great, thanks. ...I really need to take the time to learn Hadley's data manipulation packages now. – Steve 2012-02-16 04:07:06
