2012-07-30

Reshaping a data frame: changing rows to columns

Suppose we have a data frame that looks like this:

set.seed(7302012) 

county   <- rep(letters[1:4], each=2) 
state   <- rep(LETTERS[1], times=8) 
industry  <- rep(c("construction", "manufacturing"), 4) 
employment  <- round(rnorm(8, 100, 50), 0) 
establishments <- round(rnorm(8, 20, 5), 0) 

data <- data.frame(state, county, industry, employment, establishments) 

  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19

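We would like to reshape this data frame so that each row represents a (state and) county rather than a county-industry pair, with columns construction.employment, construction.establishments, and the analogous versions for manufacturing. What is an efficient way to do this?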

One way is to subset:

construction <- data[data$industry == "construction", ] 
names(construction)[4:5] <- c("construction.employment", "construction.establishments") 

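and do the same for manufacturing, then merge the two. This isn't so bad with only two industries, but imagine there are 14; the process becomes tedious (though less so if you use a for loop over the levels of industry).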
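The subset-and-merge approach described above can be generalized to any number of industries. The following is a minimal base R sketch, not the asker's own code: it swaps the for loop for `lapply` plus `Reduce`, and the names `id_cols`, `value_cols`, `pieces`, and `wide` are introduced here purely for illustration.

```r
# Rebuild the example data from the question.
set.seed(7302012)
county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)

# Illustrative names (assumptions, not from the original post).
id_cols    <- c("state", "county")
value_cols <- c("employment", "establishments")

# Build one data frame per industry, prefixing the value columns
# with the industry name (e.g. "construction.employment").
pieces <- lapply(unique(as.character(data$industry)), function(ind) {
  piece <- data[data$industry == ind, c(id_cols, value_cols)]
  names(piece)[match(value_cols, names(piece))] <- paste(ind, value_cols, sep = ".")
  piece
})

# Merge the per-industry pieces one after another on the id columns.
wide <- Reduce(function(x, y) merge(x, y, by = id_cols), pieces)
print(wide)
```

This avoids writing one block per industry, but it still performs one merge per industry, which is part of why a dedicated reshaping function is usually more convenient.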

Any other ideas?

Answers

7

If I understand your question correctly, this can be done with reshape in base R:

reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry") 
#   state county employment.construction establishments.construction
# 1     A      a                     146                          19
# 3     A      b                     121                          10
# 5     A      c                     197                          18
# 7     A      d                      98                          30
#   employment.manufacturing establishments.manufacturing
# 1                      110                           20
# 3                       90                           27
# 5                       73                           29
# 7                      102                           19
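Base `reshape` names the new columns as variable first, industry second (`employment.construction`), whereas the question asked for the opposite order (`construction.employment`). One possible way to swap the two parts afterwards is sketched below. It assumes that neither the variable names nor the industry names contain a dot themselves; `is_value` is a helper name made up for this example.

```r
# Rebuild the example data from the question.
set.seed(7302012)
county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)

# Same call as in the answer.
wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# Swap "variable.industry" to "industry.variable" on the value columns only.
# Assumption: each value column name contains exactly one dot.
is_value <- !(names(wide) %in% c("state", "county"))
names(wide)[is_value] <- sub("^([^.]+)\\.(.+)$", "\\2.\\1", names(wide)[is_value])

# Optional: replace the leftover row names (1, 3, 5, 7) with 1..n.
rownames(wide) <- NULL
print(wide)
```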
4

And using the reshape package:

library(reshape) 
m <- reshape::melt(data) 
cast(m, state + county~...) 

Yields:

> cast(m, state + county~...) 
  state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
1     A      a                     146                          19                      110                           20
2     A      b                     121                          10                       90                           27
3     A      c                     197                          18                       73                           29
4     A      d                      98                          30                      102                           19

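Personally I use base reshape, so I probably should have shown this with reshape2 (Wickham), but I forgot there was a reshape2 package. It is slightly different: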

library(reshape2) 
m <- reshape2::melt(data) 
dcast(m, state + county~...) 

Ah, OK, I was using '.' instead of '...', which is why it wasn't working. Thanks! – Charlie 2012-07-30 17:06:41