Paste values

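I want to pivot the result column of df horizontally, creating a data set with a single row for each region, state, county combination, with the columns ordered by year and then city.

I also want to identify each row of the new data set by region, state and county, and to remove the white space between the four results columns. The code below does all of this, but I suspect it is not very efficient.

Is there a way to do this with reshape2 without creating a unique identifier for each group and numbering the observations within each group? Is there a way to use apply instead of a for-loop to remove the white space from the matrix? (Here "matrix" is used loosely, not in the mathematical or programming sense.) I realize these are two different questions, and perhaps I should post each one separately.

Given that I can already achieve the desired result and am only hoping to improve the code, I do not know whether I should post this at all, but I am hoping to learn. Thank you for any advice.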

df <- read.table(text= " 
region state county city year result 
1   1  1  1  1  1 
1   1  1  2  1  2 
1   1  1  1  2  3 
1   1  1  2  2  4 
1   1  2  3  1  4 
1   1  2  4  1  3 
1   1  2  3  2  2 
1   1  2  4  2  1 
1   2  1  1  1  0 
1   2  1  2  1 NA 
1   2  1  1  2  0 
1   2  1  2  2  0 
1   2  2  3  1  2 
1   2  2  4  1  2 
1   2  2  3  2  2 
1   2  2  4  2  2 
2   1  1  1  1  9 
2   1  1  2  1  9 
2   1  1  1  2  8 
2   1  1  2  2  8 
2   1  2  3  1  1 
2   1  2  4  1  0 
2   1  2  3  2  1 
2   1  2  4  2  0 
2   2  1  1  1  2 
2   2  1  2  1  4 
2   2  1  1  2  6 
2   2  1  2  2  8 
2   2  2  3  1  3 
2   2  2  4  1  3 
2   2  2  3  2  2 
2   2  2  4  2  2
", header=TRUE, na.strings="NA")

desired.result <- read.table(text= " 
region state county results 
1   1  1  1234 
1   1  2  4321 
1   2  1  0.00 
1   2  2  2222 
2   1  1  9988 
2   1  2  1010 
2   2  1  2468 
2   2  2  3322 
", header=TRUE, colClasses=c('numeric','numeric','numeric','character')) 

# redefine variables for package reshape2 creating a unique id for each 
# region, state, county combination and then number observations in 
# each of those combinations 

library(reshape2) 

id.var <- df$region*100000 + df$state*1000 + df$county 
obsnum <- sequence(rle(id.var)$lengths) 

df2 <- dcast(df, region + state + county ~ obsnum, value.var = "result") 

# remove spaces between columns of results matrix 
# with a for-loop. How can I use apply to do this? 

x <- df2[,4:(4+max(obsnum)-1)] 

# use a dot to represent a missing observation 

x[is.na(x)] = '.' 

x.cat = character(nrow(x))

for(i in 1:nrow(x)) { 
    x.cat[i] = paste(x[i,], collapse="") 
} 

df3 <- cbind(df2[,1:3],x.cat) 
colnames(df3) <- c("region", "state", "county", "results") 
df3 

df3 == desired.result
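
On the first part of the question, one possible way to number the observations within each group without a hand-built numeric id is base R's ave. The following is only a minimal sketch on a small made-up data frame (the name toy and its values are illustrative, not the data above), and it assumes the rows are already in the desired order within each group:

```r
# Toy data (hypothetical values): two region/state/county groups, 3 rows each
toy <- data.frame(
  region = c(1, 1, 1, 1, 1, 1),
  state  = c(1, 1, 1, 2, 2, 2),
  county = c(1, 1, 1, 1, 1, 1),
  result = c(5, NA, 7, 8, 9, 10)
)

# Number the rows within each region/state/county combination.
# ave() splits the row index by the grouping variables and applies seq_along.
obsnum_ave <- ave(seq_len(nrow(toy)), toy$region, toy$state, toy$county,
                  FUN = seq_along)

# The question's approach, for comparison (needs each group to be contiguous)
id.var     <- toy$region * 100000 + toy$state * 1000 + toy$county
obsnum_rle <- sequence(rle(id.var)$lengths)
```

Both vectors number the rows 1, 2, 3 within each group here; the ave version does not depend on the multipliers being large enough to keep the ids unique.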

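Edit:

Matthew Lundberg's answer below is excellent. Afterward, I realized that I also needed to create an output data set in which the four result columns above contain numeric, rational numbers and are separated by spaces. So I have posted an obvious way to do that, which modifies Matthew's answer. I do not know whether this is accepted protocol, but the new scenario seems so closely related to the original post that I did not think I should post a new question.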

Source

2012-12-31 Mark Miller

I think this does what you want:

df$result <- as.character(df$result) 
df$result[is.na(df$result)] <- '.' 


aggregate(result ~ county+state+region, data=df, paste0, collapse='') 

    county state region result 
1  1  1  1 1234 
2  2  1  1 4321 
3  1  2  1 0.00 
4  2  2  1 2222 
5  1  1  2 9988 
6  2  1  2 1010 
7  1  2  2 2468 
8  2  2  2 3322

This relies on your data frame being sorted in the correct order (which yours is).
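
If the rows were not already in that order, one option would be to sort before aggregating. A minimal sketch, assuming a made-up single-group data frame (toy and its values are hypothetical) whose rows start out scrambled:

```r
# Toy data in scrambled order (hypothetical values, one group only)
toy <- data.frame(
  region = 1, state = 1, county = 1,
  city   = c(2, 1, 2, 1),
  year   = c(2, 1, 1, 2),
  result = c("4", "1", "2", "3"),
  stringsAsFactors = FALSE
)

# Sort so that, within each group, rows run by year and then city
sorted <- toy[order(toy$region, toy$state, toy$county, toy$year, toy$city), ]

# Same aggregate() call as in the answer, applied to the sorted copy
out <- aggregate(result ~ county + state + region, data = sorted,
                 paste0, collapse = "")
```

Within each group aggregate keeps the rows in the order it receives them, so sorting first is what fixes the order of the pasted characters.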

Source

2012-12-31 23:09:25

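Thank you for the outstanding answer. Later I realized that I also needed an output data set in which the four result columns are numeric and separated by spaces. I could not modify your answer to do exactly that, but I came close and have posted the code here.

Matthew Lundberg's answer is excellent. Afterward, I realized that I also needed to create an output data set in which the four result columns above contain numeric, rational numbers and are separated by spaces. So here I provide an obvious way to do that by modifying Matthew's answer. I do not know whether this is accepted protocol, but the new scenario seems so closely related to the original post that I did not think I should post a new question.

The first two lines are modifications of Matthew's answer.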

df$result[is.na(df$result)] <- 'NA' 
df2 <- aggregate(result ~ county+state+region, data=df, paste)

Then I specify that NA represents a missing observation and use apply to obtain numeric output.

df2$result[df2$result=='NA'] = NA 
new.df <- data.frame(df2[,1:3], apply(df2$result,2,as.numeric))

The output is below. Note that, unlike the original post, I added 0.5 to each value of result in df.

county state region X1 X2 X3 X4 
    1  1  1 1.5 2.5 3.5 4.5 
    2  1  1 4.5 3.5 2.5 1.5 
    1  2  1 0.5 NA 0.5 0.5 
    2  2  1 2.5 2.5 2.5 2.5 
    1  1  2 9.5 9.5 8.5 8.5 
    2  1  2 1.5 0.5 1.5 0.5 
    1  2  2 2.5 4.5 6.5 8.5 
    2  2  2 3.5 3.5 2.5 2.5
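
A possible variation, offered only as a sketch, keeps result numeric throughout instead of passing through the string 'NA'. It assumes that na.action = na.pass is acceptable for the data at hand, and it uses a made-up data frame (toy and its values are hypothetical):

```r
# Toy data (hypothetical values): two counties, one missing result
toy <- data.frame(
  region = 1, state = 1,
  county = c(1, 1, 1, 1, 2, 2, 2, 2),
  result = c(1.5, 2.5, 3.5, 4.5, 0.5, NA, 0.5, 0.5)
)

# Keep result numeric; na.pass stops aggregate() from dropping the NA row,
# and returning each group's values unchanged yields a numeric matrix column
wide <- aggregate(result ~ county + state + region, data = toy,
                  FUN = function(v) v, na.action = na.pass)

# Spread the matrix column into ordinary columns (auto-named X1..X4)
new.toy <- data.frame(wide[, 1:3], wide$result)
```

This only works when every group has the same number of observations; otherwise aggregate cannot simplify the result to a matrix.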

Source

2013-01-01 12:01:42

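In my original post I asked how to use apply to remove the spaces between the columns of a data set. Thanks to Matthew Lundberg's answer to my larger question, that turned out not to be necessary. Nevertheless, removing the spaces between columns of a data set is something I need to do often. For completeness, I am posting here a way to do so with paste0 and apply, taken in part from Matthew's answer.

To remove all of the spaces from the data set x: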

x <- read.table(text= " 
A B C D 
1 1 1 1 
1 1 2 2 
1 NA 1 3 
1 1 2 4 
1 2 1 5 
1 2 NA 6 
1 2 1 7 
1 2 2 8
", header=TRUE, na.strings="NA")

# use a dot to represent a missing observation 

x[is.na(x)] = '.' 

y <- as.data.frame(apply(x, 1, function(i) paste0(i, collapse=''))) 
colnames(y) <- 'result' 
y

Gives:

  result
1   1111
2   1122
3   1.13
4   1124
5   1215
6   12.6
7   1217
8   1228

The code below removes the space between the second and third columns only:

z <- as.data.frame(apply(x[,2:3], 1, function(i) paste0(i, collapse=''))) 

y <- data.frame(x[,1], z, x[,4]) 
colnames(y) <- c('A','BC','D') 
y

Gives:

  A BC D
1 1 11 1
2 1 12 2
3 1 .1 3
4 1 12 4
5 1 21 5
6 1 2. 6
7 1 21 7
8 1 22 8

Source

2013-01-01 20:06:21

There is no need to create an anonymous function for 'apply'. Use the '...' argument to pass arguments through to 'paste0' instead: 'apply(x, 1, paste0, collapse = '')'.
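
The point of this comment can be checked with a small sketch. It uses a made-up character matrix (m and its values are hypothetical) rather than the data frame x above:

```r
# Character matrix standing in for the data set (hypothetical values)
m <- matrix(c("1", "1", ".", "2",
              "1", "2", "1", "."),
            nrow = 2, byrow = TRUE)

# Anonymous-function form, as in the answer
with_fun  <- apply(m, 1, function(i) paste0(i, collapse = ""))

# Same result: apply() forwards collapse = "" to paste0 through '...'
with_dots <- apply(m, 1, paste0, collapse = "")
```

Both calls return one pasted string per row; the second is simply shorter to write.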
