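I have a file that contains a 6-column data structure stored side by side. That means the same 6 columns are repeated n times in one flat file.
Basically, I want to rearrange the data so that I end up with a single data.frame of only 6 columns, with all the remaining blocks from the file appended below the first 6 columns. How can I rearrange the data in a data frame with R (combining similar, repeated columns)?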
Row 1V1 1V2 1V3 1V4 1V5 1V6 2V1 2V2 2V3 2V4 2V5 2V6 3V1...
1
2
The result should look like this, with the data from 2V1-2V6 moved underneath 1V1-1V6:
Row V1 V2 V3 V4 V5 V6
1-1
1-2
2-1
2-2
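So far I have gone through some code snippets and managed to load all of the data into a data frame. I then tried to create n data frames, each holding one copy of the repeating structure, and finally to combine those single data frames into one, but that does not work.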
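# Read the semicolon-separated file, skipping the first two lines
df <- read.table("test.txt", header = FALSE, sep = ";", skip = 2)
# Number of complete 6-column blocks
columnmax <- as.integer(ncol(df) / 6)
dfnew <- vector(mode = "list", length = columnmax)
for (i in 1:columnmax) {
  start <- (i - 1) * 6 + 1
  end <- i * 6
  dfnew[[i]] <- df[, start:end]
}
# Stack the blocks on top of each other; this is the call that fails
y <- do.call(rbind, dfnew)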
Result:
Error in match.names(clabs, names(xi)) :
names do not match previous names
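I used a list because I could not get the blocks separated into data frames any other way. But now it seems that this causes a problem, because the "column names" of the blocks are not identical. I have no idea yet how to change the column names, since what I see in the R terminal is not a matrix but a list. I am sure there must be an easier way to do what I want, but I have only just started using R and am not familiar with the many different data type concepts.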
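For illustration, here is a minimal sketch of one way around the name mismatch, assuming every block really has the same 6-column layout. The toy data frame and the names block_width, n_blocks, blocks and stacked are made up for this example and are not part of the original code. Each element of the list is an ordinary data frame, so its columns can be renamed with setNames() before rbind() is called.

```r
# Toy stand-in for the wide table read from the file (hypothetical values):
# 2 rows and 2 side-by-side blocks of 6 columns, named V1..V12 as read.table would.
df <- as.data.frame(matrix(1:24, nrow = 2))

block_width <- 6
n_blocks <- ncol(df) %/% block_width   # complete blocks only

# Cut out each block and give it the same column names V1..V6.
blocks <- lapply(seq_len(n_blocks), function(i) {
  cols <- ((i - 1) * block_width + 1):(i * block_width)
  setNames(df[, cols, drop = FALSE], paste0("V", 1:block_width))
})

# With identical names in every block, rbind can stack them.
stacked <- do.call(rbind, blocks)
rownames(stacked) <- NULL
print(stacked)
```

The rows of the second block end up below the rows of the first block, which is the layout described in the question. With real data the columns may be factors, so the types in corresponding columns should be checked after stacking.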
Edit: DATA
structure(list(V1 = NA, V2 = NA, V3 = NA, V4 = NA, V5 = NA, V6 = NA,
V7 = NA, V8 = NA, V9 = NA, V10 = NA, V11 = NA, V12 = NA,
V13 = structure(1L, .Label = "1,20101E+27", class = "factor"),
V14 = structure(1L, .Label = "05.07.2010 14:50", class = "factor"),
V15 = structure(1L, .Label = "ADMINISTRATOR", class = "factor"),
V16 = 1L, V17 = NA, V18 = NA, V19 = structure(1L, .Label = "1,20101E+27", class = "factor"),
V20 = structure(1L, .Label = "05.07.2010 14:50", class = "factor"),
V21 = structure(1L, .Label = "ADMINISTRATOR", class = "factor"),
V22 = 1L, V23 = NA, V24 = NA, V25 = structure(1L, .Label = "1,20101E+27", class = "factor"),
V26 = structure(1L, .Label = "05.07.2010 14:50", class = "factor"),
V27 = structure(1L, .Label = "ADMINISTRATOR", class = "factor"),
V28 = 1L, V29 = NA, V30 = NA, V31 = structure(1L, .Label = "1,20101E+27", class = "factor"),
V32 = structure(1L, .Label = "05.07.2010 14:50", class = "factor"),
V33 = structure(1L, .Label = "ADMINISTRATOR", class = "factor"),
V34 = 1L, V35 = NA, V36 = NA, V37 = NA, V38 = NA, V39 = NA,
V40 = NA, V41 = NA, V42 = NA, V43 = NA, V44 = NA, V45 = NA,
V46 = NA, V47 = NA, V48 = NA, V49 = NA, V50 = NA, V51 = NA,
V52 = NA, V53 = NA, V54 = NA, V55 = NA, V56 = NA), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11",
"V12", "V13", "V14", "V15", "V16", "V17", "V18", "V19", "V20",
"V21", "V22", "V23", "V24", "V25", "V26", "V27", "V28", "V29",
"V30", "V31", "V32", "V33", "V34", "V35", "V36", "V37", "V38",
"V39", "V40", "V41", "V42", "V43", "V44", "V45", "V46", "V47",
"V48", "V49", "V50", "V51", "V52", "V53", "V54", "V55", "V56"
), row.names = 1L, class = "data.frame")
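One detail worth checking in this sample: it has 56 columns (V1 to V56), which is not a multiple of 6. A small sketch of the arithmetic follows; the variable names are made up for this example, and why the two extra columns exist cannot be told from the post.

```r
n_cols <- 56L        # number of columns in the dput() sample above
block_width <- 6L

n_blocks <- n_cols %/% block_width   # complete 6-column blocks
leftover <- n_cols %% block_width    # columns that belong to no complete block

if (leftover != 0) {
  message(n_blocks, " complete blocks, ", leftover,
          " trailing column(s) would not be picked up by a loop over the blocks")
}
```

A loop that runs over as.integer(ncol(df)/6) blocks therefore never touches the last two columns of this sample.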
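SebM, could you update your post with loadable data? Try posting the result of this: dput(head(df,5)) – 2010-10-05 15:42:41
I guess it is no longer necessary, but I will try to do it tomorrow, just to make the post complete and to get used to the forum system here. Thanks for your help. – Sebastian 2010-10-05 16:33:07
I literally had the exact same problem as you yesterday. Either 'dput()' or generating sample data for the people solving your problem gets you answers faster. :) – 2010-10-05 16:36:25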