You can try merged.stack from my "splitstackshape" package.
Assuming this is your starting data:
mydf <- read.table(
  text = "id,file,topic,proportion,topic,proportion
0,file1.txt,0,0.01
1,file2.txt,0,0.01,1,0.03",
  header = TRUE, sep = ",", fill = TRUE)
mydf
#   id      file topic proportion topic.1 proportion.1
# 1  0 file1.txt     0       0.01      NA           NA
# 2  1 file2.txt     0       0.01       1         0.03
Then you would just need to do:
library(splitstackshape)
merged.stack(mydf, var.stubs = c("topic", "proportion"),
             sep = "var.stubs")[, .time_1 := NULL][]
#    id      file topic proportion
# 1:  0 file1.txt     0       0.01
# 2:  0 file1.txt    NA         NA
# 3:  1 file2.txt     0       0.01
# 4:  1 file2.txt     1       0.03
Wrap the whole thing in na.omit if you don't want rows that contain NA values.
na.omit(
  merged.stack(mydf, var.stubs = c("topic", "proportion"),
               sep = "var.stubs")[, .time_1 := NULL])
#    id      file topic proportion
# 1:  0 file1.txt     0       0.01
# 2:  1 file2.txt     0       0.01
# 3:  1 file2.txt     1       0.03
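If you would rather not add a package, roughly the same reshaping can be sketched in base R with reshape(). This is only an illustration under a few assumptions: the data frame is rebuilt by hand so the block runs on its own, the helper column name "time" is an arbitrary choice, and nothing here reflects how merged.stack works internally.

```r
# Rebuild the example data by hand (same shape as mydf above).
mydf <- data.frame(
  id = c(0, 1),
  file = c("file1.txt", "file2.txt"),
  topic = c(0, 0),
  proportion = c(0.01, 0.01),
  topic.1 = c(NA, 1),
  proportion.1 = c(NA, 0.03)
)

# Stack each (topic, proportion) pair into its own row.
# "time" is a temporary index of which pair a row came from.
long <- reshape(
  mydf,
  direction = "long",
  idvar = c("id", "file"),
  varying = list(c("topic", "topic.1"),
                 c("proportion", "proportion.1")),
  v.names = c("topic", "proportion"),
  timevar = "time"
)

# Keep rows of the same id together, then drop the helper column.
long <- long[order(long$id, long$time), ]
long$time <- NULL

# Remove the rows that only hold padding NAs and reset the row names.
long <- na.omit(long)
rownames(long) <- NULL
long
```

With more than two topic/proportion pairs, the two vectors passed in varying would need to list every column of each kind in matching order.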
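Your question is unclear. Are you reading the data into R? Also, I believe you mean a consistent number of columns, not rows.
Yes, I am reading a file and getting a data frame with a varying number of columns, and I want to normalize the data so that each record is split into rows with a fixed number of columns.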