Here is an approach using the reshape2 package.
machine1.workingTime <- 1:10
machine2.workingTime <- 21:30
machine1.producedItems <- 101:110
machine2.producedItems <- 201:210
date <- c("2017-01-01","2017-01-02","2017-01-03","2017-01-04","2017-01-05","2017-01-06",
"2017-01-07","2017-01-08","2017-01-09","2017-01-10")
theData <- data.frame(date,
machine1.producedItems,
machine1.workingTime,
machine2.producedItems,
machine2.workingTime
)
library(reshape2)
meltedData <- melt(theData,measure.vars=2:5)
meltedData$variable <- as.character(meltedData$variable)
# now, extract machine numbers and variable names
variableNames <- strsplit(as.character(meltedData$variable),"[.]")
# token after the . is variable name
meltedData$columnName <- unlist(lapply(variableNames,function(x) x[2]))
# since all variables start with word 'machine' we can set chars 8+ as ID
meltedData$machineId <- as.numeric(unlist(lapply(variableNames,function(x) substr(x[1],8,nchar(x[1])))))
theResult <- dcast(meltedData,machineId + date ~ columnName,value.var="value")
head(theResult)
The result is:
> head(theResult)
  machineId       date producedItems workingTime
1         1 2017-01-01           101           1
2         1 2017-01-02           102           2
3         1 2017-01-03           103           3
4         1 2017-01-04           104           4
5         1 2017-01-05           105           5
6         1 2017-01-06           106           6
>
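The name-parsing step can also be checked on its own, outside the melt/dcast pipeline. The following is a minimal sketch, assuming every column name follows the `machine<N>.<variable>` pattern used in the sample data; `columnNames` is a hypothetical input vector, and `vapply()` with `sub()` is swapped in here for the `lapply()`/`substr()` combination above.

```r
# Hypothetical column names in the "machine<N>.<variable>" pattern (assumption)
columnNames <- c("machine1.producedItems", "machine2.workingTime")

# Split each name at the literal dot
parts <- strsplit(columnNames, "[.]")

# Token after the dot is the variable name
variableName <- vapply(parts, function(x) x[2], character(1))

# Token before the dot is "machine<N>"; strip the prefix to get the numeric ID
machinePart <- vapply(parts, function(x) x[1], character(1))
machineId <- as.numeric(sub("^machine", "", machinePart))

print(data.frame(columnNames, machineId, variableName))
```

Stripping the prefix with a regular expression avoids hard-coding the character position 8, which only holds as long as the prefix is exactly the word 'machine'.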
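UPDATE (02Dec2017): In response to the comments, if there is no other identifier that uniquely distinguishes the multiple rows for a machine, one can use an aggregation function to produce a single observation per machine.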
theResult <- dcast(meltedData,machineId ~ columnName,
fun.aggregate=mean,value.var="value")
head(theResult)
The result is as follows.
> head(theResult)
  machineId producedItems workingTime
1         1         105.5         5.5
2         2         205.5        25.5
>
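As a quick sanity check on the aggregated output, the per-machine means can be recomputed directly in base R. This is only a sketch using the same made-up vectors as the sample data; it does not depend on reshape2.

```r
# Same made-up values as in the sample data above
machine1.workingTime <- 1:10
machine2.workingTime <- 21:30
machine1.producedItems <- 101:110
machine2.producedItems <- 201:210

# One mean per machine and variable, mirroring fun.aggregate=mean
checkMeans <- data.frame(
  machineId = c(1, 2),
  producedItems = c(mean(machine1.producedItems), mean(machine2.producedItems)),
  workingTime = c(mean(machine1.workingTime), mean(machine2.workingTime))
)
print(checkMeans)
```

The ten rows for each machine collapse into a single row, which is why the aggregated result has only two observations.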
UPDATE (02Dec2017): In response to the comments, a solution that uses a unique sequence number to distinguish the rows of data looks like this.
machine1.workingTime <- 1:10
machine2.workingTime <- 21:30
machine1.producedItems <- 101:110
machine2.producedItems <- 201:210
id <- 1:length(machine1.workingTime)
theData <- data.frame(id,
machine1.producedItems,
machine1.workingTime,
machine2.producedItems,
machine2.workingTime
)
meltedData <- melt(theData,measure.vars=2:5)
head(meltedData)
meltedData$variable <- as.character(meltedData$variable)
# now, extract machine numbers and variable names
variableNames <- strsplit(as.character(meltedData$variable),"[.]")
meltedData$columnName <- unlist(lapply(variableNames,function(x) x[2]))
meltedData$machineId <- as.numeric(unlist(lapply(variableNames,function(x) substr(x[1],8,nchar(x[1])))))
theResult <- dcast(meltedData,machineId + id ~ columnName,value.var="value")
head(theResult)
...and the output.
head(theResult)
  machineId id producedItems workingTime
1         1  1           101           1
2         1  2           102           2
3         1  3           103           3
4         1  4           104           4
5         1  5           105           5
6         1  6           106           6
>
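Please provide a code example that includes your data frame (or made-up data similar to it), and show how far you have gotten and where you are stuck. –
It is not clear whether these are column names or values in a column. What is 'MachineNum'? – akrun
I think the keywords you are looking for are long-format versus wide-format data and how to convert between them. If you provide sample data, you may get better answers. – snoram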