2016-07-11 403 views

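Calculating and writing column means and standard deviations in R

I am currently working with data in a *.csv file. I already have a working script that plots my data, but I am stuck on what seems like the simplest task. I am trying to write a script that takes my data (arranged in columns), calculates the mean of each column, and writes the result to a new file (./testAVG). In addition, I want to take the same data, calculate the SD (per column), and append it to the end of the original file (preferably repeated once for each row of data I have).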

Here is the script I have so far:

#Number of lines with data 
Nlines = 5 
#Number of lines to skip 
Nskip = 0 

chem <- read.table("./test.csv", skip=Nskip, sep=",", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total", "eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O"), fill=TRUE, header = TRUE, nrow=Nlines) 

sd1 <- sd(chem$SiO2) 
sd2 <- sd(chem$Al2O3) 
sd3 <- sd(chem$FeO) 
sd4 <- sd(chem$MgO) 
sd5 <- sd(chem$CaO) 
sd6 <- sd(chem$Na2O) 
sd7 <- sd(chem$K2O) 

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1) 
avg2 <- colMeans(chem$Al2O3, na.rm = FALSE, dims=1) 
avg3 <- colMeans(chem$FeO, na.rm = FALSE, dims=1) 
avg4 <- colMeans(chem$MgO, na.rm = FALSE, dims=1) 
avg5 <- colMeans(chem$CaO, na.rm = FALSE, dims=1) 
avg6 <- colMeans(chem$Na2O, na.rm = FALSE, dims=1) 
avg7 <- colMeans(chem$K2O, na.rm = FALSE, dims=1) 

write <- write.table(sd1,sd2,sd3,sd4,sd5,sd6,sd7, file="./test.csv", append=TRUE, sep=",", dec=".", col.names = c("eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O")) 

write <- write.table(avg1, avg2, avg3, avg4, avg5, avg6, avg7, file="./testAVG.csv", append=FALSE, sep=",", dec=".", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total")) 

The data I am working with looks like this:

Sample, SiO2, Al2O3, FeO, MgO, CaO, Na2O, K2O, Total,eSiO2,eAl2O3,eFeO,eMgO,eCaO,eNa2O,eK2O 
01,65.01,14.77,0.34,1.31,17.27,1.14,0.2,100,,,,,,, 
02,72.6,16.27,0.53,0.06,1.27,5.55,3.71,100,,,,,,, 
03,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,, 
04,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,, 

I get this error:

Error in colMeans(chem$SiO2, na.rm = FALSE, dims = 1) : 
    'x' must be an array of at least two dimensions 

Any suggestions? Thanks.

+0

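You can't call 'write.table' like this: 'write.table(sd1, sd2, sd3, sd4, sd5, sd6, sd7, ...)'; it takes a single object ('x') to write. Also, you are using 'colMeans' on a vector ('colMeans(chem$SiO2, ...)'), but it expects an array. You really should read the documentation ('?write.table', '?colMeans'); it exists for a reason. – nrussell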

+1

Use the 'apply' / 'lapply' / 'sapply' family of functions. – user2100721

+0

'colMeans' expects a matrix or a 'data.frame', not a vector. Run 'colMeans(chem)' to get the means of all columns. –

Answers

1

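The comments have already hinted at how to do this, but it seems you are fairly new to R, so let me show you explicitly how to do it better, using the mtcars dataset: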

df <- mtcars 

df_sd <- apply(df, 2, sd) # this is how to use apply. See ?apply 
df_avg <- colMeans(df) # this is how to use colMeans. See ?colMeans 

write.table(df_sd, file="test.csv")  # no assignment necessary. 
write.table(df_avg, file="testAVG.csv") # writing the file is a desired side effect... 
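
If both statistics are wanted in a single file, one option is to combine them into one data frame before writing. The following is only a sketch: the object name stats, the file name mtcars_stats.csv, and the use of a temporary directory are assumptions made for illustration, not part of the original answer.

```r
df <- mtcars

# Per-column statistics; both results are named numeric vectors
df_avg <- colMeans(df)
df_sd  <- sapply(df, sd)

# Combine into one data frame with one row per original column
stats <- data.frame(column = names(df_avg), mean = df_avg, sd = df_sd,
                    row.names = NULL)

# Write to a temporary file (hypothetical path, for illustration only)
out_file <- file.path(tempdir(), "mtcars_stats.csv")
write.csv(stats, out_file, row.names = FALSE)

# Read the file back to check the round trip
check <- read.csv(out_file)
print(head(check, 3))
```

Writing a single object keeps the call in line with what write.table expects, namely one 'x' argument, and write.csv is a thin wrapper around it with comma-separated defaults.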

Also, consider the following line:

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1) 

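The cool thing about colMeans is that it calculates the column means for many columns at once. Here, you are supplying only a single vector, namely chem$SiO2. If that is really what you want to do, you can just write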

avg1 <- mean(chem$SiO2) 
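
Applied to the data from the question, the same idea might look like the sketch below. Several things here are assumptions: the sample file is recreated in a temporary directory, the empty e-columns are left out of the input for simplicity, the helper vector oxides is introduced for illustration, and the original file is rewritten in full rather than appended to, because append=TRUE would add rows instead of filling the e-columns.

```r
# Recreate the sample data from the question in a temporary file
# (simplified: the empty e-columns are omitted from the input)
csv_path <- file.path(tempdir(), "test.csv")
writeLines(c(
  "Sample,SiO2,Al2O3,FeO,MgO,CaO,Na2O,K2O,Total",
  "01,65.01,14.77,0.34,1.31,17.27,1.14,0.2,100",
  "02,72.6,16.27,0.53,0.06,1.27,5.55,3.71,100",
  "03,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100",
  "04,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100"
), csv_path)

# Keep the sample IDs as text so that "01" does not become 1
chem <- read.csv(csv_path, colClasses = c(Sample = "character"))

# Hypothetical helper: the columns to summarise
oxides <- c("SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O")

# Column means as a one-row data frame, written to a new file
avg <- as.data.frame(t(colMeans(chem[oxides])))
avg_path <- file.path(tempdir(), "testAVG.csv")
write.csv(avg, avg_path, row.names = FALSE)

# Column standard deviations, repeated on every data row as e<oxide> columns
sds <- sapply(chem[oxides], sd)
for (ox in oxides) {
  chem[[paste0("e", ox)]] <- sds[[ox]]
}

# Rewrite the whole file, now including the filled e-columns
write.csv(chem, csv_path, row.names = FALSE)
```

Selecting the numeric columns by name before calling colMeans avoids averaging the Sample and Total columns, which carry no meaningful mean here.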