可變長度中的R（具有lme4線性建模）

我的輸入文件不同：可變長度中的R（具有lme4線性建模）

 
Treat1 Treat2 Batch gene1 gene2 
High Low  1  92.73 4.00 
Low  Low  1  101.85 6.00 
High High 1  136.00 4.00 
Low  High 1  104.00 3.00 
High Low  2  308.32 10.00 
Low  Low  2  118.93 3.00 
High High 2  144.47 3.00 
Low  High 2  189.66 4.00 
High Low  3  95.12 2.00 
Low  Low  3  72.08 6.00 
High High 3  108.65 2.00 
Low  High 3  75.00 3.00 
High Low  4  111.39 5.00 
Low  Low  4  119.80 4.00 
High High 4  466.55 11.00 
Low  High 4  125.00 3.00

有數以萬計的附加列的，每一個報頭和號碼的列表，相同的長度，「基因1」柱。

我的代碼：

library(lme4) 
library(lmerTest) 

# Import the data. 
mydata <- read.table("input_file", header=TRUE, sep="\t") 

# Make batch into a factor 
mydata$Batch <- as.factor(mydata$Batch) 

# Check structure 
str(mydata) 

# Get file without the factors, so that names(df) gives gene names. 
genefile <- mydata[c(4:2524)] 

# Loop through all gene names and run the model once per gene and print to file. 
for (i in names(genefile)){ 
    lmer_results <- lmer(i ~ Treat1*Treat2 + (1|Batch), data=mydata) 
    lmer_summary <- summary(lmer_results) 
    write(lmer_summary,file="results_file",append=TRUE, sep="\t", quote=FALSE) 
}

結構：

'data.frame':  16 obs. of 2524 variables: 
$ Treat1   : Factor w/ 2 levels "High","Low": 1 2 1 2 1 2 1 2 1 2 ... 
$ Treat2   : Factor w/ 2 levels "High","Low": 2 2 1 1 2 2 1 1 2 2 ... 
$ Batch   : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ... 
$ gene1   : num 92.7 101.8 136 104 308.3 ... 
$ gene2   : num 4 6 4 3 10 3 3 4 2 6 ...

我的錯誤消息：

錯誤model.frame.default（數據= MYDATA，降.unused.levels = TRUE，formula = i〜：個可變長度不同（發現 'Treat1'）電話：... 11聚物 - >評估 - >評估 - > - > model.frame.default 執行暫停

我曾試圖檢查所涉及的所有對象並且看不到可變長度的任何差異，並且我也確保沒有缺失數據。使用na.exclude運行它不會改變任何內容。

有什麼想法發生了什麼？

來源

2016-05-26 EKarl

您的起始位置是添加打印語句以確定導致問題的列。由於我們沒有您的數據，因此診斷問題可能相當困難。 – lmo

您的數據中沒有「i」列。因此，'lmer'看起來在數據之外並且發現字符向量'i'。嘗試'lmer_results < - lmer（as.formula（paste0（i，'〜Treat1 * Treat2 +（1 | Batch）「）），data = mydata）' – Roland

@羅蘭的診斷（lmer正在尋找一個叫做我變量，而不是一個變量，其名是i：強制性Lewis Carroll reference）是正確的，我想。處理這種情況最直接的辦法是用reformulate()，是這樣的：

for (i in names(genefile)){ 
    form <- reformulate(c("Treat1*Treat2","(1|Batch)"),response=i) 
    lmer_results <- lmer(form, data=mydata) 
    lmer_summary <- summary(lmer_results) 
    write(lmer_summary,file="results_file", 
      append=TRUE, sep="\t", quote=FALSE) 
}

關於第二個想法，你應該能夠使用顯著加速你的計算內置refit()方法，其改裝的典範一個新的響應變量：假設爲簡單起見，第一基因被稱爲geneAAA：

wfun <- function(x) write(summary(x), 
     file="results_file", append=TRUE, sep="\t",quote=FALSE) 
mod0 <- lmer(geneAAA ~ Treat1*Treat2 + (1|Batch), data=mydata) 
wfun(mod0) 
for (i in names(genefile)[-1]) { 
    mod1 <- refit(mod0,mydata[[i]]) 
    wfun(mod1) 
}

（順便說一句，我不知道你的write()命令做任何事情懂事......）

來源

2016-05-26 15:11:04

Roland的建議奏效。我將write（）更改爲capture.output（）。 – EKarl

可變長度中的R（具有lme4線性建模）

回答

相關問題