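Design matrix for an MLM from library(lme4) with fixed and random effects
Background
I have a random slope and intercept model. The random effects have many levels, and new data (to be predicted) may or may not contain all of those levels.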
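To make this more concrete, I am working with music revenue at the album (title) level. Each album may come in multiple formats, format2 (CD, vinyl, e-audio, etc.), and I have a revenue measurement for each album in each format. The model is specified as: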
lmer(physical~ format2+ (0+format2|title))
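The problem is that future data might not contain all levels of either title or format2. For random intercepts this is easily resolved with predict(..., allow.new.levels = TRUE), but it is problematic for the fixed effects and random slopes. I am therefore trying to write a function that makes flexible predictions for merMod objects, similar to lme4::predict.merMod, but that handles differences between the training data and the prediction data. This question is born of ignorance of the full details of lme4::predict.merMod.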
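For reference, the random-intercept case mentioned above can be sketched as follows. This is a minimal illustration on lme4's built-in sleepstudy data rather than the album data; the level "new_subject" is made up for the example. With allow.new.levels = TRUE, an unseen grouping level should fall back to the population-level (fixed-effects-only) prediction.

```r
library(lme4)

# Random-intercept model on lme4's built-in sleepstudy data
fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# New data: one subject seen in training ("308") and one unseen,
# hypothetical subject ("new_subject")
newdat <- data.frame(
  Days    = c(3, 3),
  Subject = factor(c("308", "new_subject"))
)

# Conditional predictions; the unseen level gets a random effect of zero
p_cond <- predict(fm, newdata = newdat, allow.new.levels = TRUE)

# Population-level predictions (random effects switched off) for comparison
p_pop <- predict(fm, newdata = newdat, re.form = NA)

# Compare the two predictions for the unseen subject
all.equal(unname(p_cond[2]), unname(p_pop[2]))
```

Without allow.new.levels = TRUE, the same call stops with an error about new levels in newdata.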
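The problem
The crux of the issue is getting the correct model.matrix(), with both fixed and random effects, to calculate both predictions and SEs. The S3 method for class merMod returns only the fixed effects.
The documentation for the base stats::model.matrix() function is very limited. Unfortunately, I do not own either Statistical Models in S or Software for Data Analysis, which appear to cover the details behind these functions.
model.matrix() is supposed to take a model formula and a new data frame and produce a design matrix, but I am getting an error. Any help you can provide would be greatly appreciated.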
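For a model that has already been fitted, both design matrices can be pulled out with getME(). The sketch below again uses sleepstudy as a stand-in for the album data, and it only covers the training data, not new data. It checks that the fitted values of a linear mixed model are the fixed-effects part plus the random-effects part.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

X    <- getME(fm, "X")  # fixed-effects design matrix (what model.matrix(fm) returns)
Z    <- getME(fm, "Z")  # random-effects design matrix (sparse)
beta <- fixef(fm)       # fixed-effects estimates
b    <- getME(fm, "b")  # conditional modes, ordered to match the columns of Z

# Fitted values of a linear mixed model: X %*% beta + Z %*% b
fit_manual <- as.vector(X %*% beta) + as.vector(Z %*% b)
all.equal(fit_manual, unname(fitted(fm)))

dim(X)  # one column per fixed-effects coefficient
dim(Z)  # one column per (term, grouping level) combination
```

The same decomposition is what a hand-written prediction function would need to reproduce for new data, with the columns of Z restricted to the levels that were present when the model was fitted.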
Example data
dat1 <- structure(list(dt_scale = c(16, 16, 16, 16, 16, 16, 16, 16, 16,
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16), title = c("Bahia",
"Jazz Moods: Brazilian Romance", "Quintessence", "Amadeus: The Complete Soundtrack Recording (Bicentennial Edition)",
"Live In Europe", "We'll Play The Blues For You", "The Complete Village Vanguard Recordings, 1961",
"The Isaac Hayes Movement", "Jazz Moods: Jazz At Week's End",
"Blue In Green: The Concert In Canada", "The English Patient - Original Motion Picture Soundtrack",
"The Unique Thelonious Monk", "Since We Met", "You're Gonna Hear From Me",
"The Colors Of Latin Jazz: Cubop!", "The Colors Of Latin Jazz: Samba!",
"Homecoming", "Consecration: The Final Recordings Part 2 - Live At Keystone Korner, September 1980", "More Creedence Gold", "The Stardust Session"), format2 = c("CD", "CD",
"CD", "CD", "CD", "CD", "CD", "SuperAudio", "SuperAudio", "CD", "E Audio", "CD",
"Vinyl", "CD", "E Audio", "CD", "CD", "CD", "CD", "CD"), mf_day = c(TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), xmas = c(FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE), vday = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE), yr_since_rel = c(16.9050969937038,
8.41815617876864, 9.2991404674865, 25.0870296783559, 39.1267038232812,
27.9156764326061, 9.11596751812513, 23.3052837112449, 14.3123922258974,
30.5208152866414, 5.83025071417496, 21.3090003877291, 7.75022155568392,
11.3601605287827, 0.849006673421519, 31.9918631305662, 13.8861905547041,
12.8342695062012, 29.6916671402534, 13.5912612705038), physical = c(1327.17849171096,
-110.2265302258, -795.37376268564, 355.06192702004, -1357.3492884345,
-1254.93442612023, -816.713683621225, 881.201935773452, -3092.02845691036,
-2268.6304275652, 907.347941142021, -699.130275178185, 377.867849132077,
-1047.50531157311, 1460.25978951805, 1376.84579069304, 3619.03629114089,
962.888173535704, 2514.77880599199, 2539.14958588771)), .Names = c("dt_scale",
"title", "format2", "mf_day", "xmas", "vday", "yr_since_rel",
"physical"), row.names = c(1L, 2L, 5L, 6L, 7L, 8L, 9L, 11L, 12L,
13L, 14L, 15L, 20L, 22L, 23L, 25L, 27L, 32L, 35L, 36L), class = "data.frame")
Formula:
f1 <- as.formula(~1 + dt_scale + yr_since_rel + format2 + (0 + format2 + mf_day +
xmas + vday | title))
Execution / error
library(lme4)
model.matrix(f1, data= dat1)
Error in 0 + format2 : non-numeric argument to binary operator
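The error appears to come from stats::model.matrix() knowing nothing about lme4's bar syntax: it treats | as an ordinary operator and tries to evaluate 0 + format2 on a character column. One way around it, sketched here on a small made-up data frame (toy is hypothetical, not the album data), is to split the formula with lme4's nobars() and findbars() and hand only the fixed-effects part to model.matrix().

```r
library(lme4)

# Toy data, made up for illustration only
toy <- data.frame(
  x       = c(1, 2, 3, 4, 5, 6),
  format2 = c("CD", "Vinyl", "CD", "Vinyl", "CD", "Vinyl"),
  title   = c("a", "a", "b", "b", "c", "c")
)

f <- ~ 1 + x + format2 + (0 + format2 | title)

f_fixed  <- nobars(f)    # formula with the random-effects terms removed
re_terms <- findbars(f)  # list of the random-effects terms, one per bar

# The fixed-effects design matrix builds without the error above
X <- model.matrix(f_fixed, data = toy)
colnames(X)
```

This yields only the fixed-effects columns; the random-effects design matrix still has to be built separately from re_terms.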
Note: I also tried this with the Orthodont data; however, I get a different error.
library(lme4)
data("Orthodont",package="MEMSS")
fm1 <- lmer(formula = distance ~ age*Sex + (1+age|Subject), data = Orthodont)
newdat <- expand.grid(
age=c(8,10,12,14)
, Sex=c("Male","Female")
, distance = 0
, Subject= c("F01", "F02")
)
f1 <- formula(fm1)[-2] # simpler code via Ben Bolker below
mm <- model.matrix(f1, newdat) # attempt to use model.matrix
Warning message
In Ops.factor(1 + age, Subject) : | not meaningful for factors
# use lme4:::mkNewReTrms as suggested in comments
mm <- lme4:::mkNewReTrms(f1, newdat)
Error in lme4:::mkNewReTrms(f1, newdat) : object 'ReTrms' not found
In addition: Warning message:
In Ops.factor(1 + age, Subject) : | not meaningful for factors
# check if different syntax would fix this
mm <- lme4::mkNewReTrms(f1, newdat)
Error: 'mkNewReTrms' is not an exported object from 'namespace:lme4'
mm <- mkNewReTrms(f1, newdat)
Error: could not find function "mkNewReTrms"
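One likely reason the calls above fail is that mkNewReTrms() expects a fitted model as its first argument, not a formula, and it is only reachable with ::: because it is not exported. The sketch below is an assumption about that internal function rather than documented API: the argument names and the Zt and b components of the return value are taken from my reading of the lme4 source and may differ between versions. It uses sleepstudy and only grouping levels that were present when the model was fitted.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# New data restricted to subjects that were in the training data
newdat <- expand.grid(Days = c(0, 5), Subject = c("308", "309"))

# ASSUMPTION: unexported internal; the signature and return value are not
# guaranteed and may change between lme4 versions
re_new <- lme4:::mkNewReTrms(fm, newdat, re.form = ~ (Days | Subject))

# Random-effects contribution for each row of newdat
re_part <- as.vector(re_new$b %*% re_new$Zt)

# Fixed-effects contribution, taken from predict() with random effects off
fixed_part <- unname(predict(fm, newdata = newdat, re.form = NA))

# The two parts together should reproduce the conditional prediction
all.equal(fixed_part + re_part, unname(predict(fm, newdata = newdat)))
```

Because this relies on an internal, calling predict() directly is the safer route whenever it covers the case at hand.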
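I have a few questions/comments. (1) The sample data you included has only a single value of 'format2', so the specified model doesn't work (presumably your real data has more). (2) Your example doesn't continue reproducibly; what are 'b1' and 'a'? (3) Your 'f1' formula looks suspicious: it lets all of those effects vary at the 'title' level, and do you have enough data to estimate the (correlated) among-title variability in all of them? (4) *Missing* levels of the fixed or random effects in the new data used for prediction are not a problem; if you want to construct a new random-effects model matrix you can use 'lme4:::mkNewReTrms(object, newdata, re.form)', where 'object' is the fitted object. It is *extra* levels that are a hassle. – 2014-08-27 23:27:59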