2014-02-08 162 views
0

我正在嘗試使用我的數據集構建一組GAM模型,但有一些模型必須遵循的特定規則。因此,我試圖使用pdredge和子集來限制爲結果生成的模型。下面是我的代碼有:GAM疏浚子集

# Dredge for GAMs 
# using Ground_10 as dependent variable, R_Data_v9 
# using only observations with no inundation (i.e., xarea=0) 

# To run on Rachel's laptop: 
# source("C:/Users/Davidson/Documents/Dana Anderson/Japan ignition/R scripts/gam_dredge_1.R") 
# or to run on cluster 
# source("gam_dredge_1.R") 

# LOAD LIBRARIES 

library(mgcv) 
library(MuMIn) 
library(MASS) 
library(parallel) 

# READ IN DATA 

# To read it from Dana's computer: 
data<-read.table("C:/.../R_Data_v9_ground_10.txt", header=TRUE) 
#note: deleted actual location of data for security reasons. 



# Select only those observations with no inundation 
data_gm<-subset(data,xarea==0) 
print("data loaded") 
# Assign variable names 
ground_10<-data_gm$ground_10 
xpsa03<-data_gm$xpsa03 
xpgv<-data_gm$xpgv 
xpga<-data_gm$xpga 
xii<-data_gm$xii 
xpsa10<-data_gm$xpsa10 
xpsa30<-data_gm$xpsa30 
xres<-data_gm$xres 
xcom<-data_gm$xcom 
xindus<-data_gm$xindus 
xratio<-data_gm$xratio 
xdam1<-data_gm$xdam1 
xdam2<-data_gm$xdam2 
xdam3<-data_gm$xdam3 
xpdam1<-data_gm$xpdam1 
xpdam2<-data_gm$xpdam2 
xpdam3<-data_gm$xpdam3 
xdam123<-data_gm$xdam123 
xpdam123<-data_gm$xpdam123 
xpop<-data_gm$xpop 
xestab<-data_gm$xestab 
xwood<-data_gm$xwood 

# SET UP CLUSTER 

# Detect number of cores on computer 
detectCores() 

# Determine cluster type (mine is a PSOCK) 
clusterType <- if(length(find.package("snow", quiet = TRUE))) "SOCK" else "PSOCK" 

# Set up a cluster with number of cores specified as result of detectCores() 
# and call it "clust" 
# For laptop with 4 cores 
clust <- makeCluster(getOption("cl.cores", 4), type = clusterType) 


# Load required packages onto worker nodes 
# (in this example, load packages {MASS} and {MuMIn} to be used by pdredg) 
clusterEvalQ(clust,library(mgcv)) 
clusterEvalQ(clust,library(MuMIn)) 

#GAM RUNS: 

gam1<-gam(ground_10~s(xii,k=2)+s(xpga,k=2)+s(xpgv,k=2)+s(xpsa03,k=2)+s(xpsa10,k=2)+s(xpsa30,k=2)+ 
s(xpop,k=2)+s(xres,k=2)+s(xestab,k=2)+s(xcom,k=2)+ 
s(xindus,k=2)+ 
s(xpdam1,k=2)+s(xpdam2,k=2)+s(xpdam3,k=2)+s(xpdam123,k=2)+ 
s(xwood,k=2)+ 
s(xdam1,k=2)+s(xdam2,k=2)+s(xdam123,k=2)) 


# Export data and any objects the modeling function will use 
# into the cluster worker nodes 
clusterExport(clust,c("data_gm","gam1","ground_10", "xii", "xpga", "xpgv", "xpsa03", "xpsa10", "xpsa30", "xpop", "xres", "xestab", "xcom", "xindus","xpdam1", "xpdam2", "xpdam3", "xpdam123", "xwood", "xdam1", "xdam2", "xdam3", "xdam123")) 

# Run pdredge using subsetting so as to allow no more than 1 ground motion covariate at a time 

pdd.gam1<-pdredge(gam1, cluster=clust,  
     subset=(!('s(xii, k=2)'&'s(xpga, k=2)') & !'(s(xii, k=2)' &'s(xpgv, k=2)') & !('s(xii,k=2)'&'s(xpsa03,k=2)') & !('s(xii, k=2)'&'s(xpsa10,k=2)') & !('s(xii, k=2)'&'s(xpsa30,k=2)') & !('s(xpga, k=2)'&'s(xpgv, k=2)')  
       & !('s(xpga, k=2)'&'s(xpsa03,k=2)') & !('s(xpga,k=2)' &'s(xpsa10,k=2)') & !('s(xpga,k=2)'&'s(xpsa30,k=2)') & !('s(xpgv,k=2)'&'s(xpsa03,k=2)') & !('s(xpgv,k=2)'&'s(xpsa10,k=2)') & !('s(xpgv,k=2)'&'s(xpsa30,k=2)')  
       & !('s(xpsa03,k=2)'&'s(xpsa10,k=2)') & !('s(xpsa03,k=2)'&'s(xpsa30,k=2)') & !('s(xpsa10,k=2)'&'s(xpsa30,k=2)')   
       & !('s(xpop,k=2)'&'s(xres,k=2)') & !('s(xpop,k=2)'&'s(xestab,k=2)') & !('s(xpop,k=2)'&'s(xcom,k=2)') & !('s(xres,k=2)'&'s(xestab,k=2)') & !('s(xres,k=2)'&'s(xcom,k=2)') & !('s(xestab,k=2)'&'s(xcom,k=2)')  
       & !('s(xpdam1,k=2)'&'s(xpdam123,k=2)' & !('s(xpdam2,k=2)'&'s(xpdam123,k=2)') & !('s(xpdam3,k=2)'&'s(xpdam123,k=2)')  
      & !('s(xdam1,k=2)'&'s(xdam2,k=2)') & !('s(xdam1,k=2)'&'s(xdam3,k=2)') & !('s(xdam1,k=2)'&'s(xdam123,k=2)') & !('s(xdam2,k=2)'&'s(xdam3,k=2)') & !('s(xdam2,k=2)'& 's(xdam123,k=2)') & !('s(xdam3,k=2)'&'s(xdam123,k=2)')),rank=function(x) summary(x)$sp.criterion, extra=c(GCV=function(x) summary(x)$sp.criterion, "AIC")) 

然而,每次我運行此,它自帶了以下錯誤:

Error in pdredge(gam1, cluster = clust, subset = (!(s(xii,k=2) & s(xpga,k=2)) & : unrecognized names in 'subset' expression: "s(xii,k=2)", "s(xpga,k=2)", "s(xpgv,k=2)", "s(xpsa03,k=2)", "s(xpsa10,k=2)", "s(xpsa30,k=2)", "s(xpop,k=2)", "s(xres,k=2)", "s(xestab,k=2)", "s(xcom,k=2)", "s(xpdam1,k=2)", "s(xpdam123,k=2)", "s(xpdam2,k=2)", "s(xpdam3,k=2)", "s(xdam1,k=2)", "s(xdam2,k=2)", "s(xdam3,k=2)" and "s(xdam123,k=2)"

請幫我揣摩了這一點!我已經嘗試了我可能想到的一切! 謝謝,達納

回答

1

我遇到了類似的問題。我假設你正在按照幫助在dredge()中進行子集化,即:「複合模型術語(如I()內的'as-is'表達式或gam中的平滑表達式)應視爲非語法名稱,蜱,如

subset = ‘s(x, k = 2)‘ || ‘I(log(x))‘ 

我發現我只能得到這個使用工作`(反單引號!),而不是在幫助例子,即,「(這在我看來是一個單一顯示的字符引號)。

另外,疏通()似乎需要你,它內部

即 使用在我的情況下的間距完全匹配,我莫德爾是

M1<-gam(PLNK ~ s(CX, k=5) + s(CHL, k=5) + s(HC, k=5) + s(sqHUM, k=5) 
+ s(sqHDIST, k=5) + s(SSTL, k=5) + s(WV, k=5) + AT, data=z, 
family=Gamma (link=log)) 

我想排除與這兩個WV和SSTL

subset=!(`s(CHL, k = 5)` & `s(WV, k=5)`) 

沒有工作模式,但

subset=!(`s(CHL, k = 5)` & `s(WV, k = 5)`) 

一樣。

我收集疏浚所使用的內部格式,方法是在沒有子集參數的情況下運行挖泥船,然後查看生成的數據框中的模型。

1

您可以在全局模型上使用'getAllTerms'函數以正確的形式列出所有項。