2012-06-20

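Cross-tabulation of survey data (weighted and unweighted)

I have survey data that I am working with, and I need to produce some tables and run a regression analysis on it. After attaching the data, this is the code I use to tabulate four variables: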

ftable(var1, var2, var3, var4)

This is the regression code I use on the data:

logit.1 <- glm(var4 ~ var3 + var2 + var1, family = binomial(link = "logit"))
summary(logit.1)

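So far this works well for the unweighted analysis. But how can I run the same analyses on weighted data? Here is some additional information: there are four variables in the dataset that reflect the sampling structure. These are

strata: stratum (urban, or (sub-county) rural).

clust: batch of interviews that were part of the same random walk

vill_neigh_code: village or neighbourhood code

sweight: weight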

Answer

library(survey) 

data(api) 

# example data set 
head(apiclus2) 

# instead of var1 - var4, use these four variables: 
ftable(apiclus2[ , c('sch.wide' , 'comp.imp' , 'both' , 'awards') ]) 

# move it over to x for faster typing 
x <- apiclus2 


# also give x a column of all ones 
x$one <- 1 

# run the glm() function specified. 
logit.1 <- 
    glm( 
     comp.imp ~ target + cnum + growth , 
     data = x , 
     family = binomial(link = 'logit') 
    ) 

summary(logit.1) 

# now create the survey object you've described 
dclus <- 
    svydesign(
     id = ~dnum + snum , # cluster variable(s) 
     strata = ~stype , # stratum variable 
     weights = ~pw ,  # weight variable 
     data = x , 
     nest = TRUE 
    ) 

# weighted counts 
svyby( 
    ~one , 
    ~ sch.wide + comp.imp + both + awards , 
    dclus , 
    svytotal 
) 


# weighted counts formatted differently 
ftable(
    svyby( 
     ~one , 
     ~ sch.wide + comp.imp + both + awards , 
     dclus , 
     svytotal , 
     keep.var = FALSE 
    ) 
) 


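# run the svyglm() function specified.
# (family = quasibinomial() gives the same point estimates and avoids the
# "non-integer #successes" warning that binomial() can raise with weights)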
logit.2 <- 
    svyglm( 
     comp.imp ~ target + cnum + growth , 
     design = dclus , 
     family = binomial(link = 'logit') 
    ) 

summary(logit.2)
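
Mapped back onto the variable names in the question, the same steps might look roughly like the sketch below. It rests on several assumptions: `mydata` is a made-up toy data frame standing in for the real survey file, `clust` is treated as the primary sampling unit, and `vill_neigh_code` is left out because the question does not say at which sampling stage it belongs. It also uses `svytable()` as an alternative to the `svyby()` call above for the weighted cross-tabulation.

```r
library(survey)

# Hypothetical toy data: only the column names (strata, clust, sweight,
# var1..var4) come from the question; the values are simulated.
set.seed(1)
n <- 400
mydata <- data.frame(
    strata  = rep(c("urban", "rural"), each = n / 2),
    clust   = rep(1:40, each = 10),      # 20 clusters per stratum
    sweight = runif(n, 0.5, 2),
    var1    = rbinom(n, 1, 0.5),
    var2    = rbinom(n, 1, 0.4),
    var3    = rbinom(n, 1, 0.6),
    var4    = rbinom(n, 1, 0.3)
)

# survey design object; assumes clust is the primary sampling unit
des <- svydesign(
    id      = ~clust,
    strata  = ~strata,
    weights = ~sweight,
    data    = mydata,
    nest    = TRUE
)

# weighted four-way cross-tabulation, flattened like the unweighted ftable()
weighted.tab <- svytable(~var1 + var2 + var3 + var4, design = des)
ftable(weighted.tab)

# weighted logistic regression; quasibinomial() is used here to avoid the
# non-integer successes warning that binomial() can produce with weights
logit.w <- svyglm(
    var4 ~ var3 + var2 + var1,
    design = des,
    family = quasibinomial()
)
summary(logit.w)
```

The cell entries of `weighted.tab` are sums of `sweight`, so they estimate population counts rather than sample counts. If `vill_neigh_code` turns out to be a further sampling stage, it could be added to the `id` formula in the same way the answer uses `~dnum + snum`.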