2013-02-05 71 views
2

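I've hit a tricky spot trying to work out the variance accounted for by a trend, several times over, within a single dataset: that is, R-squared values for several linear trend regressions within one dataset.

My data is structured like this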

x <- read.table(text = " 
STA YEAR VALUE 
a 1968 457 
a 1970 565 
a 1972 489 
a 1974 500 
a 1976 700 
a 1978 650 
a 1980 659 
b 1968 457 
b 1970 565 
b 1972 350 
b 1974 544 
b 1976 678 
b 1978 650 
b 1980 690 
c 1968 457 
c 1970 565 
c 1972 500 
c 1974 600 
c 1976 678 
c 1978 670 
c 1980 750 " , header = TRUE)

and I'm trying to return something like this

STA R-sq 
a n1 
b n2 
c n3 

where n# is the corresponding R-squared value for that station's data in the original set....

I have tried

fit <- lm(VALUE ~ YEAR + STA, data = x) 

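to get a model of the yearly trend in VALUE for each individual station, over the years of data available within the main dataset....

Any help would be greatly appreciated.... I'm really stumped on this one, and I know it's just a familiarity-with-R problem.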

+1

Here's what you want to do, I think: http://stackoverflow.com/a/1214432/1036500 – Ben

Answers

0

With that model there is only one R-squared value, not three. Please edit your question.

# store the output 
y <- summary(lm(VALUE ~ YEAR + STA , data = x)) 
# access the attributes of `y` 
attributes(y) 
y$r.squared 
y$adj.r.squared 
y$coefficients 
y$coefficients[,1] 

# or are you looking to run three separate 
# lm() functions on 'a' 'b' and 'c' ..where this would be the first? 
y <- summary(lm(VALUE ~ YEAR , data = x[ x$STA %in% 'a' , ])) 
# access the attributes of `y` 
attributes(y) 
y$r.squared 
y$adj.r.squared 
y$coefficients 
y$coefficients[,1] 
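
If the three separate fits are what is wanted, the subset-and-fit step above can be repeated for every station in one pass. The following is only a base-R sketch, not part of the original answer: it rebuilds the question's data with `data.frame()` so that it runs on its own, and the names `rsq` and `result` are made up for illustration.

```r
# Rebuild the example data from the question (same values as read.table above)
x <- data.frame(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

# Split the data by station, fit VALUE ~ YEAR on each piece,
# and keep only the R-squared from each model summary
rsq <- sapply(split(x, x$STA), function(d) {
  summary(lm(VALUE ~ YEAR, data = d))$r.squared
})

# Arrange as the two-column table asked for in the question
result <- data.frame(STA = names(rsq), rsq = unname(rsq))
print(result)
```

`split()` returns one data frame per level of `STA`, so no station names need to be typed by hand.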
2

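To get the R-squared of VALUE on YEAR for each group of STA, you can take this previous answer, modify it slightly, and plug in your values:

# assuming x is your data frame; dlply() and ldply() come from the plyr package
# (make sure you don't have Hmisc loaded, it will interfere)
library(plyr)
models_x <- dlply(x, "STA", function(df) 
    summary(lm(VALUE ~ YEAR, data = df))) 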

# extract the r.squared values 
rsqds <- ldply(1:length(models_x), function(x) models_x[[x]]$r.squared) 
# give names to rows and col 
rownames(rsqds) <- unique(x$STA) 
colnames(rsqds) <- "rsq" 
# have a look 
rsqds 
     rsq 
a 0.6286064 
b 0.5450413 
c 0.8806604 

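Edit: following mnel's suggestion, here are more efficient ways to get the R-squared values into a nice table (no need to add row and column names):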

# starting with models_x from above 
rsqds <- data.frame(rsq =sapply(models_x, '[[', 'r.squared')) 

# starting with just the original data in x, this is great: 
rsqds <- ddply(x, "STA", summarize, rsq = summary(lm(VALUE ~ YEAR))$r.squared) 

    STA  rsq 
1 a 0.6286064 
2 b 0.5450413 
3 c 0.8806604 
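
As a sanity check on numbers like these: for a regression with a single predictor and an intercept, R-squared equals the squared Pearson correlation between the two variables. The sketch below uses base R only, rebuilds the question's data so it is self-contained, and uses the made-up names `rsq_cor` and `rsq_lm`; it is an illustration, not part of the original answer.

```r
# Rebuild the example data from the question
x <- data.frame(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

groups <- split(x, x$STA)

# Squared correlation of YEAR and VALUE within each station
rsq_cor <- sapply(groups, function(d) cor(d$YEAR, d$VALUE)^2)

# R-squared taken from the fitted model for each station
rsq_lm <- sapply(groups, function(d) {
  summary(lm(VALUE ~ YEAR, data = d))$r.squared
})

# The two routes should agree up to floating-point error
print(all.equal(rsq_cor, rsq_lm))
```

This shortcut only holds for the one-predictor case; with more terms in the formula the model summary is the way to go.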
+1

Or more simply `rsqds <- data.frame(rsq = sapply(models_x, '[[', 'r.squared'))`, or as a one-liner `models_x <- ddply(x, "STA", summarize, rsq = summary(lm(VALUE ~ YEAR))$r.squared)` – mnel

+0

Ah yes, that's great, thanks for the tip! I'll update my answer to include those – Ben

+0

Awesome. ddply and dlply are where it's at. Many thanks. – user1680636

1
# first load the data.table package
library(data.table)
# convert your data frame to a data.table (I'm using your example)
x <- as.data.table(x)
# calculate all the metrics needed (R-squared, F statistic and so on)
x[, list(r2 = summary(lm(VALUE ~ YEAR))$r.squared,
         f  = summary(lm(VALUE ~ YEAR))$fstatistic[1]), by = STA]
   STA        r2         f
1:   a 0.6286064  8.462807
2:   b 0.5450413  5.990009
3:   c 0.8806604 36.897258
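
The two columns in that table are tied together: for a one-predictor regression on n points, F = R² / (1 − R²) × (n − 2). A small base-R sketch of that check follows. It is an illustration rather than part of the original answer, it rebuilds the question's data so it runs standalone, and the names `fits`, `f_reported` and `f_from_r2` are hypothetical.

```r
# Rebuild the example data from the question
x <- data.frame(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

groups <- split(x, x$STA)

# Fit each station once and keep the whole summary object
fits <- lapply(groups, function(d) summary(lm(VALUE ~ YEAR, data = d)))

r2         <- sapply(fits, function(s) s$r.squared)
f_reported <- sapply(fits, function(s) unname(s$fstatistic["value"]))
n          <- sapply(groups, nrow)

# F statistic recomputed from R-squared; n - 2 residual degrees of freedom
f_from_r2 <- r2 / (1 - r2) * (n - 2)

print(cbind(r2, f_reported, f_from_r2))
```

Keeping the summary object per station also means `lm()` runs once per group, rather than once for each statistic pulled out.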
+0

A similar question with more answers is here: http://stackoverflow.com/questions/1169539/linear-regression-and-group-by-in-r/33754058#33754058 – FraNut
