R：在線性組合中添加常量，glht（）

所以我想複製一個stata函數，我在中看到的計量經濟學原理，Hill，Griffiths和Lim。我想要複製的函數在stata中看起來像這樣;R：在線性組合中添加常量，glht（）

lincom _cons + b_1 * [arbitrary value] - c

這是零假設H0：B0 + B1 * X = C

我能夠檢驗假設沒有恆定的，但我想測試線性組合時添加的恆定的參數。我瀏覽了glht()的包裝文件，但它只是一個例子，他們拿出了常數。我重複了這個例子，保持不變，但我不確定如何測試一個線性組合，當你有一個矩陣K和一個常量時。作爲參考，這裏是他們的例子;

### multiple linear model, swiss data 
lmod <- lm(Fertility ~ ., data = swiss) 

### test of H_0: all regression coefficients are zero 
### (ignore intercept) 

### define coefficients of linear function directly 
K <- diag(length(coef(lmod)))[-1,] 
rownames(K) <- names(coef(lmod))[-1] 
K 

### set up general linear hypothesis 
glht(lmod, linfct = K)

我不擅長創建假冒數據集，但這裏是我的嘗試。

library(multcomp) 
test.data = data.frame(test.y = seq(200,20000,1000), 
        test.x = seq(10,1000,10)) 
test.data$test.y = sort(test.data$test.y + rnorm(100, mean = 10000, sd = 100)) - 
    rnorm(100, mean = 5733, sd = 77) 
test.lm = lm(test.y ~ test.x, data = test.data) 

# to view the name of the coefficients 
coef(test.lm) 

# this produces an error. How can I add this intercept? 
glht(test.lm, 
linfct = c("(Intercept) + test.x = 20"))

根據文檔，似乎有兩種方法可以解決這個問題。我可以使用函數diag（）來構造一個矩陣，然後我可以在linfct =參數中使用該矩陣，或者我可以使用字符串。用這種方法的事情是，我不知道如何使用diag（）方法，同時也包括常量（方程的右邊）;在字符串方法的情況下，我不知道如何添加截取。

任何和所有的幫助將不勝感激。

這是我正在使用的數據。這最初是在一個.dta文件中，所以我爲可怕的格式道歉。根據我上面提到的這本書，這是food.dta文件。

structure(list(food_exp = structure(c(115.22, 135.98, 119.34, 
114.96, 187.05, 243.92, 267.43, 238.71, 295.94, 317.78, 216, 
240.35, 386.57, 261.53, 249.34, 309.87, 345.89, 165.54, 196.98, 
395.26, 406.34, 171.92, 303.23, 377.04, 194.35, 213.48, 293.87, 
259.61, 323.71, 275.02, 109.71, 359.19, 201.51, 460.36, 447.76, 
482.55, 438.29, 587.66, 257.95, 375.73), label = "household food expenditure per week", format.stata = "%10.0g"), 
income = structure(c(3.69, 4.39, 4.75, 6.03, 12.47, 12.98, 
14.2, 14.76, 15.32, 16.39, 17.35, 17.77, 17.93, 18.43, 18.55, 
18.8, 18.81, 19.04, 19.22, 19.93, 20.13, 20.33, 20.37, 20.43, 
21.45, 22.52, 22.55, 22.86, 24.2, 24.39, 24.42, 25.2, 25.5, 
26.61, 26.7, 27.14, 27.16, 28.62, 29.4, 33.4), label = "weekly household income", format.stata = "%10.0g")), .Names = c("food_exp","income"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -40L))

來源

2017-09-02 im2wddrf

讓我們從你的書中加載數據，然後檢查我們的結果，以確保我們得到同樣的結果。我以這種方式向你提出答案，部分原因是它幫助我準確理解你的想法，並部分地讓你相信這裏的等同性。

對我來說，困惑的一部分是你的lincom例子的語法。你的語法可能是正確的，我不知道，但根據它的外觀我認爲你做了一些不同的事情，因此提到你的書確實有幫助。

首先，讓我們來加載數據和運行是在第115頁的線性模型：

install.packages("devtools") # if not already installed 
library(devtools) 
install_git("https://github.com/ccolonescu/PoEdata") 

library(PoEdata) # loads the package in memory 
library(multcomp) # for hypo testing 
data(food)   # loads the data set of interest 

# EDA 
summary(food) 

# Model 
mod <- lm(food_exp ~ income, data = food) 
summary(mod) # Note: same results as PoE 4th ed. Pg 115 (other than rounding)

Call: 
lm(formula = food_exp ~ income, data = food) 

Residuals: 
    Min  1Q Median  3Q  Max 
-223.025 -50.816 -6.324 67.879 212.044 

Coefficients: 
      Estimate Std. Error t value Pr(>|t|)  
(Intercept) 83.416  43.410 1.922 0.0622 . 
income  10.210  2.093 4.877 1.95e-05 *** 
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 89.52 on 38 degrees of freedom 
Multiple R-squared: 0.385, Adjusted R-squared: 0.3688 
F-statistic: 23.79 on 1 and 38 DF, p-value: 1.946e-05

到目前爲止，一切都很好。在Pg。在第四版115中，除了一些小的舍入差異之外，它顯示相同的迴歸模型。

其次，該書計算每週食品支出的點估計，在20的家庭收入條件（這是$ 2,000 /周）：

# Point estimate 
predict(mod, newdata = data.frame(income=20))

 1 
287.6089

再次，我們得到的完全相同的結果。順便提一下，您還可以在Wiley的書Using Stata for Principles of Econometrics 4th ed.的精美免費樣本中看到Stata中的相同結果。

最後，我們準備好進行假設檢驗。如前所述，我想確保我能夠完全複製Stata所具有的功能。您好心提供了您的代碼，但我對您的語法有點困惑。

幸運的是，我們很幸運。雖然第4版的Stata指南的預覽只經過第2章，在荷蘭一所大學的Economics and Business faculty都能夠找到一箇舊版本的部件提供DRM免費的，所以我們可以指：

終於看到，我們可以複製它R中是這樣的：

# Hyothesis Test 
summary(glht(mod, linfct = c("income = 15")))

不要被迪菲上當租金輸出格式。它在R代碼中顯示的estimate僅僅是迴歸模型中income的係數（「b2」）。它不會根據假設檢驗而改變，而在Stata輸出中它們會做「b2 - 15」（其中R爲mod$coefficients[2]-15）。

什麼是t（t value）和p（Pr(>|t|)）值。請注意，R的這些測試統計數據與Stata的測試統計數據相匹配。

有收入的H0另一示例= 7.5讓我們看到，t值是1.29且p值是在兩個R 0.203和塔塔：

summary(glht(mod, linfct = c("income = 7.5")))

Simultaneous Tests for General Linear Hypotheses 

Fit: lm(formula = food_exp ~ income, data = food) 

Linear Hypotheses: 
       Estimate Std. Error t value Pr(>|t|) 
income == 7.5 10.210  2.093 1.294 0.203 
(Adjusted p values reported -- single-step method)

您還可以通過confint()獲得置信區間。

最後，我想你在看你的書，第3.6.4（第117），其中一位高管要測試的假設，給出20（$ 2000 /周）的food_exp是income> 250：

我們可以中的R計算出t值爲：

t = sum((mod$coefficients[1] + 20*mod$coefficients[2]-250)/sqrt(vcov(mod)[1] + 20^2 * vcov(mod)[4] + 2 * 20 * vcov(mod)[2])) 
t 
[1] 2.652613

當式是相同的上前面的2頁的書。

我們甚至可以製作成自定義函數這個（適用於簡單線性迴歸，這意味着1個獨立變量只）：

hypo_tester <- function(expenditure, income_per_week_hundreds, mod){ 
    t = sum((mod$coefficients[1] + 
      income_per_week_hundreds*mod$coefficients[2]-expenditure)/sqrt(vcov(mod)[1] + 
      income_per_week_hundreds^2 * vcov(mod)[4] + 2 * income_per_week_hundreds * vcov(mod)[2])) 
    return(t) 
} 

hypo_tester(250, 20, mod) 
[1] 2.652613 
hypo_tester(200, 20, mod) 
[1] 6.179193 
hypo_tester(300, 20, mod) 
[1] -0.8739668

來源

2017-09-03 16:28:05

半句正是我需要（在117-118頁上的例子）！非常感謝。所以我認爲，運行一個假設，但是如果我包含一個截取beta的參數，那麼R中的參數的線性組合可能會非常棘手。但是你的手工功能看起來非常直觀！我分享你的觀點，即R可以完成Stata所做的大部分工作，這也促使我首先提出這個問題。非常感謝。問題回答了。 – im2wddrf

@ im2wddrf樂於助人 –

R：在線性組合中添加常量，glht（）

回答

相關問題