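Anova of linear models in Java differs from R when using no intercept
This is a follow-up to a question I asked here: Is there an equivalent function for anova.lm() in Java?. With the help of those answers, I can get the same results in Java as in R for an ANOVA of two linear models with an intercept. However, when I remove the intercept from the linear models, the residual sums of squares match, but the p-values in Java and R differ.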
Should the FDistribution be calculated differently when the intercept is removed?
R code
test_trait <- c(-0.48812477 , 0.33458213, -0.52754476, -0.79863471, -0.68544309, -0.12970239, 0.02355622, -0.31890850,0.34725819 , 0.08108851)
geno_A <- c(1, 0, 1, 2, 0, 0, 1, 0, 1, 0)
geno_B <- c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0)
fit <- lm(test_trait ~ geno_A+geno_B)
fit2 <- lm(test_trait ~ geno_A + geno_B + geno_A:geno_B)
anova(fit, fit2)
#   Res.Df     RSS Df Sum of Sq      F Pr(>F)
# 1      7 0.77982
# 2      6 0.77053  1 0.0092897 0.0723  0.797
fit <- lm(test_trait ~ geno_A+geno_B -1)
fit2 <- lm(test_trait ~ geno_A + geno_B + geno_A:geno_B-1)
anova(fit, fit2)
#   Res.Df     RSS Df Sum of Sq      F Pr(>F)
# 1      8 0.78539
# 2      7 0.77080  1  0.014593 0.1325 0.7266
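For reference, the F value in the second anova() table can be reproduced by hand from the residual sums of squares and residual degrees of freedom it reports. This is a sketch assuming the usual F test for nested models; the symbols RSS_1, df_1 (smaller model) and RSS_2, df_2 (larger model) are notation introduced here, not something R prints.

```latex
F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/(\mathrm{df}_1 - \mathrm{df}_2)}{\mathrm{RSS}_2/\mathrm{df}_2}
  = \frac{(0.78539 - 0.77080)/(8 - 7)}{0.77080/7}
  \approx 0.1325
```

Each residual df is n - p, with n = 10 observations and p counting every estimated coefficient. Dropping the intercept removes one coefficient from each model, which is why the Res.Df column goes from 7 and 6 to 8 and 7.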
Java code
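// Assuming Apache Commons Math 3.x
import org.apache.commons.math3.distribution.FDistribution;
import org.apache.commons.math3.stat.regression.OLSMultipleLinearRegression;

double[] y = {-0.48812477, 0.33458213, -0.52754476, -0.79863471, -0.68544309, -0.12970239, 0.02355622, -0.31890850, 0.34725819, 0.08108851};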
double[][] x = {{1,0}, {0,0}, {1,0}, {2,1}, {0,1}, {0,0}, {1,0}, {0,0}, {1,0}, {0,0}};
double[][] xb = {{1,0,0}, {0,0,0}, {1,0,0}, {2,1,2}, {0,1,0}, {0,0,0}, {1,0,0}, {0,0,0}, {1,0,0}, {0,0,0}};
OLSMultipleLinearRegression regr = new OLSMultipleLinearRegression();
regr.newSampleData(y, x);
double sumOfSquaresModelA = regr.calculateResidualSumOfSquares();
regr.newSampleData(y, xb);
double sumOfSquaresModelB = regr.calculateResidualSumOfSquares();
int degreesOfFreedomA = y.length - (x[0].length + 1);
int degreesOfFreedomB = y.length - (xb[0].length + 1);
double MSE = sumOfSquaresModelB/degreesOfFreedomB;
System.out.printf("RSS intercept: %f\n",sumOfSquaresModelB);
int degreesOfFreedomDifference = Math.abs(degreesOfFreedomB - degreesOfFreedomA);
double MSEdiff = Math.abs((sumOfSquaresModelB - sumOfSquaresModelA)/(degreesOfFreedomDifference));
double Fval = MSEdiff/MSE;
FDistribution Fdist = new FDistribution(degreesOfFreedomDifference, degreesOfFreedomB);
double pval = 1 - Fdist.cumulativeProbability(Fval);
System.out.printf("pval with intercept: %f\n",pval);
regr.setNoIntercept(true);
regr.newSampleData(y, x);
double sumOfSquaresNoInterceptA = regr.calculateResidualSumOfSquares();
regr.newSampleData(y, xb);
double sumOfSquaresNoInterceptB = regr.calculateResidualSumOfSquares();
MSE = sumOfSquaresNoInterceptB/degreesOfFreedomB;
System.out.printf("RSS no intercept: %f\n",sumOfSquaresNoInterceptB);
degreesOfFreedomDifference = Math.abs(degreesOfFreedomB - degreesOfFreedomA);
MSEdiff = Math.abs((sumOfSquaresNoInterceptB - sumOfSquaresNoInterceptA)/(degreesOfFreedomDifference));
Fval = MSEdiff/MSE;
Fdist = new FDistribution(degreesOfFreedomDifference, degreesOfFreedomB);
pval = 1 - Fdist.cumulativeProbability(Fval);
System.out.printf("pval without intercept: %f",pval);
Results
RSS intercept: 0.770528 //correct
pval with intercept: 0.796973 //correct
RSS no intercept: 0.770799 //correct
pval without intercept: 0.747564 //wrong
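The wrong p-value is consistent with one specific cause: after setNoIntercept(true), the Java snippet still uses degreesOfFreedomA and degreesOfFreedomB, which subtract an extra 1 for the intercept. R reports residual df of 8 and 7 for the no-intercept models, not 7 and 6. The sketch below recomputes the test from the RSS values in the R output. The class NestedAnova and its methods are hypothetical names, it deliberately avoids Apache Commons Math so that it runs on its own, and the p-value helper only handles a numerator df of 1 (it relies on F(1, v) = T^2 for T ~ t(v)).

```java
// Hypothetical helper class: names are illustrative, not part of Apache Commons Math.
class NestedAnova {

    /** Residual degrees of freedom: observations minus estimated coefficients. */
    static int residualDf(int nObs, int nPredictors, boolean hasIntercept) {
        return nObs - nPredictors - (hasIntercept ? 1 : 0);
    }

    /** F statistic for comparing a reduced model against a full (larger) model. */
    static double fStatistic(double rssReduced, double rssFull, int dfReduced, int dfFull) {
        double numerator = (rssReduced - rssFull) / (dfReduced - dfFull);
        double denominator = rssFull / dfFull;
        return numerator / denominator;
    }

    /**
     * Upper-tail p-value of F(1, dfFull). Only valid when the numerator df is 1.
     * With x = sqrt(v) * tan(theta), the t(v) density becomes proportional to
     * cos(theta)^(v - 1) on [0, pi/2], so P(|T| <= t) is a ratio of two integrals.
     */
    static double pValueOneNumeratorDf(double f, int dfFull) {
        double theta0 = Math.atan(Math.sqrt(f / dfFull));
        double inside = integrateCosPower(theta0, dfFull - 1);
        double total = integrateCosPower(Math.PI / 2, dfFull - 1);
        return 1.0 - inside / total;
    }

    /** Composite Simpson's rule for the integral of cos(theta)^power over [0, upper]. */
    static double integrateCosPower(double upper, int power) {
        int n = 2000; // even number of subintervals
        double h = upper / n;
        double sum = 1.0 + Math.pow(Math.cos(upper), power); // cos(0)^power = 1
        for (int i = 1; i < n; i++) {
            sum += (i % 2 == 0 ? 2 : 4) * Math.pow(Math.cos(i * h), power);
        }
        return sum * h / 3.0;
    }

    public static void main(String[] args) {
        int n = 10;
        // RSS values copied from the R output for the no-intercept models.
        double rssReduced = 0.78539, rssFull = 0.77080;
        int dfReduced = residualDf(n, 2, false); // 10 - 2 = 8
        int dfFull = residualDf(n, 3, false);    // 10 - 3 = 7
        double f = fStatistic(rssReduced, rssFull, dfReduced, dfFull);
        System.out.printf("F = %.4f, p = %.4f%n", f, pValueOneNumeratorDf(f, dfFull));
    }
}
```

In the original snippet, the equivalent change would be to recompute both degrees of freedom as y.length - x[0].length and y.length - xb[0].length before building the FDistribution for the no-intercept case. Plugging a denominator df of 6 instead of 7 into the sketch gives a p-value near 0.7476, which matches the value marked wrong above.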