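I am trying to run a partial least squares (PLS) regression in C#. In MATLAB, plsregress uses the SIMPLS algorithm and returns beta, the matrix of regression coefficients. I am comparing the PLS regression coefficients from MATLAB with those from C# (Accord.NET).
I don't understand why the coefficient matrices differ between the two. Is there a mistake in how I pass the inputs to the C# version?
The inputs are identical in both cases and are taken from the paper referenced here.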
Minimal working example:
MATLAB: the following uses the small example from Hervé Abdi, "Partial Least Squares Regression". Reference: PDF
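clear all;
clc;
% Predictors, one row per wine: price, sugar, alcohol, acidity
inputs = [7, 7, 13, 7; 4, 3, 14, 7; 10, 5, 12, 5; 16, 7, 11, 3; 13, 3, 10, 3];
% Responses, one row per wine: hedonic, goes with meat, goes with dessert
outputs = [14, 7, 8; 10, 7, 6; 8, 5, 5; 2, 4, 7; 6, 2, 4];
% Fit PLS with a single component; the first row of beta is the intercept
[XL,yl,XS,YS,beta,PCTVAR] = plsregress(inputs,outputs, 1);
disp 'beta'
beta
disp 'beta size'
size(beta)
% Fitted values: the column of ones pairs with the intercept row of beta
yfit = [ones(size(inputs,1),1) inputs]*beta;
residuals = outputs - yfit;
% stem(residuals)
% xlabel('Observation');
% ylabel('Residual');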
beta =
   1.0484e+01   6.1899e+00   6.2841e+00
  -6.3488e-01  -3.0405e-01  -7.2608e-02
   2.1949e-02   1.0512e-02   2.5102e-03
   1.9226e-01   9.2078e-02   2.1988e-02
   2.8948e-01   1.3864e-01   3.3107e-02
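As an independent cross-check of the MATLAB result, here is a minimal NumPy sketch of a one-component SIMPLS fit on the same data. It assumes the data are mean-centered but not scaled, and that the intercept is stacked as the first row of beta (matching the `[ones X]*beta` line in the MATLAB script). The variable names (`X0`, `Y0`, `w`, `t`, `coef`) are my own and do not come from either library.

```python
import numpy as np

# Same data as in the MATLAB snippet
X = np.array([[7, 7, 13, 7],
              [4, 3, 14, 7],
              [10, 5, 12, 5],
              [16, 7, 11, 3],
              [13, 3, 10, 3]], dtype=float)
Y = np.array([[14, 7, 8],
              [10, 7, 6],
              [8, 5, 5],
              [2, 4, 7],
              [6, 2, 4]], dtype=float)

# Assumption: center only, no scaling to unit variance
x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
X0, Y0 = X - x_mean, Y - y_mean

# First SIMPLS weight vector: dominant left singular vector of X0' * Y0
U, _, _ = np.linalg.svd(X0.T @ Y0, full_matrices=False)
w = U[:, 0]

# Score vector for the single component
t = X0 @ w

# Slopes: regress Y0 on the score and map back to the X variables.
# The sign ambiguity of w cancels because w enters twice (via w and via t).
coef = np.outer(w, t @ Y0) / (t @ t)

# Intercept row, so that [1, x] @ beta reproduces the fitted values
intercept = y_mean - x_mean @ coef
beta = np.vstack([intercept, coef])

np.set_printoptions(precision=4, suppress=True)
print(beta)
```

With only one component there is no deflation step, so this sketch covers just the first iteration of SIMPLS. For this data set it should agree with the beta printed by MATLAB to the displayed precision.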
Accord.NET:
// Namespace that contains PartialLeastSquaresAnalysis
using Accord.Statistics.Analysis;

double[][] inputs = new double[][]
{
    //      Wine | Price | Sugar | Alcohol | Acidity
    new double[] {  7, 7, 13, 7 },
    new double[] {  4, 3, 14, 7 },
    new double[] { 10, 5, 12, 5 },
    new double[] { 16, 7, 11, 3 },
    new double[] { 13, 3, 10, 3 },
};
double[][] outputs = new double[][]
{
    //      Wine | Hedonic | Goes with meat | Goes with dessert
    new double[] { 14, 7, 8 },
    new double[] { 10, 7, 6 },
    new double[] {  8, 5, 5 },
    new double[] {  2, 4, 7 },
    new double[] {  6, 2, 4 },
};
var pls = new PartialLeastSquaresAnalysis()
{
    Method = AnalysisMethod.Center,
    Algorithm = PartialLeastSquaresAlgorithm.NIPALS
};
var regression = pls.Learn(inputs, outputs);
double[][] coeffs = regression.Weights;
Output (coeffs):
  -1.69811320754717   -0.0566037735849056   0.0707547169811322
   1.27358490566038    0.29245283018868     0.571933962264151
  -4                   1                    0.5
   1.17924528301887    0.122641509433962    0.159198113207547
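One quick way to compare the two printed matrices is to look at their shapes and numerical ranks. The NumPy sketch below does only that, using the values copied from the two outputs in this post. The tolerance `tol` is an assumption, chosen to absorb the rounding in MATLAB's five-digit display.

```python
import numpy as np

# beta as printed by MATLAB; row 0 is the intercept row
matlab_beta = np.array([
    [1.0484e+01, 6.1899e+00, 6.2841e+00],
    [-6.3488e-01, -3.0405e-01, -7.2608e-02],
    [2.1949e-02, 1.0512e-02, 2.5102e-03],
    [1.9226e-01, 9.2078e-02, 2.1988e-02],
    [2.8948e-01, 1.3864e-01, 3.3107e-02],
])

# regression.Weights as printed by the Accord.NET snippet
accord_weights = np.array([
    [-1.69811320754717, -0.0566037735849056, 0.0707547169811322],
    [1.27358490566038, 0.29245283018868, 0.571933962264151],
    [-4.0, 1.0, 0.5],
    [1.17924528301887, 0.122641509433962, 0.159198113207547],
])

# Drop the intercept row so both matrices hold slopes only
matlab_slopes = matlab_beta[1:]

# Assumed absolute tolerance on singular values, to ignore display rounding
tol = 1e-3
rank_matlab = np.linalg.matrix_rank(matlab_slopes, tol=tol)
rank_accord = np.linalg.matrix_rank(accord_weights, tol=tol)

print("MATLAB beta shape:", matlab_beta.shape, "slope rank:", rank_matlab)
print("Accord weights shape:", accord_weights.shape, "rank:", rank_accord)
```

The MATLAB matrix is 5x3 because it carries an intercept row, while the Accord.NET Weights matrix is 4x3. A slope rank of 1 on the MATLAB side is consistent with the single component requested in `plsregress(inputs, outputs, 1)`. A higher rank on the Accord.NET side suggests, but does not by itself prove, that more than one component went into those weights.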