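Optimization with a high number of parameters in R
I am trying to optimize the accuracy of a linear prediction in R, and I am having trouble finding an approach that converges and is convenient to use.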
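My idea is as follows: I have a set of 32 parameters that I want to optimize. Their starting values are drawn at random from a normal distribution using 'rnorm':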
linCoeff <- rnorm(32,0,5)
(linCoeff stands for linear coefficients.)
These 32 parameters are combined in the following way:
myVal <- (((clSigm*lCoeff[1])+lCoeff[2])*data[,1])+
(((clSigm*lCoeff[3])+lCoeff[4])*data[,2])+
(((clSigm*lCoeff[5])+lCoeff[6])*data[,3])+
(((clSigm*lCoeff[7])+lCoeff[8])*data[,4])+
(((clSigm*lCoeff[9])+lCoeff[10])*data[,5])+
(((clSigm*lCoeff[11])+lCoeff[12])*data[,6])+
(((clSigm*lCoeff[13])+lCoeff[14])*data[,7])+
(((clSigm*lCoeff[15])+lCoeff[16])*data[,8])+
(((clSigm*lCoeff[17])+lCoeff[18])*data[,9])+
(((clSigm*lCoeff[19])+lCoeff[20])*data[,10])+
(((clSigm*lCoeff[21])+lCoeff[22])*data[,11])+
(((clSigm*lCoeff[23])+lCoeff[24])*data[,12])+
(((clSigm*lCoeff[25])+lCoeff[26])*data[,13])+
(((clSigm*lCoeff[27])+lCoeff[28])*data[,14])*data$indDV1+
(((clSigm*lCoeff[29])+lCoeff[30])*data[,15])*data$indDV2+
((clSigm*lCoeff[31])+lCoeff[32])
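where:
- clSigm is a fixed parameter (computed from data$acc in the function further down);
- data[,i] are the values in my data.frame that I want to sum.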
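The 16-term sum above can also be written in vectorised form, which is shorter and usually faster to evaluate inside an optimizer. This is only a sketch: `linear_score` is a hypothetical name, and it assumes the 15 predictor columns are passed as a numeric matrix `X`, with `clSigm`, `indDV1` and `indDV2` as vectors of one value per row.

```r
# Vectorised version of the 16-term sum (hypothetical helper, not from the post).
# X              : numeric matrix holding the 15 predictor columns, data[, 1:15]
# clSigm         : numeric vector, one value per row
# indDV1, indDV2 : numeric vectors that multiply terms 14 and 15
linear_score <- function(lCoeff, X, clSigm, indDV1, indDV2) {
  stopifnot(length(lCoeff) == 32, ncol(X) == 15)
  a <- lCoeff[seq(1, 31, by = 2)]  # odd positions: multiplied by clSigm
  b <- lCoeff[seq(2, 32, by = 2)]  # even positions: added afterwards
  # W[i, j] = clSigm[i] * a[j] + b[j]
  W <- outer(clSigm, a[1:15]) +
    matrix(b[1:15], nrow = nrow(X), ncol = 15, byrow = TRUE)
  terms <- W * X
  terms[, 14] <- terms[, 14] * indDV1
  terms[, 15] <- terms[, 15] * indDV2
  # The last pair of coefficients forms the intercept-like 16th term.
  rowSums(terms) + clSigm * a[16] + b[16]
}
```

With a data.frame, a call might look like `linear_score(lCoeff, as.matrix(data[, 1:15]), clSigm, data$indDV1, data$indDV2)`.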
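In the end it is a sum of 16 terms, which gives me a numeric value: 'myVal'. I then apply an activation function, which gives
- -1 if 'myVal' is > 0, and
- 1 if 'myVal' is < 0.
I then compare the result with my input (a list of -1 and +1 values) and output the balanced accuracy (BACC).
I want to optimize the 32 linear parameters to find the maximum BACC, but the existing R methods do not give me conclusive results, because I never get convergence...
As an example, the function I give to optim is: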
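Neither `activate` nor `BACC` is shown in the post, so the following is only a guess at what they might look like. It assumes labels are coded as -1 / +1, follows the sign convention stated above, and maps a score of exactly 0 to 1 (the post does not say what happens there). `bacc` is a hypothetical stand-in that takes balanced accuracy as the mean of the two per-class recalls.

```r
# Activation as described above: -1 for positive scores, 1 otherwise.
# Sending a score of exactly 0 to 1 is an assumption.
activate <- function(v) ifelse(v > 0, -1, 1)

# Balanced accuracy for labels coded as -1 / +1 (hypothetical stand-in for BACC).
bacc <- function(truth, pred) {
  sens <- mean(pred[truth ==  1] ==  1)  # recall on the +1 class
  spec <- mean(pred[truth == -1] == -1)  # recall on the -1 class
  (sens + spec) / 2
}

truth <- c(1, 1, -1, -1)
pred  <- activate(c(-2, 3, 1, 0.5))  # 1, -1, -1, -1
bacc(truth, pred)                    # 0.75
```

Because `ifelse` is vectorised, `activate` can be applied to the whole `myVal` vector at once, without `lapply`.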
retrieveVal <- function(lCoeff,data){
clSigm <- 1/(1+exp(.5-(data$acc)))
myVal <- (((clSigm*lCoeff[1])+lCoeff[2])*data[,1])+
(((clSigm*lCoeff[3])+lCoeff[4])*data[,2])+
(((clSigm*lCoeff[5])+lCoeff[6])*data[,3])+
(((clSigm*lCoeff[7])+lCoeff[8])*data[,4])+
(((clSigm*lCoeff[9])+lCoeff[10])*data[,5])+
(((clSigm*lCoeff[11])+lCoeff[12])*data[,6])+
(((clSigm*lCoeff[13])+lCoeff[14])*data[,7])+
(((clSigm*lCoeff[15])+lCoeff[16])*data[,8])+
(((clSigm*lCoeff[17])+lCoeff[18])*data[,9])+
(((clSigm*lCoeff[19])+lCoeff[20])*data[,10])+
(((clSigm*lCoeff[21])+lCoeff[22])*data[,11])+
(((clSigm*lCoeff[23])+lCoeff[24])*data[,12])+
(((clSigm*lCoeff[25])+lCoeff[26])*data[,13])+
(((clSigm*lCoeff[27])+lCoeff[28])*data[,14])*data$indDV1+
(((clSigm*lCoeff[29])+lCoeff[30])*data[,15])*data$indDV2+
((clSigm*lCoeff[31])+lCoeff[32])
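# lapply() returns a list; vapply() gives a plain numeric vector of -1/1 predictions
act <- vapply(myVal, activate, numeric(1))
# inp (the vector of true -1/+1 labels) must exist in the calling environment
return(-BACC(inp,act))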
}
Then:
optim(par=linCoeff,fn=retrieveVal,data=myData)
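When judging whether `optim` converged, it can help to look at the `convergence` code it returns. The default method is Nelder-Mead with `maxit = 500`, which is easy to exhaust with 32 parameters. The sketch below runs on simulated stand-in data (everything about `X`, `acc`, `indDV1`, `indDV2` and `inp` here is an assumption, not the real dataset) and raises the iteration limit. Note that balanced accuracy is a step function of the coefficients, so a reported convergence may only mean the search reached a flat region.

```r
set.seed(42)
n <- 200
# Simulated stand-ins for myData and the labels (assumptions, not the real data).
X      <- matrix(rnorm(n * 15), n, 15)
acc    <- runif(n)
indDV1 <- rep(1, n)
indDV2 <- rep(1, n)
inp    <- sample(c(-1, 1), n, replace = TRUE)

# Negative balanced accuracy, so that minimising it maximises BACC.
neg_bacc <- function(lCoeff) {
  clSigm <- 1 / (1 + exp(0.5 - acc))
  a <- lCoeff[seq(1, 31, by = 2)]
  b <- lCoeff[seq(2, 32, by = 2)]
  W <- outer(clSigm, a[1:15]) + matrix(b[1:15], n, 15, byrow = TRUE)
  terms <- W * X
  terms[, 14] <- terms[, 14] * indDV1
  terms[, 15] <- terms[, 15] * indDV2
  myVal <- rowSums(terms) + clSigm * a[16] + b[16]
  pred <- ifelse(myVal > 0, -1, 1)  # sign convention as stated in the post
  -(mean(pred[inp == 1] == 1) + mean(pred[inp == -1] == -1)) / 2
}

start <- rnorm(32, 0, 5)
fit <- optim(par = start, fn = neg_bacc, control = list(maxit = 5000))

fit$convergence  # 0 = converged, 1 = maxit reached, 10 = degenerate simplex
fit$counts       # number of function evaluations used
-fit$value       # best balanced accuracy found
```

If `fit$convergence` is 1, the run stopped at the iteration limit rather than at a reported optimum.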
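If anyone can help here, I'm all ears!
Thanks in advance.
Hi, thanks a lot for your answer! You can assume that the values in each data column lie in the range [-3; +3] and are drawn from a normal distribution (it is hard to share the whole dataset, but the values are close to that distribution). You can even drop the 14th and 15th terms when trying it out; they carry some scaling factors that are not very useful at first. In any case, many, many thanks. I will try these 2 packages and come back! – GMaxG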
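For anyone who wants to reproduce the setup, stand-in data matching the description in this comment could be simulated roughly as follows. The predictors are standard normal values clipped to [-3, 3]. The `acc`, `indDV1`, `indDV2` and `inp` values are pure placeholders, since the post does not describe their distributions.

```r
set.seed(1)
n <- 500

# 15 predictor columns: standard normal, clipped to [-3, 3] as described.
X <- matrix(pmax(pmin(rnorm(n * 15), 3), -3), n, 15)
myData <- as.data.frame(X)

# Placeholder columns; their real distributions are not given in the post.
myData$acc    <- runif(n)
myData$indDV1 <- 1  # neutral factor, so term 14 is left unscaled
myData$indDV2 <- 1  # neutral factor, so term 15 is left unscaled

# Placeholder labels coded as -1 / +1.
inp <- sample(c(-1, 1), n, replace = TRUE)

str(myData[, c(1, 2, 16, 17, 18)])
```

Setting the two scale factors to 1 keeps terms 14 and 15 in the sum but removes their extra scaling, which is one way to simplify a first attempt.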