I am using SVM Rank, which has several parameters, and varying them gives me very different results. Is there some mechanism for tuning the parameters and picking the best ones based on results on a validation set? How do I tune the parameters of SVM Rank?

Below are the different parameters:
Learning Options:
-c float -> C: trade-off between training error
and margin (default 0.01)
-p [1,2] -> L-norm to use for slack variables. Use 1 for L1-norm,
use 2 for squared slacks. (default 1)
-o [1,2] -> Rescaling method to use for loss.
1: slack rescaling
2: margin rescaling
(default 2)
-l [0..] -> Loss function to use.
0: zero/one loss
?: see below in application specific options
(default 1)
Optimization Options (see [2][5]):
-w [0,..,9] -> choice of structural learning algorithm (default 3):
0: n-slack algorithm described in [2]
1: n-slack algorithm with shrinking heuristic
2: 1-slack algorithm (primal) described in [5]
3: 1-slack algorithm (dual) described in [5]
4: 1-slack algorithm (dual) with constraint cache [5]
9: custom algorithm in svm_struct_learn_custom.c
-e float -> epsilon: allow that tolerance for termination
criterion (default 0.001000)
-k [1..] -> number of new constraints to accumulate before
recomputing the QP solution (default 100)
(-w 0 and 1 only)
-f [5..] -> number of constraints to cache for each example
(default 5) (used with -w 4)
-b [1..100] -> percentage of training set for which to refresh cache
when no epsilon violated constraint can be constructed
from current cache (default 100%) (used with -w 4)
SVM-light Options for Solving QP Subproblems (see [3]):
-n [2..q] -> number of new variables entering the working set
in each svm-light iteration (default n = q).
Set n < q to prevent zig-zagging.
-m [5..] -> size of svm-light cache for kernel evaluations in MB
(default 40) (used only for -w 1 with kernels)
-h [5..] -> number of svm-light iterations a variable needs to be
optimal before considered for shrinking (default 100)
-# int -> terminate svm-light QP subproblem optimization, if no
progress after this number of iterations.
(default 100000)
Kernel Options:
-t int -> type of kernel function:
0: linear (default)
1: polynomial (s a*b+c)^d
2: radial basis function exp(-gamma ||a-b||^2)
3: sigmoid tanh(s a*b + c)
4: user defined kernel from kernel.h
-d int -> parameter d in polynomial kernel
-g float -> parameter gamma in rbf kernel
-s float -> parameter s in sigmoid/poly kernel
-r float -> parameter c in sigmoid/poly kernel
-u string -> parameter of user defined kernel
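SVM Rank has no built-in tuner, so the usual approach is to wrap the command-line tools in a grid search over the parameters you care about (typically just -c for a linear kernel), training on the training set and scoring each candidate on the validation set. A minimal sketch, assuming svm_rank_learn and svm_rank_classify are on your PATH and that you supply a function that parses the validation metric from their output (file names and the scorer below are hypothetical):

```python
import subprocess

def grid_search(c_values, train_and_score):
    """Try each candidate C and keep the best one.

    train_and_score(c) must train a model with that C and return a
    validation score where higher is better (e.g. ranking accuracy
    parsed from svm_rank_classify output)."""
    best_c, best_score = None, float("-inf")
    for c in c_values:
        score = train_and_score(c)
        if score > best_score:
            best_c, best_score = c, score
    return best_c, best_score

def svm_rank_score(c):
    # Hypothetical wrapper: train on train.dat, score on val.dat.
    # You must parse the metric you care about from the classify output.
    subprocess.run(["svm_rank_learn", "-c", str(c),
                    "train.dat", "model.dat"], check=True)
    out = subprocess.run(["svm_rank_classify",
                          "val.dat", "model.dat", "pred.dat"],
                         capture_output=True, text=True, check=True)
    return parse_validation_metric(out.stdout)  # your parsing function

# Typical usage: sweep C on a log scale.
# best_c, best = grid_search([0.01, 0.1, 1, 10, 100], svm_rank_score)
```

Values of C are usually swept on a log scale (0.01, 0.1, 1, 10, ...); once a good region is found you can refine with a finer grid around it.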
Thanks @IVlad. Could you clarify "for text classification, you should go with a linear kernel"? – 2015-02-07 06:16:32
@BitManipulator - I mean that it is well known in the literature that for text classification, the instances are (almost) linearly separable in the high-dimensional space produced by a bag-of-words model. So the linear kernel performs best, and there is no point in trying the others. Not trying multiple kernels means you only have one parameter to tune, which saves you a lot of time. – IVlad 2015-02-07 10:00:45