
I am using SVM Rank, which has several parameters; varying them, I get widely different results. Is there a mechanism for tuning the parameters, i.e., picking the values that give the best results on a validation set? How do I tune the parameters of SVM Rank?

Below are the available parameters:

Learning Options: 
    -c float -> C: trade-off between training error 
        and margin (default 0.01) 
    -p [1,2] -> L-norm to use for slack variables. Use 1 for L1-norm, 
        use 2 for squared slacks. (default 1) 
    -o [1,2] -> Rescaling method to use for loss. 
        1: slack rescaling 
        2: margin rescaling 
        (default 2) 
    -l [0..] -> Loss function to use. 
        0: zero/one loss 
        ?: see below in application specific options 
        (default 1) 
Optimization Options (see [2][5]): 
    -w [0,..,9] -> choice of structural learning algorithm (default 3): 
        0: n-slack algorithm described in [2] 
        1: n-slack algorithm with shrinking heuristic 
        2: 1-slack algorithm (primal) described in [5] 
        3: 1-slack algorithm (dual) described in [5] 
        4: 1-slack algorithm (dual) with constraint cache [5] 
        9: custom algorithm in svm_struct_learn_custom.c 
    -e float -> epsilon: allow that tolerance for termination 
        criterion (default 0.001000) 
    -k [1..] -> number of new constraints to accumulate before 
        recomputing the QP solution (default 100) 
        (-w 0 and 1 only) 
    -f [5..] -> number of constraints to cache for each example 
        (default 5) (used with -w 4) 
    -b [1..100] -> percentage of training set for which to refresh cache 
        when no epsilon violated constraint can be constructed 
        from current cache (default 100%) (used with -w 4) 
SVM-light Options for Solving QP Subproblems (see [3]): 
    -n [2..q] -> number of new variables entering the working set 
        in each svm-light iteration (default n = q). 
        Set n < q to prevent zig-zagging. 
    -m [5..] -> size of svm-light cache for kernel evaluations in MB 
        (default 40) (used only for -w 1 with kernels) 
    -h [5..] -> number of svm-light iterations a variable needs to be 
        optimal before considered for shrinking (default 100) 
    -# int  -> terminate svm-light QP subproblem optimization, if no 
        progress after this number of iterations. 
        (default 100000) 
Kernel Options: 
    -t int  -> type of kernel function: 
        0: linear (default) 
        1: polynomial (s a*b+c)^d 
        2: radial basis function exp(-gamma ||a-b||^2) 
        3: sigmoid tanh(s a*b + c) 
        4: user defined kernel from kernel.h 
    -d int  -> parameter d in polynomial kernel 
    -g float -> parameter gamma in rbf kernel 
    -s float -> parameter s in sigmoid/poly kernel 
    -r float -> parameter c in sigmoid/poly kernel 
    -u string -> parameter of user defined kernel 

Answer


This is called grid search. I don't know whether you are familiar with Python and scikit-learn, but either way, I think their description and examples are very good and language agnostic.

Basically, you specify the values you are interested in for each parameter (or an interval to sample from at random, see randomized search), and then, for each combination of settings, you use cross-validation (usually k-fold cross-validation) to measure how well the model performs with those settings. The best-performing combination is returned (scikit-learn can actually return a ranking of the combinations).
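
For concreteness, here is a minimal sketch of that mechanism in Python with scikit-learn. One assumption to flag: scikit-learn does not wrap SVM Rank, so an ordinary SVC on a toy dataset stands in purely to show how GridSearchCV enumerates the grid and cross-validates each combination.

    # Minimal grid-search sketch. Assumption: scikit-learn does not wrap
    # SVM Rank, so an ordinary SVC on a toy dataset stands in here purely
    # to illustrate the mechanism.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Candidate values per parameter; the grid is their Cartesian product.
    param_grid = {
        "C": [0.01, 0.1, 1, 10, 100],
        "kernel": ["linear", "rbf"],
    }

    # 5-fold cross-validation for every combination in the grid.
    search = GridSearchCV(SVC(), param_grid, cv=5)
    search.fit(X, y)

    print(search.best_params_)  # best-performing combination
    print(search.best_score_)   # its mean cross-validated score
    # search.cv_results_ holds the full ranking of combinations.

RandomizedSearchCV has the same interface but samples a fixed number of combinations instead of enumerating them all; that is the randomized search mentioned above.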

Note that this can take a very long time. Depending on your problem, you should fix some parameters yourself. For example, for text classification you should pick a linear kernel; for other problems you might want rbf, etc. Don't just throw everything into the grid search: decide as many parameters as you can using your own knowledge of the algorithm and of the problem at hand.
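
Applied to SVM Rank itself, the same idea becomes a loop over candidate values of the flags listed above (here only -c, with the kernel left at the linear default, per the advice above). A hedged sketch, assuming the svm_rank_learn and svm_rank_classify binaries are on your PATH, that train.dat and vali.dat exist in SVM-light format, and with ranking_score as a hypothetical placeholder for your evaluation metric:

    # Hedged sketch: tune SVM Rank's -c against a held-out validation set.
    # Assumptions (not from the original post): svm_rank_learn and
    # svm_rank_classify are on PATH; train.dat / vali.dat are in SVM-light
    # format; ranking_score() is a hypothetical helper you must supply.
    import subprocess

    def ranking_score(predictions_file, validation_file):
        """Hypothetical: compare the written predictions to the validation
        labels with your metric of choice (e.g. Kendall's tau per query)."""
        raise NotImplementedError

    best_c, best_score = None, float("-inf")
    for c in [0.01, 0.1, 1.0, 10.0, 100.0]:  # candidate values for -c
        # Train a model with this -c, then score it on the validation set.
        subprocess.run(["svm_rank_learn", "-c", str(c),
                        "train.dat", "model.dat"], check=True)
        subprocess.run(["svm_rank_classify", "vali.dat",
                        "model.dat", "predictions"], check=True)
        score = ranking_score("predictions", "vali.dat")
        if score > best_score:
            best_c, best_score = c, score

    print("best -c:", best_c)

The same loop extends to any other flag from the listing above (e.g. -e) by nesting over additional candidate lists, at the cost of multiplying the number of training runs.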


Thanks @IVlad. Could you clarify "for text classification you should pick a linear kernel"? – BitManipulator 2015-02-07 06:16:32


@BitManipulator - What I mean is that it is well known in the literature that, for text classification, the instances are (almost) linearly separable in the high-dimensional space produced by the bag-of-words model. So the linear kernel performs best, and there is no point in trying the others. Not trying multiple kernels means one less parameter to tune, saving you a lot of time. – IVlad 2015-02-07 10:00:45