2015-12-17 86 views
2

Applying an sklearn function to a pandas DataFrame raises ValueError("Unknown label type: %r" % y)

The following code gives an error message:

>>> import pandas as pd
>>> from sklearn import preprocessing, svm
>>> df = pd.DataFrame({"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1, 2]})
>>> clf = svm.SVC()
>>> df = df.apply(lambda x: preprocessing.scale(x))
>>> clf.fit(df[["a", "b"]], df["c"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\svm\base.py", line 151, in fit
    y = self._validate_targets(y)
  File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\svm\base.py", line 515, in _validate_targets
    check_classification_targets(y)
  File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\utils\multiclass.py", line 173, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y)
ValueError: Unknown label type: 0   -1.224745
1    0.000000
2    1.224745
Name: c, dtype: float64

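The dtype of the pandas DataFrame columns is not object, so applying the sklearn svm function should work, but for some reason it does not recognize the classification labels. What is causing this problem?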

Try `df[["a", "b"]].values` and `df["c"].values`. sklearn generally expects arrays, not DataFrames. –

Same problem; the error message is: – Alex

raise ValueError("Unknown label type: %r" % y) ValueError: Unknown label type: array([-1.22474487, 0., 1.22474487]) – Alex

Answer

4

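The problem is that after the scaling step the labels are floating-point values, which is not a valid label type for a classifier. If you convert them to `int` or `str`, it should work: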

In [32]: clf.fit(df[["a", "b"]], df["c"].astype(int)) 
Out[32]: 
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, 
    decision_function_shape=None, degree=3, gamma='auto', kernel='rbf', 
    max_iter=-1, probability=False, random_state=None, shrinking=True, 
    tol=0.001, verbose=False)
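
Note that `astype(int)` truncates the scaled values (here to -1, 0 and 1), which happens to keep the three classes distinct in this example but might merge classes on other data. An alternative, sketched below under the assumption that column "c" holds the class labels and only "a" and "b" are features, is to scale the feature columns alone and leave the labels untouched. The variable names (`X`, `y`, `label_kind`, `scaled_kind`) are illustrative, and `type_of_target` from `sklearn.utils.multiclass` is used only to show how sklearn classifies each version of the labels.

```python
import pandas as pd
from sklearn import preprocessing, svm
from sklearn.utils.multiclass import type_of_target

df = pd.DataFrame({"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1, 2]})

# Scale only the feature columns; the label column "c" is left as integers.
X = preprocessing.scale(df[["a", "b"]].astype(float))
y = df["c"].values

# Integer labels are treated as discrete classes, while the scaled
# (non-integer float) labels are treated as a continuous target.
label_kind = type_of_target(y)
scaled_kind = type_of_target(preprocessing.scale(y.astype(float)))
print(label_kind, scaled_kind)

# Fitting now succeeds because y contains valid class labels.
clf = svm.SVC()
clf.fit(X, y)
predictions = clf.predict(X)
```

This way the original label values are preserved, so predictions come back as 0, 1 or 2 rather than as truncated scaled values.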