2016-01-13

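Python classification and regression tree error

I am learning how to use decision trees in Python. I modified an example from the site below so that it imports a CSV file instead of using the iris dataset: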

http://machinelearningmastery.com/get-your-hands-dirty-with-scikit-learn-now/

Code:

import numpy as np 
import urllib 
from sklearn.tree import DecisionTreeClassifier 
from sklearn import tree 
from sklearn import datasets 
from sklearn import metrics 

# URL for the Pima Indians Diabetes dataset (UCI Machine Learning Repository) 
url = "http://goo.gl/j0Rvxq" 
# download the file 
raw_data = urllib.urlopen(url) 
# load the CSV file as a numpy matrix 
dataset = np.loadtxt(raw_data, delimiter=",") 
#print(dataset.shape) 
# separate the data from the target attributes 
X = dataset[:,0:7] 
y = dataset[:,8] 
# fit a CART model to the data 
model = DecisionTreeClassifier() 
model.fit(dataset.data, dataset.target) 
print model 

Error:

Traceback (most recent call last): 
    File "DatasetTest2.py", line 24, in <module> 
    model.fit(dataset.data, dataset.target) 
AttributeError: 'numpy.ndarray' object has no attribute 'target' 

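I don't know why this error occurs. If I use the iris dataset from the example, it works fine. Ultimately, I need to be able to run a decision tree on a CSV file.

I also tried the following code, which results in the same error: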

# Import Python Modules 
from sklearn.tree import DecisionTreeClassifier 
from sklearn import tree 
from sklearn import datasets 
from sklearn import metrics 
import pandas as pd 
import numpy as np 

#Import Data 
raw_data = pd.read_csv("DataTest1.csv") 
dataset = raw_data.as_matrix() 
#print dataset.shape 
#print dataset 
# separate the data from the target attributes 
X = dataset[:,[2,3,4,7,10]] 
y = dataset[:,[1]] 
#print X 
# fit a CART model to the data 
model = DecisionTreeClassifier() 
model.fit(dataset.data, dataset.target) 
print model 

Answer

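The dataset object imported in the example is not a plain table of data. It is a special object set up with attributes such as data and target, so that it can be used the way the example shows. If you have your own data, you need to decide for yourself what to use as the data and what to use as the target. From your example, it looks like you want to call model.fit(X, y).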
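
To make the difference concrete, here is a minimal sketch, not the asker's actual script. It assumes a small made-up CSV held in memory (csv_text is hypothetical; a real file path or URL would go in its place) with two feature columns followed by a class label. It shows that the object returned by load_iris() has data and target attributes, that an array from np.loadtxt has no target attribute, and that fitting works once X and y are passed directly.

```python
import io

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# load_iris() returns a container object that exposes .data and .target.
iris = load_iris()
print(type(iris).__name__, iris.data.shape, iris.target.shape)

# Hypothetical stand-in for a CSV file: two feature columns, then a class label.
csv_text = """1.0,2.0,0
1.5,1.8,0
5.0,8.0,1
6.0,9.0,1
"""

# np.loadtxt returns a plain ndarray. It has a low-level .data buffer attribute
# but no .target, which is why the AttributeError names only 'target'.
dataset = np.loadtxt(io.StringIO(csv_text), delimiter=",")
print(hasattr(dataset, "target"))

# Pick the feature columns and the label column yourself.
X = dataset[:, 0:2]  # all rows, the two feature columns
y = dataset[:, 2]    # all rows, the label column as a 1-D array

# Fit the tree on the arrays that were just sliced out.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
predictions = model.predict(X)
print(predictions)
```

In the second snippet from the question, y = dataset[:,[1]] produces a two-dimensional column; indexing with dataset[:,1] gives the one-dimensional label array that scikit-learn expects.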