UnicodeEncodeError when using DecisionTree
My code is as follows:
# -*- coding: utf-8 -*-
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import tree
Model_Dev_Val = pd.read_excel("data2.xlsx")
target = Model_Dev_Val[['source_2']]
model_train, model_test, y_train, y_test = train_test_split(Model_Dev_Val, target,test_size = 0.5, random_state = 40,stratify = target)
clf = tree.DecisionTreeClassifier()
clf = clf.fit(model_train,y_train)
But it throws an error:
UnicodeEncodeError: 'decimal' codec can't encode characters in position 0-2: invalid decimal Unicode string
data2.xlsx includes some Chinese text, and the data has already been cleaned.
The Chinese characters in the file might be what is causing the problem. – PinkFluffyUnicorn
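To make that suggestion concrete: the error is consistent with the classifier trying to convert a Chinese text cell to a number, although the question does not confirm this. Below is a minimal sketch of finding the text columns and replacing them with integer codes before fitting. The DataFrame is a hypothetical stand-in for data2.xlsx; only the column name `source_2` comes from the question, and `city`, `amount`, and both helper functions are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for data2.xlsx; only 'source_2' is a real column name.
df = pd.DataFrame({
    "city": ["北京", "上海", "北京", "廣州"],
    "amount": [1.0, 2.5, 3.0, 4.5],
    "source_2": [0, 1, 0, 1],
})


def non_numeric_columns(frame):
    """Return the names of columns whose dtype is not numeric."""
    return [c for c in frame.columns
            if not pd.api.types.is_numeric_dtype(frame[c])]


def encode_text_columns(frame):
    """Return a copy where each non-numeric column holds integer codes."""
    out = frame.copy()
    for col in non_numeric_columns(out):
        # factorize assigns 0, 1, 2, ... in order of first appearance
        codes, _ = pd.factorize(out[col])
        out[col] = codes
    return out


text_cols = non_numeric_columns(df)
encoded = encode_text_columns(df)

# Keep the target out of the feature matrix before splitting and fitting.
features = encoded.drop(columns=["source_2"])
target = encoded["source_2"]
```

Note that the code in the question passes the whole DataFrame, including `source_2`, as the features, so dropping the target column as shown here is probably worth doing regardless. Integer codes impose an arbitrary order on the categories; `pd.get_dummies` is an alternative if that is a concern.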
I had thought of that. I got the correct data.xlsx from my boss, and now it fails with: ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). –
Then there is probably a 'NaN', an 'infinity', or a number that is too large somewhere in that data. – PinkFluffyUnicorn
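One way to track those values down is sketched below. It assumes the features are already numeric. The DataFrame is hypothetical (the real contents of data.xlsx are unknown), and `finite_as_float32` is an illustrative helper, not a library function. It casts to float32, as the error message suggests the classifier does, and flags every cell that is not finite after the cast.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data with one NaN, one infinity, and one value
# that does not fit in float32.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0, 5.0],
    "b": [1.0, 2.0, np.inf, 4.0, 5.0],
    "c": [1.0, 2.0, 3.0, 1e40, 5.0],
})


def finite_as_float32(frame):
    """Boolean array: True where a cell is still finite after a float32 cast."""
    values = frame.to_numpy(dtype="float64")
    # Values beyond the float32 range overflow to inf; silence that warning.
    with np.errstate(over="ignore", invalid="ignore"):
        as32 = values.astype(np.float32)
    return np.isfinite(as32)


mask = finite_as_float32(df)

# Number of problematic cells per column.
counts = pd.Series((~mask).sum(axis=0), index=df.columns)

# One possible cleanup: keep only the rows where every cell is usable.
cleaned = df[mask.all(axis=1)]
```

Dropping rows is only one option. Filling the missing values (for example with `fillna`) may suit the data better, depending on why they are missing.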