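How do I fix a ValueError when predicting in scikit-learn?
The following code raises this error: ValueError: Found array with 0 sample(s) (shape=(0, 3)) while a minimum of 1 is required.
The error is raised at the call to predict. I assume something is wrong with the shape of the dataframe 'obs_to_pred'. I checked the shape, and it is (1046, 3).
What would you suggest so that I can fix this and run the prediction?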
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from patsy import dmatrices
from sklearn.linear_model import LogisticRegression
import scipy.stats as stats
from sklearn import linear_model
# Import Titanic Data
train_loc = 'C:/Users/Young/Desktop/Kaggle/Titanic/train.csv'
test_loc = 'C:/Users/Young/Desktop/Kaggle/Titanic/test.csv'
train = pd.read_csv(train_loc)
test = pd.read_csv(test_loc)
# Predict Missing Age Values Based on Factors Pclass, SibSp, and Parch.
# In the function, combine train and test data.
def regressionPred(traindata, testdata):
    allobs = pd.concat([traindata, testdata])
    allobs = allobs[~allobs.Age.isnull()]
    y = allobs.Age
    y, X = dmatrices('y ~ Pclass + SibSp + Parch', data = allobs, return_type = 'dataframe')
    mod = sm.OLS(y,X)
    res = mod.fit()
    predictors = ['Pclass', 'SibSp', 'Parch']
    regr = linear_model.LinearRegression()
    regr.fit(allobs.ix[:,predictors], y)
    obs_to_pred = allobs[allobs.Age.isnull()].ix[:,predictors]
    prediction = regr.predict(obs_to_pred) # Error Produced in This Line ***
    return res.summary(), prediction
regressionPred(train,test)
In case you want to look at the dataset, this link will take you there: https://www.kaggle.com/c/titanic/data
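
One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: inside regressionPred, allobs is first reduced to the rows where Age is present, and obs_to_pred is then built by selecting the rows of that same frame where Age is missing, which would leave nothing to predict on. A shape of (1046, 3) would be consistent with the rows that have a known Age, not with the rows passed to predict. The example below uses a small made-up frame in place of the Titanic files; the names known, missing, and empty are hypothetical, and .ix is avoided because recent pandas versions no longer provide it.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy stand-in for the combined train/test data (values are invented for illustration).
allobs = pd.DataFrame({
    'Pclass': [1, 2, 3, 3, 1, 2],
    'SibSp':  [0, 1, 0, 2, 1, 0],
    'Parch':  [0, 0, 1, 2, 0, 1],
    'Age':    [38.0, 27.0, np.nan, 4.0, np.nan, 30.0],
})
predictors = ['Pclass', 'SibSp', 'Parch']

# Split the frame BEFORE dropping anything, so rows with missing Age survive.
known = allobs[allobs['Age'].notnull()]
missing = allobs[allobs['Age'].isnull()]

# Selecting null-Age rows from the already-filtered frame gives an empty input,
# which is the kind of (0, 3) array the error message complains about.
empty = known[known['Age'].isnull()][predictors]
print(empty.shape)

# Fit on rows with a known Age, then predict for the rows where Age is missing.
regr = LinearRegression()
regr.fit(known[predictors], known['Age'])
prediction = regr.predict(missing[predictors])
print(prediction.shape)
```

Under this assumption, the change in the original function would be to keep the unfiltered concatenation in a separate variable and build obs_to_pred from it.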