2016-06-14

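Python: Multiple linear regression: statsmodels.formula.api.ols()

I am trying to find how total power depends on several factors such as temperature and humidity, and I have the following code: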

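from functools import reduce
import pandas as pd

# Merge the individual frames on their shared index
dfs = [df1, df2, df4, df7]
df_final = reduce(lambda left, right: pd.merge(left, right, left_index=True, right_index=True), dfs)
# Drop the unused columns and give the remaining ones short names
df_final = df_final.drop(["0_x", "0_y", 0, 4], axis=1)
df_final.columns = ["OT", "HP", "H", "TP"]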


# df_final.shape output is (8790, 4) 
import statsmodels.formula.api as smf 
lm = smf.ols(formula='TP ~ OT+HP+H',data=df_final).fit() 
lm.summary() 

Output:

ValueError        Traceback (most recent call last) 
<ipython-input-45-c09782ec7959> in <module>() 
    3 lm = smf.ols(formula='TP ~ OT+HP+H',data=df_final).fit() 
    4 
----> 5 lm.summary() 

C:\Anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in summary(self, yname, xname, title, alpha) 
1948    top_left.append(('Covariance Type:', [self.cov_type])) 
1949 
-> 1950   top_right = [('R-squared:', ["%#8.3f" % self.rsquared]), 
1951      ('Adj. R-squared:', ["%#8.3f" % self.rsquared_adj]), 
1952      ('F-statistic:', ["%#8.4g" % self.fvalue]), 

C:\Anaconda3\lib\site-packages\statsmodels\tools\decorators.py in __get__(self, obj, type) 
92   if _cachedval is None: 
93    # Call the "fget" function 
---> 94    _cachedval = self.fget(obj) 
95    # Set the attribute in obj 
96 #   print("Setting %s in cache to %s" % (name, _cachedval)) 

C:\Anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in rsquared(self) 
1179  def rsquared(self): 
1180   if self.k_constant: 
-> 1181    return 1 - self.ssr/self.centered_tss 
1182   else: 
1183    return 1 - self.ssr/self.uncentered_tss 

C:\Anaconda3\lib\site-packages\statsmodels\tools\decorators.py in __get__(self, obj, type) 
92   if _cachedval is None: 
93    # Call the "fget" function 
---> 94    _cachedval = self.fget(obj) 
95    # Set the attribute in obj 
96 #   print("Setting %s in cache to %s" % (name, _cachedval)) 

C:\Anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in ssr(self) 
1151  def ssr(self): 
1152   wresid = self.wresid 
-> 1153   return np.dot(wresid, wresid) 
1154 
1155  @cache_readonly 

ValueError: shapes (8790,4294) and (8790,4294) not aligned: 4294 (dim 1) != 8790 (dim 0) 

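I don't understand why I am getting a shape mismatch here. I even tried it with a smaller dataset, but I still get a similar error. Thanks for reading. Any advice on how to share my IPython notebook effectively would also be helpful.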

Answer

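One of my data columns was a string rather than a float, and that is what raised this error. The formula interface treats a string column as categorical and expands it into one indicator column per distinct value, which is consistent with the 4294 that appears in the error message.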
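
For anyone hitting the same error, here is a minimal sketch of how the offending column could be found and converted before fitting. The small `df_final` below is a made-up stand-in for the real data (its values, and the choice of "TP" as the string column, are assumptions for illustration only), and dropping rows that fail to parse is just one possible way to handle them.

```python
import pandas as pd

# Hypothetical stand-in for df_final: here "TP" was read in as strings.
df_final = pd.DataFrame({
    "OT": [21.5, 22.0, 23.1, 24.0],
    "HP": [1.2, 1.1, 1.3, 1.4],
    "H": [40.0, 42.0, 41.0, 39.0],
    "TP": ["310.5", "312.0", "n/a", "318.2"],
})

# List the columns that are not numeric; these are the ones a formula
# would treat as categorical instead of as continuous values.
non_numeric = df_final.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# Convert them to numbers; entries that cannot be parsed become NaN.
for col in non_numeric:
    df_final[col] = pd.to_numeric(df_final[col], errors="coerce")

# Drop the rows that could not be converted (an assumption; imputing is another option).
df_clean = df_final.dropna()
print(df_clean.dtypes)
```

Once every column is numeric, the original `smf.ols(formula='TP ~ OT+HP+H', data=df_clean).fit()` call should see a one-dimensional response and `summary()` should no longer fail on the shape check.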