2016-10-14

I am running a regression on a grouped DataFrame like this, and I want to save the results to a DataFrame:

import pandas as pd 
from pandas.stats.api import ols 

df=pd.read_csv(r'C:\path_to_file.csv') #path to original file 

#groupby POINTID 
list1=[] 
for i, grp in df.groupby('POINTID'): 
    result = ols(y=grp['Date'], x=grp['SWIR32']) #run regression 
    #turn regression parameters into a dataframe 
    frame=pd.DataFrame({'POINTID':i, 'R2': result.r2, 'pvalue': result.p_value[1], 'rmse': result.rmse}) 
    list1.append(frame) 
final_frame=pd.concat(list1) 

But this returns:

ValueError: If using all scalar values, you must pass an index

When I change the DataFrame creation line to this:

frame=pd.DataFrame({'R2': result.r2, 'pvalue': result.p_value[1] , 'rmse': result.rmse}, index=i) 

it returns:

TypeError: len() of unsized object 

Basically, I just want to save POINTID, R2, RMSE, and the p-value to a DataFrame.
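
For reference, both errors can be reproduced without running a regression. The following is a minimal sketch with made-up numbers standing in for one group's regression output (`point_id` and `stats` are hypothetical names, not from the original code). A dict of scalars gives pandas no row count to infer, and `index` expects a list-like rather than a bare scalar.

```python
import pandas as pd

# Hypothetical values standing in for one group's regression output.
point_id = 7
stats = {'R2': 0.81, 'pvalue': 0.03, 'rmse': 1.25}

# A dict of scalars alone does not tell pandas how many rows to build.
try:
    pd.DataFrame(stats)
except ValueError as err:
    first_error = str(err)

# Wrapping the scalar group key in a list yields a one-row frame.
frame = pd.DataFrame(stats, index=[point_id])
```

With `index=[i]` in the loop, each `frame` is a single row labelled by its POINTID, so `pd.concat(list1)` would stack them into one table.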

Answer


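Use pd.Series instead. A Series can be built from a dict of scalars, and concatenating the Series column-wise and then transposing gives one row per POINTID: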

import pandas as pd 
from pandas.stats.api import ols 

df=pd.read_csv(r'C:\path_to_file.csv') #path to original file 

#groupby POINTID 
list1=[] 
for i, grp in df.groupby('POINTID'): 
    result = ols(y=grp['Date'], x=grp['SWIR32']) #run regression 
    #turn regression parameters into a Series 
    frame=pd.Series({'POINTID':i, 'R2': result.r2, 'pvalue': result.p_value[1], 'rmse': result.rmse}) 
    list1.append(frame) 
final_frame=pd.concat(list1, axis=1).T
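
The collection pattern above can be tried on its own. The sketch below rests on several assumptions: the sample data is made up, and `fit_group` is a hypothetical helper that uses `numpy.polyfit` in place of `pandas.stats.api.ols`, which was removed from later pandas releases. It computes only R2 and a plain root-mean-square residual and omits the p-value, so it illustrates the Series-then-concat step rather than reproducing the original regression.

```python
import numpy as np
import pandas as pd

# Made-up sample data; column names follow the question, values do not.
df = pd.DataFrame({
    'POINTID': [1, 1, 1, 2, 2, 2],
    'SWIR32': [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
    'Date': [1.0, 3.0, 5.0, 2.0, 2.0, 5.0],
})

def fit_group(grp):
    """Hypothetical helper: least-squares line of Date on SWIR32 for one group."""
    x = grp['SWIR32'].to_numpy()
    y = grp['Date'].to_numpy()
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    # Plain RMSE over n points; not necessarily the definition the old ols used.
    return {'R2': 1.0 - ss_res / ss_tot, 'rmse': float(np.sqrt(ss_res / len(y)))}

list1 = []
for i, grp in df.groupby('POINTID'):
    # One Series of scalars per group, keyed by the eventual column names.
    list1.append(pd.Series({'POINTID': i, **fit_group(grp)}))

# Each Series becomes a column; transposing turns them into rows.
final_frame = pd.concat(list1, axis=1).T
```

A side effect of packing mixed values into one Series is that POINTID is stored as a float here. Appending plain dicts to `list1` and calling `pd.DataFrame(list1)` is an alternative that keeps each column's own dtype.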