2016-04-23 119 views
3

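Missing first row from CSV

I am reading a .csv file into a DataFrame (CorpActionsDf), but when I print the head of CorpActionsDf, I see that the first row of my data is missing.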

Head of the .csv data:

BBG.XAMS.ASML.S 24/04/2015 0.7 Annual Regular Cash 
BBG.XAMS.ASML.S 25/04/2014 0.61 Annual Regular Cash 
BBG.XAMS.ASML.S 26/04/2013 0.53 Annual Regular Cash 
BBG.XAMS.ASML.S 26/11/2012 9.18 None Return of Capital 
BBG.XAMS.ASML.S 27/04/2012 0.46 Annual Regular Cash 

Head of CorpActionsDf:

                       date  factor_value reference             factor
unique_id
BBG.XAMS.ASML.S  25/04/2014          0.61    Annual       Regular Cash
BBG.XAMS.ASML.S  26/04/2013          0.53    Annual       Regular Cash
BBG.XAMS.ASML.S  26/11/2012          9.18      None  Return of Capital
BBG.XAMS.ASML.S  27/04/2012          0.46    Annual       Regular Cash
BBG.XAMS.ASML.S  26/04/2011          0.40    Annual       Regular Cash

As you can see, the first row of the CSV data is missing from the DataFrame:

BBG.XAMS.ASML.S 24/04/2015 0.7 Annual Regular Cash 

My code is as follows:

import pandas as pd

def getCorpActionsData(rawStaticDataPath):
    pattern = 'CorporateActions' + '.csv'
    staticPath = rawStaticDataPath

    with open(staticPath + pattern, 'rt') as f:
        #staticDf=pd.read_csv(f,engine='c',header=0,index_col=0, parse_dates=True, infer_datetime_format=True,usecols=(0,3))
        CorpActionsDf = pd.read_csv(f, engine='c', header=0, index_col=0, parse_dates=True, infer_datetime_format=True, names=['unique_id', 'date', 'factor_value', 'reference', 'factor'])
        print('CorpActionsDf')
        print(CorpActionsDf.head())

Does anyone have an idea what I am missing?

Thanks

Answers

1

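Have you tried header=None instead of header=0?

The documentation says this about header:

"Default behavior is as if set to 0 if no names passed, otherwise None. Explicitly pass header=0 to be able to replace existing names."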

CorpActionsDf=pd.read_csv(f,engine='c',header=None,index_col=0, parse_dates=True, infer_datetime_format=True,names=['unique_id', 'date','factor_value','reference','factor']) 
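
The default described in the quote can be checked on a tiny in-memory file. This is only a sketch: the two-row CSV text and the column names key and value are made up for illustration, and it assumes a comma-separated file with no header row.

```python
import io
import pandas as pd

# Hypothetical two-row CSV with no header row (illustration only).
csv_text = "A,1\nB,2\n"

# names passed, header left at its default: header behaves like None,
# so both rows are kept as data.
inferred = pd.read_csv(io.StringIO(csv_text), names=['key', 'value'])

# No names passed, header left at its default: header behaves like 0,
# so the first row becomes the column labels.
no_names = pd.read_csv(io.StringIO(csv_text))

print(inferred.shape)          # both rows kept
print(list(no_names.columns))  # first row used as labels
```

So simply dropping the header argument and keeping names should also preserve the first row, although passing header=None states the intent more clearly.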
2

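You have to pass None, not 0, for the header parameter. Otherwise you are telling read_csv to treat row 0 as the row that contains the headers, and only afterwards to replace those headers with the names argument. That is why the first data row disappears.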

CorpActionsDf=pd.read_csv(f,engine='c',header=None,index_col=0, parse_dates=True, infer_datetime_format=True,names=['unique_id', 'date','factor_value','reference','factor'])   
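
To see the difference side by side, here is a minimal sketch that reads the same text with both settings. It assumes the file is comma-separated and has no header row; the in-memory csv_text is a stand-in for CorporateActions.csv, built from the first three rows shown in the question.

```python
import io
import pandas as pd

# Assumed stand-in for CorporateActions.csv: comma-separated, no header row.
csv_text = (
    "BBG.XAMS.ASML.S,24/04/2015,0.7,Annual,Regular Cash\n"
    "BBG.XAMS.ASML.S,25/04/2014,0.61,Annual,Regular Cash\n"
    "BBG.XAMS.ASML.S,26/04/2013,0.53,Annual,Regular Cash\n"
)
names = ['unique_id', 'date', 'factor_value', 'reference', 'factor']

# header=0: row 0 is consumed as a header line and then overwritten by names,
# so the 24/04/2015 row is lost.
with_header_0 = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0, names=names)

# header=None: every row is data, and names only labels the columns.
with_header_none = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0, names=names)

print(len(with_header_0))     # one row fewer than the file
print(len(with_header_none))  # all rows of the file
print(with_header_none.head())
```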
0

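I am not sure you are using the parameters correctly. I don't know Pandas well, since I use Numpy, but from the Pandas Documentation I think the header and names parameters are not set correctly together.

header=0 treats the first row as existing column names and replaces them, so you should write header=None: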

CorpActionsDf=pd.read_csv(f,engine='c',header=None,index_col=0, parse_dates=True, infer_datetime_format=True,names=['unique_id', 'date','factor_value','reference','factor']) 

Try it and tell me whether it works better. Otherwise you could use Numpy, and I can help you with that!