2017-07-17 18 views

Unable to convert a simple text file into a pandas DataFrame

This is what my file looks like:

raw_file ->

'Date\tValue\tSeries\tLabel\n07/01/2007\t687392\t31537611\tThis home\n08/01/2007\t750624\t31537611\tThis home\n09/01/2007\t769358\t31537611\tThis home\n10/01/2007\t802014\t31537611\tThis home\n11/01/2007\t815973\t31537611\tThis home\n12/01/2007\t806853\t31537611\tThis home\n01/01/2008\t836318\t31537611\tThis home\n02/01/2008\t856792\t31537611\tThis home\n03/01/2008\t854411\t31537611\tThis home\n04/01/2008\t826354\t31537611\tThis home\n05/01/2008\t789017\t31537611\tThis home\n06/01/2008\t754162\t31537611\tThis home\n07/01/2008\t749522\t31537611\tThis home\n08/01/2008\t757577\t31537611\tThis home\n' 

type(raw_file) -> <type 'str'>

For some reason I can't use pd.read_csv(raw_file); I get this error:

File "pandas\_libs\parsers.pyx", line 710, in pandas._libs.parsers.TextReader._setup_parser_source (pandas\_libs\parsers.c:8873) 
IOError: File Date Value Series Label 
07/01/2007 687392 31537611 This home 
08/01/2007 750624 31537611 This home 
does not exist 

The best I could come up with is:

for row in raw_file.split('\n'): 
    print(row.split('\t')) 

This is slow. Is there a better way?

Answers


When you pass pandas a string as the argument, it treats that string as a file name or a URL, not as the data itself.

From the docs:

filepath_or_buffer : str, pathlib.Path, py._path.local.LocalPath or any object with a read() method (such as a file handle or StringIO)

The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv
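
This behaviour is easy to reproduce. The following is a minimal sketch, assuming Python 3 and a reasonably recent pandas; the shortened `raw_file` and the helper name `read_as_path` are illustrative only. Passing the text itself makes pandas try to open a file with that name, which fails with an `OSError` subclass (the `IOError` in the question is the Python 2 spelling of the same error).

```python
import pandas as pd

# Shortened sample of the tab-separated text from the question (illustrative).
raw_file = "Date\tValue\tSeries\tLabel\n07/01/2007\t687392\t31537611\tThis home\n"

def read_as_path(text):
    """Pass the text straight to read_csv and report what happens (hypothetical helper)."""
    try:
        pd.read_csv(text)  # pandas interprets the string as a path or URL
    except OSError as exc:  # IOError is an alias of OSError on Python 3
        return type(exc).__name__
    return "parsed"

result = read_as_path(raw_file)
print(result)
```

The exact exception class may vary by platform and pandas version, so the sketch only catches the common `OSError` base class.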

Solution: wrap the string in io.StringIO() so that pandas reads it as file content:

In [69]: pd.read_csv(io.StringIO(raw_file), sep='\t')  # split on tabs so "This home" stays in one field
Out[69]:
          Date   Value    Series      Label
0   07/01/2007  687392  31537611  This home
1   08/01/2007  750624  31537611  This home
2   09/01/2007  769358  31537611  This home
3   10/01/2007  802014  31537611  This home
4   11/01/2007  815973  31537611  This home
5   12/01/2007  806853  31537611  This home
6   01/01/2008  836318  31537611  This home
7   02/01/2008  856792  31537611  This home
8   03/01/2008  854411  31537611  This home
9   04/01/2008  826354  31537611  This home
10  05/01/2008  789017  31537611  This home
11  06/01/2008  754162  31537611  This home
12  07/01/2008  749522  31537611  This home
13  08/01/2008  757577  31537611  This home
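
For reference, here is a self-contained sketch of this approach on a shortened copy of the data, assuming Python 3 (on Python 2, `io.StringIO` expects unicode text, not a plain `str`). It splits on tabs explicitly because the Label values contain a space, which whitespace-based splitting would break into two fields. The date conversion at the end is an optional extra, and the `MM/DD/YYYY` format is an assumption inferred from the monthly sample values.

```python
import io

import pandas as pd

# Shortened copy of the tab-separated text from the question.
raw_file = (
    "Date\tValue\tSeries\tLabel\n"
    "07/01/2007\t687392\t31537611\tThis home\n"
    "08/01/2007\t750624\t31537611\tThis home\n"
    "09/01/2007\t769358\t31537611\tThis home\n"
)

# StringIO gives read_csv a file-like object, so the text is parsed as
# content and is not mistaken for a path. sep="\t" splits on tabs only.
df = pd.read_csv(io.StringIO(raw_file), sep="\t")

# Optional: turn the Date strings into datetimes (format is an assumption).
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

print(df.dtypes)
print(df)
```

If the data comes from a file on disk in the first place, passing the path directly to `pd.read_csv(path, sep="\t")` avoids reading the text into a string at all.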