2016-11-23 36 views
7

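I am building a pandas DataFrame and I want to keep the first row as data, but it keeps getting converted to column names. I tried headers=False, but that just removed the row entirely. How do I stop pandas from turning the first row into column names?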

I have a string (st = '\n'.join(lst)) that I convert to a file-like object (io.StringIO(st)), and then I build the DataFrame by reading that file object as csv.

Answers

11

You want header=None. False gets type-promoted to the int 0, which selects the first line as the header. See the docs (emphasis mine):

header : int or list of ints, default ‘infer’ Row number(s) to use as the column names, and the start of the data. Default behavior is as if set to 0 if no names passed, otherwise None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

You can see the difference in behaviour, first with header=0:

In [95]: 
import io 
import pandas as pd 
t="""a,b,c 
0,1,2 
3,4,5""" 
pd.read_csv(io.StringIO(t), header=0) 

Out[95]: 
   a  b  c
0  0  1  2
1  3  4  5

Now with None:

In [96]: 
pd.read_csv(io.StringIO(t), header=None) 

Out[96]: 
   0  1  2
0  a  b  c
1  0  1  2
2  3  4  5

Note that in the latest version, 0.19.1, passing header=False now raises a TypeError:

In [98]: 
pd.read_csv(io.StringIO(t), header=False) 

TypeError: Passing a bool to header is invalid. Use header=None for no header or header=int or list-like of ints to specify the row(s) making up the column names
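Applied to the workflow described in the question, a minimal sketch might look like the following. The contents of lst are an assumption made up for illustration (the question does not show them); only the st = '\n'.join(lst) and io.StringIO(st) steps come from the question.

```python
import io
import pandas as pd

# Hypothetical input: each list element is one comma-separated row.
lst = ["a,b,c", "0,1,2", "3,4,5"]
st = "\n".join(lst)

# header=None tells read_csv that no line is a header, so the first
# line stays in the data and the columns get integer labels instead.
df = pd.read_csv(io.StringIO(st), header=None)
print(df)
```

Because the first row is text, every column here is read as strings (object dtype) rather than as integers.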

5

I think you need the parameter header=None in read_csv:

Sample:

import pandas as pd 
from io import StringIO  # pandas.compat.StringIO is not available in newer pandas releases

temp=u"""a,b 
2,1 
1,1""" 

df = pd.read_csv(StringIO(temp),header=None) 
print (df) 
   0  1
0  a  b
1  2  1
2  1  1
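If you would rather have meaningful labels than 0, 1, 2, the docs quoted in the first answer suggest another route: when names is passed and header is left unset, read_csv behaves as if header=None. A small sketch, where the column names x, y, z are placeholders chosen for illustration:

```python
import io
import pandas as pd

t = "a,b,c\n0,1,2\n3,4,5"

# Passing names= without header= keeps the first line as data
# and uses the supplied labels as column names.
df = pd.read_csv(io.StringIO(t), names=["x", "y", "z"])
print(df)
```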