Pandas: reading a text file generated by dataframe.to_string
I have a text file containing this table:
                   Ion  TheoWavelength  Blended_Set
Line_Label
H1_4340A    Hgamma_5_2        4340.471         None
He1_4472A     HeI_4471        4471.479         None
He2_4686A    HeII_4686        4685.710         None
Ar4_4711A       [ArIV]        4711.000         None
Ar4_4740A       [ArIV]        4740.000         None
H1_4861A     Hbeta_4_2        4862.683         None
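The table was generated from a pandas DataFrame using dataframe.to_string, and the resulting unicode variable was then saved to the file.
I would like to use a pandas function to create a DataFrame from this file: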
import pandas as pd
df = pd.read_csv('my_table_file.txt', delim_whitespace = True, header = 0, index_col = 0)
But I get this error:
Traceback (most recent call last):
File
df = pd.read_csv(table, delim_whitespace = True, header = 0, index_col = 0)
File "/home/user/anaconda/python2/lib/python2.7/site-packages/pandas/io/parsers.py", line 562, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/user/anaconda/python2/lib/python2.7/site-packages/pandas/io/parsers.py", line 325, in _read
return parser.read()
File "/home/user/anaconda/python2/lib/python2.7/site-packages/pandas/io/parsers.py", line 815, in read
ret = self._engine.read(nrows)
File "/home/user/anaconda/python2/lib/python2.7/site-packages/pandas/io/parsers.py", line 1314, in read
data = self._reader.read(nrows)
File "pandas/parser.pyx", line 805, in pandas.parser.TextReader.read (pandas/parser.c:8748)
File "pandas/parser.pyx", line 827, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:9003)
File "pandas/parser.pyx", line 881, in pandas.parser.TextReader._read_rows (pandas/parser.c:9731)
File "pandas/parser.pyx", line 868, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:9602)
File "pandas/parser.pyx", line 1865, in pandas.parser.raise_parser_error (pandas/parser.c:23325)
pandas.io.common.CParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4
I suspect this happens because the name of the index column is on its own line.
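That diagnosis can be checked without pandas by counting the whitespace-separated fields on each line. This is only a minimal sketch: `table_text` is a hypothetical variable holding the first rows of the table above.

```python
# First rows of the table from the question (hypothetical variable name).
table_text = """\
Ion TheoWavelength Blended_Set
Line_Label
H1_4340A Hgamma_5_2 4340.471 None
He1_4472A HeI_4471 4471.479 None
"""

# Count the whitespace-separated fields on each line.
field_counts = [len(line.split()) for line in table_text.splitlines()]
print(field_counts)  # [3, 1, 4, 4]
```

The header has 3 fields, so the parser expects 3 per row. The first line with more than that is line 3, which has 4, matching the "Expected 3 fields in line 3, saw 4" message. The one-field "Line_Label" line sits between the header and the data, so the header is no longer directly followed by a data row.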
Is there any way to avoid this problem, or to export the table without this label?
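On the export side, one possible approach is the `index_names` argument of `DataFrame.to_string`, which controls whether the index name row is printed. The sketch below uses a two-row stand-in for the original DataFrame, with values copied from the table above. It assumes that `read_csv` treats the first column as the index when the header has one field fewer than the data rows.

```python
import io

import pandas as pd

# Small stand-in for the original DataFrame (values copied from the table above).
df = pd.DataFrame(
    {
        "Ion": ["Hgamma_5_2", "HeI_4471"],
        "TheoWavelength": [4340.471, 4471.479],
        "Blended_Set": [None, None],
    },
    index=pd.Index(["H1_4340A", "He1_4472A"], name="Line_Label"),
)

# index_names=False asks to_string not to print the index name row.
text = df.to_string(index_names=False)

# The header now has one field fewer than each data row, so read_csv
# uses the first column as the index.
restored = pd.read_csv(io.StringIO(text), sep=r"\s+")
restored.index.name = "Line_Label"  # the name is no longer stored in the text
```

The index name is lost in the exported text, so it has to be set again after reading if it is needed. How the "None" entries come back (as strings or as missing values) may depend on the pandas version.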
P.S. I tried using dataframe.to_csv, but as far as I know it does not let you control the formatting of the table columns when they have different dtypes.
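Thank you very much for your reply. That is a very interesting SQL feature and it works nicely... However, in this case it has to be a text file. I managed to make it work by telling "read_csv" to treat any line starting with "L" as a comment (which is not a problem in this data). I tried to use ignore_rows, but it does not work if you set the index column... which is weird... – Delosari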
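The comment-based workaround described above might look roughly like the following sketch. It is not the commenter's exact call: `table_text` is a hypothetical stand-in for the file content, and `index_col` is left out on the assumption that pandas infers the index from the shorter header line.

```python
import io

import pandas as pd

# Stand-in for the file content (first rows of the table in the question).
table_text = """\
Ion TheoWavelength Blended_Set
Line_Label
H1_4340A Hgamma_5_2 4340.471 None
He1_4472A HeI_4471 4471.479 None
"""

# comment="L" makes the parser ignore everything from an "L" onwards,
# so the "Line_Label" line is skipped entirely.
df = pd.read_csv(io.StringIO(table_text), sep=r"\s+", comment="L")
df.index.name = "Line_Label"  # restore the index name by hand
```

This only works as long as no other field contains a capital "L", since the rest of any such line would be dropped as well.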