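I am trying to open a file and load it into a pandas DataFrame for some analysis. The DataFrame read fails with errors when reading the CSV: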
raw = pd.read_csv('/home/chris/Desktop/Cambridge/SOURCE_DATA/Node_56_Nairobi_OutputFile.xls', encoding='utf16', error_bad_lines=False)
I tried some of the suggestions from other threads. Then this happens:
Skipping line 3: expected 1 fields, saw 20
Skipping line 21: expected 1 fields, saw 2
Skipping line 22: expected 1 fields, saw 6
Skipping line 23: expected 1 fields, saw 3
Skipping line 27: expected 1 fields, saw 2
Skipping line 28: expected 1 fields, saw 2
Skipping line 30: expected 1 fields, saw 2
Skipping line 34: expected 1 fields, saw 2
Skipping line 35: expected 1 fields, saw 2
Skipping line 36: expected 1 fields, saw 2
Skipping line 37: expected 1 fields, saw 2
Skipping line 38: expected 1 fields, saw 2
Skipping line 39: expected 1 fields, saw 2
Skipping line 40: expected 1 fields, saw 2
Skipping line 111: expected 1 fields, saw 2
Skipping line 113: expected 1 fields, saw 2
Skipping line 116: expected 1 fields, saw 2
Skipping line 117: expected 1 fields, saw 2
Skipping line 161: expected 1 fields, saw 2
Skipping line 162: expected 1 fields, saw 2
Skipping line 182: expected 1 fields, saw 2
Skipping line 184: expected 1 fields, saw 3
Skipping line 202: expected 1 fields, saw 2
Skipping line 204: expected 1 fields, saw 2
Skipping line 218: expected 1 fields, saw 3
Skipping line 222: expected 1 fields, saw 2
Skipping line 223: expected 1 fields, saw 2
Skipping line 232: expected 1 fields, saw 5
Skipping line 233: expected 1 fields, saw 2
Skipping line 234: expected 1 fields, saw 2
Skipping line 235: expected 1 fields, saw 3
Skipping line 237: expected 1 fields, saw 2
Skipping line 259: expected 1 fields, saw 4
Skipping line 265: expected 1 fields, saw 3
Skipping line 275: expected 1 fields, saw 2
Skipping line 290: expected 1 fields, saw 2
Skipping line 294: expected 1 fields, saw 2
Skipping line 301: expected 1 fields, saw 2
Skipping line 303: expected 1 fields, saw 3
Skipping line 307: expected 1 fields, saw 3
Skipping line 323: expected 1 fields, saw 2
Skipping line 326: expected 1 fields, saw 3
Skipping line 332: expected 1 fields, saw 2
Skipping line 334: expected 1 fields, saw 2
Skipping line 340: expected 1 fields, saw 4
Skipping line 345: expected 1 fields, saw 4
Skipping line 349: expected 1 fields, saw 2
Skipping line 351: expected 1 fields, saw 2
Skipping line 361: expected 1 fields, saw 2
Skipping line 370: expected 1 fields, saw 2
It keeps going like this. Why? And in the end it still throws this error:
CParserError
Traceback (most recent call last)
<ipython-input-21-ab444ae5f5e9> in <module>()
----> 1 raw = pd.read_csv('/home/chris/Desktop/Cambridge/SOURCE_DATA/Node_56_Nairobi_OutputFile.xls', encoding='utf16', error_bad_lines=False)
/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
527 skip_blank_lines=skip_blank_lines)
528
--> 529 return _read(filepath_or_buffer, kwds)
530
531 parser_f.__name__ = name
/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in _read(filepath_or_buffer, kwds)
303 return parser
304
--> 305 return parser.read()
306
307 _parser_defaults = {
/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
761 raise ValueError('skip_footer not supported for iteration')
762
--> 763 ret = self._engine.read(nrows)
764
765 if self.options.get('as_recarray'):
/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
1211 def read(self, nrows=None):
1212 try:
-> 1213 data = self._reader.read(nrows)
1214 except StopIteration:
1215 if self._first_chunk:
pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:7988)()
pandas/parser.pyx in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)()
pandas/parser.pyx in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)()
pandas/parser.pyx in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)()
pandas/parser.pyx in pandas.parser.raise_parser_error (pandas/parser.c:22649)()
CParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
I really don't know why this happens, though.
Can you upload a sample of the CSV file? The CSV format may be off: the delimiter and the line-ending characters may be inconsistent.
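One way to check the delimiter idea from the comment above is to let the standard-library `csv.Sniffer` guess the separator from the first few lines. The message "expected 1 fields, saw 20" is what pandas reports when the first line splits into one field under the default comma separator and a later line splits into more, which would be consistent with a file that is not comma-delimited. This is only a sketch: the helper name `guess_delimiter`, the candidate delimiter set, and the 20-line sample size are assumptions for illustration, not something taken from the question.

```python
import csv
import io
import itertools


def guess_delimiter(path, encoding="utf-16", sample_lines=20):
    """Guess the field delimiter from the first few lines of a text file.

    Hypothetical helper: the candidate delimiters below are an assumption.
    """
    with io.open(path, encoding=encoding) as fh:
        # Read only a small sample so large files are not loaded in full.
        sample = "".join(itertools.islice(fh, sample_lines))
    # Sniffer raises csv.Error when it cannot settle on a delimiter.
    return csv.Sniffer().sniff(sample, delimiters=",;\t|").delimiter
```

If this returns a tab, passing `sep='\t'` to `pd.read_csv` together with the same `encoding` would be the next thing to try, instead of skipping lines with `error_bad_lines=False`.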
Also, is there a specific reason for using utf-16?
Sure, what is the best way to do that? And not really; I tried other encodings with similar results.
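Since several encodings were tried with similar results, it may help to look at the first bytes of the file rather than guess. The sketch below checks for a Unicode byte-order mark and for the OLE2 signature that binary `.xls` workbooks start with; if the file turned out to be a real workbook, `pd.read_excel` rather than `pd.read_csv` would be the function to look at. The helper name `inspect_header` and its return labels are made up for this example.

```python
import codecs

# Signature of the OLE2 container used by binary .xls workbooks.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

# UTF-32 is tested before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def inspect_header(path):
    """Return a rough label for the file based on its first bytes.

    Hypothetical helper: returns "binary-xls", a codec name, or None
    when neither a known BOM nor the OLE2 signature is present.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "binary-xls"
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None
```

A result of `None` does not prove anything about the encoding, since text files are often written without a BOM; it only means the header gives no hint.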