Unable to make a DataFrame because read_csv splits on whitespace
I am trying to turn this text file (philadelphia.txt) into a pandas DataFrame:
STATION           STATION_NAME                                       DATE     TAVG     TMAX     TMIN
----------------- -------------------------------------------------- -------- -------- -------- --------
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970605 -9999    74       47
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970606 -9999    68       50
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970608 -9999    72       50
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970609 -9999    83       47
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970610 -9999    86       55
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970611 -9999    88       61
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970612 -9999    83       70
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970613 -9999    80       66
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970614 -9999    80       64
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970615 -9999    77       55
GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US         19970616 -9999    79       49
However, if I use
data = pd.read_csv('philadelphia.txt', sep="\s+", header=0)
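it produces a correct header, but it runs into trouble with the station name. I want the whole name to sit under the "STATION_NAME" column, but sep="\s+" splits it at every space, and I get this error: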
pandas.errors.ParserError: Error tokenizing data. C error: Expected 6 fields in line 3, saw 11
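The count in the message can be checked by hand: splitting the first data row on whitespace gives 11 tokens, because the station name alone contributes 6 of them. A quick check, with the row copied from the listing above:

```python
# First data row of philadelphia.txt, copied from the listing above.
row = "GHCND:USW00094732 PHILADELPHIA NE PHILADELPHIA AIRPORT PA US 19970605 -9999 74 47"

# str.split() with no argument splits on runs of whitespace, like sep="\s+".
tokens = row.split()
print(len(tokens))  # 11
```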
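How can I split the data into 6 columns without breaking the station name into separate words?
I would also like to be able to pass in other text files with different station names, such as (yellowknife.txt).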
STATION           STATION_NAME                                       DATE     TMAX     TMIN
----------------- -------------------------------------------------- -------- -------- --------
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130117 -21      -35
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130118 -15      -21
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130119 -17      -29
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130120 -18      -28
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130121 -21      -34
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130122 -16      -30
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130123 -17      -28
GHCND:CA002204101 YELLOWKNIFE A CA                                   20130124 -5       -17
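One direction that might work for both files is to treat them as fixed-width rather than whitespace-delimited, and take the column boundaries from the dashed ruler on the second line. The sketch below assumes every file has that ruler and that each field is padded to the ruler's width (the spacing shown in this post may not match the files exactly). `read_station_table` is a made-up helper name, not a pandas function; `pd.read_fwf` with `colspecs` is the real pandas call doing the work.

```python
import io
import re

import pandas as pd


def read_station_table(handle):
    """Parse a fixed-width table whose second line is a ruler of dashes.

    `handle` is any open text file object. Hypothetical helper, not a pandas API.
    """
    lines = handle.read().splitlines()
    ruler = lines[1]
    # Each run of dashes marks one column as a half-open (start, end) offset pair.
    colspecs = [m.span() for m in re.finditer(r"-+", ruler)]
    # Hand the same text to read_fwf, skipping the ruler line (row index 1).
    return pd.read_fwf(
        io.StringIO("\n".join(lines)),
        colspecs=colspecs,
        header=0,
        skiprows=[1],
    )


# Tiny in-memory sample in the assumed layout (widths taken from the ruler).
widths = (17, 50, 8, 8, 8)
records = [
    ("STATION", "STATION_NAME", "DATE", "TMAX", "TMIN"),
    tuple("-" * w for w in widths),
    ("GHCND:CA002204101", "YELLOWKNIFE A CA", "20130117", "-21", "-35"),
    ("GHCND:CA002204101", "YELLOWKNIFE A CA", "20130118", "-15", "-21"),
]
sample = "\n".join(
    " ".join(field.ljust(w) for field, w in zip(rec, widths)) for rec in records
)

df = read_station_table(io.StringIO(sample))
print(df)
```

With the real files this would be called as `with open("yellowknife.txt") as fh: df = read_station_table(fh)`. Because the boundaries come from the ruler, a file without the TAVG column should need no change to the code.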