Converting normalized column names to multi-level columns
I have a CSV in the following format:
print rfd.iloc[:5,:5]
                             Sub-division  January 2010 Actual  January 2010 Normal  January 2011 Actual  February 2010 Actual
0             Andaman and Nicobar Islands                 98.2                 53.7                222.5                   5.8
1                       Arunachal Pradesh                  0.4                 50.1                 37.6                  10.0
2                     Assam and Meghalaya                  0.2                 16.4                  9.0                   3.4
3  Nagaland,Manipur, Mizoram, and Tripura                  0.9                 13.7                  7.9                  10.9
4      Sub-Himalayan,West Bengal & Sikkim                  1.7                 26.6                  7.1                   6.4
How can I convert this to multi-level columns? The first level would be the year, followed by the month and the type.
rfd.columns
Out[89]:
Index([u'Sub-division ', u'January 2010 Actual ', u'January 2010 Normal ',
u'January 2011 Actual ', u'February 2010 Actual ',
....
u'December 2010 Normal ', u' December 2011 Actual '],
dtype='object')
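I thought something like rfd.columns = rfd.columns.str.split(" ") would work,
but the DataFrame then raises TypeError: unhashable type: 'list'.
If it were just one file, I could edit the CSV by hand and reload it, but this is a repeatable process, so I am looking for a solution I can apply while iterating over files.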
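The TypeError most likely comes from the fact that str.split produces a list for each label, and lists are not hashable, so they cannot serve as column labels. A minimal sketch of one alternative is to build tuples and pass them to pd.MultiIndex.from_tuples. The small rfd below is a stand-in built from the values shown above, and the level names 'Year', 'Month' and 'Type' as well as the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Stand-in for rfd, using a few values from the question (trailing spaces kept on purpose)
rfd = pd.DataFrame({
    'Sub-division ': ['Andaman and Nicobar Islands', 'Arunachal Pradesh'],
    'January 2010 Actual ': [98.2, 0.4],
    'January 2010 Normal ': [53.7, 50.1],
    'January 2011 Actual ': [222.5, 37.6],
    'February 2010 Actual ': [5.8, 10.0],
})

# Strip stray whitespace from the labels, then move the non-date column into the index
rfd.columns = rfd.columns.str.strip()
rfd = rfd.set_index('Sub-division')

# Each remaining label reads "Month Year Type"; reorder it as a (Year, Month, Type) tuple
tuples = []
for label in rfd.columns:
    month, year, kind = label.split()
    tuples.append((year, month, kind))

# Tuples are hashable, so they can be turned into a three-level column index
rfd.columns = pd.MultiIndex.from_tuples(tuples, names=['Year', 'Month', 'Type'])
print(rfd)
```

Because the steps only depend on the column labels, the same lines could be wrapped in a function and called once per file.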
Adding two rows as a dict:
{'April 2010 Normal': {0: 81.5, 1: 278.80000000000001},
'April 2010 Actual': {0: 12.699999999999999, 1: 245.80000000000001},
'April 2011 Actual': {0: 83.700000000000003, 1: 114.7},
'August 2010 Actual': {0: 550.0, 1: 343.30000000000001},
'August 2010 Normal': {0: 403.80000000000001, 1: 359.89999999999998},
'August 2011 Actual': {0: 513.0, 1: 225.80000000000001},
'December 2010 Normal': {0: 145.5, 1: 38.399999999999999},
'December 2010 Actual': {0: 254.40000000000001, 1: 6.0},
'December 2011 Actual': {0: 246.30000000000001, 1: 10.300000000000001},
'February 2010 Actual': {0: 5.7999999999999998, 1: 10.0},
'February 2010 Normal': {0: 29.199999999999999, 1: 98.0},
'February 2011 Actual': {0: 81.900000000000006, 1: 36.799999999999997},
'January 2010 Normal': {0: 53.700000000000003, 1: 50.100000000000001},
'January 2010 Actual': {0: 98.200000000000003, 1: 0.40000000000000002},
'January 2011 Actual': {0: 222.5, 1: 37.600000000000001},
'July 2010 Normal': {0: 407.69999999999999, 1: 536.10000000000002},
'July 2010 Actual': {0: 522.10000000000002, 1: 426.0},
'July 2011 Actual': {0: 575.79999999999995, 1: 553.5},
'June 2010 Normal': {0: 438.60000000000002, 1: 500.39999999999998},
'June 2011 Actual': {0: 418.39999999999998, 1: 336.80000000000001},
'June 2010 Actual': {0: 435.0, 1: 397.30000000000001},
'March 2010 Normal': {0: 25.0, 1: 179.69999999999999},
'March 2010 Normal': {0: 20.5, 1: 164.40000000000001},
'March 2011 Actual': {0: 305.5, 1: 121.5},
'March 2010 Actual': {0: 0.40000000000000002, 1: 143.59999999999999},
'May 2010 Actual': {0: 310.69999999999999, 1: 273.80000000000001},
'May 2010 Normal': {0: 358.5, 1: 291.89999999999998},
'May 2011 Actual': {0: 305.69999999999999, 1: 157.80000000000001},
'November 2010 Normal': {0: 253.69999999999999, 1: 45.799999999999997},
'November 2010 Actual': {0: 281.39999999999998, 1: 59.700000000000003},
'November 2011 Actual': {0: 126.0, 1: 19.800000000000001},
'October 2010 Actual': {0: 415.19999999999999, 1: 84.400000000000006},
'October 2010 Normal': {0: 296.69999999999999, 1: 183.0},
'October 2011 Actual': {0: 183.80000000000001, 1: 46.799999999999997},
'September 2010 Normal': {0: 432.39999999999998, 1: 371.60000000000002},
'September 2010 Actual': {0: 261.30000000000001, 1: 407.39999999999998},
'September 2011 Actual': {0: 770.89999999999998, 1: 262.0},
'Sub-division': {0: 'Andaman and Nicobar Islands ', 1: 'Arunachal Pradesh'},
'october 2010 Normal': {0: 297.80000000000001, 1: 159.09999999999999}}
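Thanks for this post. I am not looking for an efficient way. I get this error: ValueError: Index contains duplicate entries, cannot reshape, at 'f = f.unstack(['Year','Month','Level'])'. +1 for everything up to that point. – WoodChopper
It works for me. I am using pandas 0.17.0, so maybe it is a version difference? My only other thought is that my initial DataFrame differs from yours. In any case, the last lines are purely cosmetic. – luismf
Mine is '0.16.2', so that could be it. I used the same data as in the dict: 'a = pandas.DataFrame(copypaste)' – WoodChopper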
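One possible cause of the "Index contains duplicate entries" error, independent of the pandas version, is that two column labels collapse to the same key once they are normalized. The posted dict, for example, contains both 'October 2010 Normal' and 'october 2010 Normal'. The sketch below looks for such collisions. The labels subset and the normalize helper are hypothetical, and the normalization rule (trim whitespace, capitalize the month) is an assumption.

```python
import pandas as pd

# Hypothetical subset of the column labels that appear in the question
labels = pd.Index([
    'October 2010 Actual',
    'October 2010 Normal',
    'october 2010 Normal',
    ' December 2011 Actual ',
])

def normalize(label):
    """Trim and collapse whitespace, and capitalize the month name."""
    parts = label.split()
    parts[0] = parts[0].capitalize()
    return ' '.join(parts)

normalized = pd.Index([normalize(label) for label in labels])

# keep=False marks every member of a duplicated group, not only the later ones
dupes = normalized[normalized.duplicated(keep=False)]
print(dupes.unique().tolist())
```

If this reports any labels, the corresponding columns would need to be merged or renamed before reshaping; which of the two is appropriate depends on the data.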