2013-03-23 85 views
10

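After reading about the different ways to read a URL that points to an .xls file, I decided to use xlrd. The goal is to convert an Excel workbook from a URL into a `pandas.DataFrame`.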

I am having a hard time converting an `xlrd.book.Book` object into a `pandas.DataFrame`.

I have the following:

import pandas 
import xlrd 
import urllib2 

link ='http://www.econ.yale.edu/~shiller/data/chapt26.xls' 
socket = urllib2.urlopen(link) 

#this line gets me the excel workbook 
xlfile = xlrd.open_workbook(file_contents = socket.read()) 

#storing the sheets 
sheets = xlfile.sheets() 

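I would like to take the last sheet in `sheets` and import it as a `pandas.DataFrame`. Any ideas on how I can accomplish this? I tried `pandas.ExcelFile.parse()`, but it wants a path to an Excel file. I could certainly save the file first and then parse it (using tempfile or something similar), but I am trying to follow Pythonic guidelines and use functionality that is probably already built into pandas.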

Any guidance is, as always, greatly appreciated.
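For reference, the manual conversion described above (an xlrd sheet turned into a DataFrame by hand) might look roughly like the sketch below. The helper name `sheet_to_frame` and its `header` flag are made up for illustration; the only xlrd features it relies on are the sheet's `nrows` attribute and `row_values()` method.

```python
import pandas as pd


def sheet_to_frame(sheet, header=False):
    """Convert an xlrd-style sheet into a pandas.DataFrame.

    `sheet_to_frame` is a hypothetical helper, not part of xlrd or pandas.
    It only uses `sheet.nrows` and `sheet.row_values(i)`.
    """
    # Collect every row of the sheet as a plain Python list of cell values.
    rows = [sheet.row_values(i) for i in range(sheet.nrows)]
    if header and rows:
        # Treat the first row as column labels and the rest as data.
        return pd.DataFrame(rows[1:], columns=rows[0])
    # No header row: pandas assigns integer column labels 0, 1, 2, ...
    return pd.DataFrame(rows)
```

With the variables from the question, this would be called as `df = sheet_to_frame(sheets[-1])`. Note that xlrd reports empty cells as empty strings, so the result would contain `''` where a pandas parser would normally give `NaN`.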

Answers

23

You can pass your socket to ExcelFile:

>>> import pandas as pd 
>>> import urllib2 
>>> link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls' 
>>> socket = urllib2.urlopen(link) 
>>> xd = pd.ExcelFile(socket) 
NOTE *** Ignoring non-worksheet data named u'PDVPlot' (type 0x02 = Chart) 
NOTE *** Ignoring non-worksheet data named u'ConsumptionPlot' (type 0x02 = Chart) 
>>> xd.sheet_names 
[u'Data', u'Consumption', u'Calculations'] 
>>> df = xd.parse(xd.sheet_names[-1], header=None) 
>>> df 
                                   0    1    2    3         4
0        Average Real Interest Rate:  NaN  NaN  NaN  1.028826
1    Geometric Average Stock Return:  NaN  NaN  NaN  0.065533
2              exp(geo. Avg. return)  NaN  NaN  NaN  0.067728
3  Geometric Average Dividend Growth  NaN  NaN  NaN  0.012025
+1

Perfect, thanks so much for the clear and prompt reply. – benjaminmgross 2013-03-23 16:07:26
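The session above is Python 2 (`urllib2` no longer exists in Python 3). A rough Python 3 equivalent of the download step, assuming you want to keep everything in memory rather than use a temporary file, could look like this. The name `fetch_to_buffer` and the `timeout` default are illustrative choices, not part of any library.

```python
import io
import urllib.request


def fetch_to_buffer(url, timeout=30):
    """Download `url` and return its bytes in a seekable in-memory buffer.

    `fetch_to_buffer` is a hypothetical helper name used for this sketch.
    """
    # urlopen returns a file-like response; read it fully into memory.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # BytesIO gives a seekable file-like object, which Excel parsers expect.
        return io.BytesIO(response.read())
```

The buffer can then be handed to pandas just like the socket in the answer, e.g. `xd = pd.ExcelFile(fetch_to_buffer(link))`, followed by `xd.parse(xd.sheet_names[-1], header=None)`. Reading an .xls file this way still assumes that xlrd is installed.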

0

You can pass the URL to pandas.read_excel():

import pandas as pd 

link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls' 
# 'sheetname' is a placeholder; use one of the workbook's sheet names, e.g. 'Calculations'
data = pd.read_excel(link, 'sheetname') 