2013-12-17 72 views
11

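Using openpyxl to read a file from memory

I downloaded a Google spreadsheet as an object in Python.

How can I use openpyxl to work with the workbook without first saving it to disk?

I know xlrd can do this: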

book = xlrd.open_workbook(file_contents=downloaded_spreadsheet.read()) 

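where "downloaded_spreadsheet" is my downloaded xlsx file as an object.

Instead of xlrd, I want to use openpyxl because of its better xlsx support (so I have read).

This is what I am using so far...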

#!/usr/bin/python

import re, urllib, urllib2

import openpyxl
import xlrd
# which to use..?

class Spreadsheet(object):
    def __init__(self, key):
        super(Spreadsheet, self).__init__()
        self.key = key

class Client(object):
    def __init__(self, email, password):
        super(Client, self).__init__()
        self.email = email
        self.password = password

    def _get_auth_token(self, email, password, source, service):
        url = "https://www.google.com/accounts/ClientLogin"
        params = {
            "Email": email, "Passwd": password,
            "service": service,
            "accountType": "HOSTED_OR_GOOGLE",
            "source": source
        }
        req = urllib2.Request(url, urllib.urlencode(params))
        return re.findall(r"Auth=(.*)", urllib2.urlopen(req).read())[0]

    def get_auth_token(self):
        source = type(self).__name__
        return self._get_auth_token(self.email, self.password, source, service="wise")

    def download(self, spreadsheet, gid=0, format="xls"):
        url_format = "https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=%s&exportFormat=%s&gid=%i"
        headers = {
            "Authorization": "GoogleLogin auth=" + self.get_auth_token(),
            "GData-Version": "3.0"
        }
        req = urllib2.Request(url_format % (spreadsheet.key, format, gid), headers=headers)
        return urllib2.urlopen(req)

if __name__ == "__main__":
    email = "[email protected]" # (your email here)
    password = '.....'
    spreadsheet_id = "......" # (spreadsheet id here)

    # Create client and spreadsheet objects
    gs = Client(email, password)
    ss = Spreadsheet(spreadsheet_id)

    # Request a file-like object containing the spreadsheet's contents
    downloaded_spreadsheet = gs.download(ss)

    # book = xlrd.open_workbook(file_contents=downloaded_spreadsheet.read(), formatting_info=True)

    # It works, but alas xlrd doesn't support the xlsx functionality that I want,
    # i.e. being able to read the cell color data.

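I hope someone can help, because I have been struggling for months to get the color data of a given cell in a Google spreadsheet. (I know the Google API doesn't support it.)

Answers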

21

The documentation for load_workbook says:

#:param filename: the path to open or a file-like object 

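..so it has been capable of this all along. It reads a path or takes a file-like object. I only had to convert the file-like object returned by urlopen into a byte stream with:

from io import BytesIO
from openpyxl import load_workbook
# input_excel is the file-like object returned by urlopen
wb = load_workbook(filename=BytesIO(input_excel.read()))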

and now I can read every piece of data in my Google spreadsheet.
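
For anyone wondering why the BytesIO step is needed: an .xlsx file is a zip archive, and opening a zip archive starts by seeking to the end of the data. A network response is typically a one-way stream with no seek support, so it has to be buffered in memory first. The sketch below illustrates this with the standard library only (Python 3). OneWayStream, make_zip_bytes and list_zip_members are hypothetical names made up for this illustration; OneWayStream merely stands in for the object urlopen returns, and a tiny zip archive stands in for the downloaded spreadsheet.

```python
import io
import zipfile


class OneWayStream(io.RawIOBase):
    """Hypothetical stand-in for a network response: readable, not seekable."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, b):
        # RawIOBase builds read() and readall() on top of readinto().
        return self._buf.readinto(b)


def make_zip_bytes():
    """Build a small zip archive in memory (a stand-in for an .xlsx file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


def list_zip_members(fileobj):
    """Return the member names, or None if the object cannot be opened as a zip."""
    try:
        with zipfile.ZipFile(fileobj) as zf:
            return zf.namelist()
    except (zipfile.BadZipFile, OSError):
        return None


payload = make_zip_bytes()

# Passing the one-way stream directly fails, because zipfile needs seek().
direct = list_zip_members(OneWayStream(payload))

# Reading everything into BytesIO first yields a seekable in-memory file.
buffered = list_zip_members(io.BytesIO(OneWayStream(payload).read()))

print(direct)
print(buffered)
```

The same reasoning suggests why a file opened from disk with open(..., 'rb') can be passed without BytesIO: it already supports seek.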

+0

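+1 - I made a similar mistake. I only read the first half and assumed it could only read files. Now that I have gone back and read the rest, I see it can also take file-like objects.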

1

Actually, it is enough to do:

import openpyxl
with open('path/to/file.xlsx', 'rb') as f:
    wb = openpyxl.load_workbook(filename=f)

and it will work. No need for BytesIO and the like.

+1

As the question shows, it is not being read from the file system. It is a stream.
