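Sorting error when using google-docs-api
This is a fairly long question and I may have left something out, so please ask if you need more information.
I have been using ScraperWiki to scrape data from Google Scholar, and until recently I simply put all the URLs in like this.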
elec_urls = """http://1.hidemyass.com/ip-5/encoded/Oi8vc2Nob2xhci5nb29nbGUuY29tL2NpdGF0aW9ucz91c2VyPWo0YnRpeXNBQUFBSiZobD1lbg%3D%3D&f=norefer
http://4.hidemyass.com/ip-1/encoded/Oi8vc2Nob2xhci5nb29nbGUuY29tL2NpdGF0aW9ucz91c2VyPVZXaFJiZEFBQUFBSiZobD1lbg%3D%3D&f=norefer
http://4.hidemyass.com/ip-2/encoded/Oi8vc2Nob2xhci5nb29nbGUuY29tL2NpdGF0aW9ucz91c2VyPV84X09JSWNBQUFBSiZobD1lbg%3D%3D&f=norefer
http://1.hidemyass.com/ip-4/encoded/Oi8vc2Nob2xhci5nb29nbGUuY29tL2NpdGF0aW9ucz91c2VyPUh3WHdmTGtBQUFBSiZobD1lbg%3D%3D&f=norefer
http://4.hidemyass.com/ip-1/encoded/Oi8vc2Nob2xhci5nb29nbGUuY29tL2NpdGF0aW9ucz91c2VyPXU1NWFWZEFBQUFBSiZobD1lbg%3D%3D&f=norefer
""".strip()
elec_urls = elec_urls.splitlines()
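I then scrape each page and put the information I want into a list of dicts, sort it once, remove the duplicates, sort it again using a different key, and then export the information I want to a Google Docs spreadsheet. This works 100%.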
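For anyone following along, the sort / de-duplicate / re-sort step described above can be sketched roughly as follows. This is only an illustration, not my actual scraper code: the function name `sort_dedupe_resort` and the `Title` and `Citations` keys are made up for the example (`CitationURL` is the only key that appears in my real data).

```python
def sort_dedupe_resort(records, first_key, unique_key, final_key):
    """Sort by first_key, drop repeats of unique_key, then sort by final_key."""
    # First pass: order the dicts by the first key, highest value first.
    ordered = sorted(records, key=lambda r: r[first_key], reverse=True)

    # Keep only the first record seen for each value of unique_key.
    seen = set()
    unique = []
    for record in ordered:
        if record[unique_key] not in seen:
            seen.add(record[unique_key])
            unique.append(record)

    # Second pass: re-sort the de-duplicated records by a different key.
    return sorted(unique, key=lambda r: r[final_key], reverse=True)


# Hypothetical sample data; the real records come from the scraped pages.
papers = [
    {"Title": "A", "Citations": 10, "CitationURL": "u1"},
    {"Title": "B", "Citations": 7, "CitationURL": "u2"},
    {"Title": "A", "Citations": 10, "CitationURL": "u1"},
]
result = sort_dedupe_resort(papers, "Citations", "CitationURL", "Title")
```

The important point is that every element handed to `sorted()` is a dict, so indexing it with a string key works.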
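I am now trying to change it so that I can keep all the URLs in a separate Google Docs spreadsheet and have the scraper read them from there and do the same thing. Here is what I have done so far.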
import gdata.spreadsheet.service

def InputUrls(Entered_doc, EnteredURL):
    username = 'myemail'
    password = 'mypassword'
    doc_name = Entered_doc
    spreadsheet_id = Entered_doc
    worksheet_id = 'od6'
    # Connect to Google
    gd_client = gdata.spreadsheet.service.SpreadsheetsService()
    gd_client.email = username
    gd_client.password = password
    gd_client.source = EnteredURL
    gd_client.ProgrammaticLogin()
    # Now that we're connected, fetch the rows of the worksheet using the spreadsheet and worksheet IDs.
    rows = gd_client.GetListFeed(spreadsheet_id, worksheet_id).entry
    # At this point, rows is an iterator over the spreadsheet rows. Collect the text of every cell, keyed by column name:
    urlslist = []
    for row in rows:
        for key in row.custom:
            urlslist.append(row.custom[key].text)
    return urlslist
def URLStoScrape(ToScrape):
    Dep = []
    for i in range(0,len(ToScrape)):
        Department_urls = ToScrape[i].strip()
        Department_urls = Department_urls.splitlines()
        Done = MainScraper(Department_urls)
        Dep.append(Done)
    return Dep
ElectricalDoc = '0AkGb10ekJtfQdG9EOHN0VzRDdVhWaG1kNVEtdVpyRlE'
ElectricalUrl = 'https://docs.google.com/spreadsheet/ccc? '
ToScrape_Elec = InputUrls(ElectricalDoc, ElectricalUrl)
This seems to work fine, but when the program gets to the sorting step I get the error below.
Traceback (most recent call last):
  File "./code/scraper", line 230, in <module>
    Total_and_Hindex_Electrical = GetTotalCitations(Electrical)
  File "./code/scraper", line 89, in GetTotalCitations
    Wrt_CitationURL = Sorting(Department, "CitationURL")
  File "./code/scraper", line 15, in Sorting
    SortedData = sorted(Unsorted, reverse=True, key=lambda k: k[pivot])
  File "./code/scraper", line 15, in <lambda>
    SortedData = sorted(Unsorted, reverse=True, key=lambda k: k[pivot])
TypeError: list indices must be integers, not str
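I am almost certain it has something to do with the URLStoScrape function, but I don't know how to fix it. Any help would be great.
Thanks, and let me know if you need more information.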
It looks like the pivot variable is a string rather than an integer. Can you post the code for sorted()? – raphonic
Here is the code for the sorting function. def Sorting(Unsorted, pivot): SortedData = sorted(Unsorted, reverse=True, key=lambda k: k[pivot]) return SortedData – Totothejuggler
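One possible explanation, sketched below under an assumption that the question does not confirm: if MainScraper returns a list of dicts for each call, then appending each result in URLStoScrape builds a list of lists, and the lambda in Sorting ends up indexing a list with the string "CitationURL". The sample data and the names `per_url_results`, `flat`, and `sorted_flat` are hypothetical stand-ins, not the real scraper output.

```python
def Sorting(Unsorted, pivot):
    # Same logic as the Sorting function posted in the comment above.
    SortedData = sorted(Unsorted, reverse=True, key=lambda k: k[pivot])
    return SortedData


# Hypothetical stand-in for what Dep might hold: one list of dicts per URL.
per_url_results = [
    [{"CitationURL": "b"}, {"CitationURL": "a"}],
    [{"CitationURL": "c"}],
]

# Sorting the nested structure fails: each k is a list, not a dict,
# so k["CitationURL"] raises TypeError.
try:
    Sorting(per_url_results, "CitationURL")
    nested_failed = False
except TypeError:
    nested_failed = True

# Flattening first means sorted() sees dicts again.
# (Inside URLStoScrape, Dep.extend(Done) would have the same effect.)
flat = [record for page in per_url_results for record in page]
sorted_flat = Sorting(flat, "CitationURL")
```

If MainScraper actually returns something else, the same check still applies: print `type(Unsorted[0])` just before the `sorted()` call to see what the lambda is indexing.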