Scraping Wikipedia with wikipedia 1.4.0: how to skip bad results?

I am using the wikipedia package for Python 2.7 to scrape articles, using words taken from a very large dataset.

The code is below:

for node_id in top_k: 
    human_string = label_lines[node_id] 
    score = predictions[0][node_id] 
    print('%s (score = %.5f)' % (human_string, score))  


    # Wiki = wikipedia.page(human_string) 
    # print (Wiki.content) 

    lista.append(human_string) 

for i in xrange(5): 
    wiki = wikipedia.page(lista[i]) 
    print (wiki.content) 
    a = wiki.content 
    #appendowanie = '%s (score = %.5f)' % (human_string, score) 
    # appendowanie = str(human_string) 
    appendFile = open('/home/inception/wikipedia.txt', 'a') 
    appendFile.write('\n\n'+str(i)) 
    appendFile.write(a.encode("utf-8")) 
    appendFile.close()

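With this I want to take 5 items from the list, search for each of them on Wikipedia, and scrape the whole article into the wikipedia.txt file. Sometimes the Wikipedia lookup gives me an error because a word from the list is not found. Example error: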

Traceback (most recent call last):
  File "label_image.py", line 68, in <module>
    wiki = wikipedia.page(lista[i])
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 345, in __load
    raise PageError(self.title)
wikipedia.exceptions.PageError: Page id "gracile crown blackbird" does not match any pages. Try another id!

gracile crown blackbird

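I want to change the script so that it ignores the words that the wikipedia scraper fails to load. Is there a way to have one script find all the words that cause errors?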

2017-01-18 Piteight

Use a try-except block:

try:
    wiki = wikipedia.page(lista[i])  # get the article
except wikipedia.exceptions.PageError as e:
    if "does not match any pages" in str(e):
        pass  # ignore the error
    else:
        # Some other error occurred, so do not ignore it:
        raise
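Applied to the loop from the question, the same idea might look like the sketch below. It also records the titles that failed, which covers the "find all the words that cause errors" part. Unlike the snippet above, it skips on any exception of the given class without checking the message, so it assumes that every PageError means "not found". The name fetch_articles and its parameters are made up for illustration; the page loader and the exception class are passed in as arguments so the helper itself does not need network access.

```python
def fetch_articles(titles, fetch_page, not_found_error):
    """Load each title; return (articles, failed).

    fetch_page      -- callable that takes a title and returns an object
                       with a .content attribute (hypothetical parameter)
    not_found_error -- exception class to treat as "skip this title"
                       (hypothetical parameter)
    """
    articles = {}   # title -> article text
    failed = []     # titles that could not be loaded
    for title in titles:
        try:
            page = fetch_page(title)
        except not_found_error:
            # Remember the bad title and move on to the next one.
            failed.append(title)
            continue
        articles[title] = page.content
    return articles, failed
```

With the real library this would presumably be called as fetch_articles(lista[:5], wikipedia.page, wikipedia.exceptions.PageError); the failed list then holds every label that did not resolve to a page.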

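Now, this is not 100% reliable, because a page could, in theory, be named "does not match any pages".

So you really need to inspect the exception captured in the variable e and look only at its message, or check whether it carries an error number or something similar.

That matters because I think PageError() can be raised for more than just a page that was not found.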

I don't know how the PageError() exception is implemented, but maybe:

e.msg

or

e.message

will give you the actual message to check, instead of searching in str(e).
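One way to see what the caught exception actually carries, without guessing attribute names, is to dump its type, its str() text, and its instance attributes. This is only a minimal sketch; describe_exception is a hypothetical helper, and nothing is assumed here about which attributes PageError really sets.

```python
def describe_exception(e):
    """Collect the type name, str() text, and instance attributes of e."""
    return {
        "type": type(e).__name__,
        "text": str(e),
        # vars() returns the attributes set on this particular instance.
        "attributes": dict(vars(e)),
    }
```

Calling print(describe_exception(e)) inside the except block should show which fields are available to test against.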

2017-01-18 21:44:59 Dalen

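Thanks, I think that's it. I don't get the 'raise' part: should I put the other error messages in the 'else'? In the if statement I added 'wiki = wikipedia.page(lista[i + 1])' to get the next article. I need to write more complex code. There is one kind of error message that gives me a list of possible Wikipedia articles. I think there should be an option to grab the first one and read that article. – Piteight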
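The error that lists possible articles appears to be wikipedia.exceptions.DisambiguationError, which keeps the candidate titles in an options attribute. Assuming that, "grab the first one" might be sketched as follows. page_or_first_option is a hypothetical helper, and the loader and exception class are passed in as arguments.

```python
def page_or_first_option(title, fetch_page, disambiguation_error):
    """Load title; on a disambiguation error, load the first listed option.

    Assumes the exception exposes the candidate titles as .options.
    """
    try:
        return fetch_page(title)
    except disambiguation_error as e:
        options = getattr(e, "options", [])
        if not options:
            # Nothing to fall back on, so let the error propagate.
            raise
        # The first option may itself fail to load; that error is not
        # handled here and will reach the caller.
        return fetch_page(options[0])
```

With the real library the call would presumably be page_or_first_option(lista[i], wikipedia.page, wikipedia.exceptions.DisambiguationError). Note that the first option is not necessarily the article that was meant.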

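You can write: raise e, if you think it looks better. But a bare raise re-raises the error that the try block caught. Go to your Python site-packages directory and read wikipedia/exceptions.py to see exactly how PageError() works and which attributes it will have in which case. There is also the documentation. You may be able to use wikipedia.search() instead of calling page directly. – Dalen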
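The wikipedia.search() suggestion could be sketched like this: look the label up first and only load a page when the search returns something. first_search_hit is a hypothetical helper; it assumes the search function returns a list of title strings, which is how wikipedia.search() appears to behave.

```python
def first_search_hit(query, search):
    """Return the first title found for query, or None when nothing matches."""
    results = search(query)
    if not results:
        return None
    return results[0]
```

A possible use is title = first_search_hit(lista[i], wikipedia.search), followed by wikipedia.page(title) only when title is not None. Loading the page can still fail, so the try-except from the answer remains useful.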
