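I wrote a crawler to fetch information from a Q&A website. Since not all fields are always present on a page, I used multiple try-excepts to handle that case. Catching the exception raises an UnboundLocalError.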
    def answerContentExtractor(loginSession, questionLinkQueue, answerContentList):
        while True:
            URL = questionLinkQueue.get()
            try:
                response = loginSession.get(URL, timeout=MAX_WAIT_TIME)
                raw_data = response.text
                # These fields must exist, or something went wrong...
                questionId = re.findall(REGEX, raw_data)[0]
                answerId = re.findall(REGEX, raw_data)[0]
                title = re.findall(REGEX, raw_data)[0]
            except requests.exceptions.Timeout ,IndexError:
                print >> sys.stderr, URL + " extraction error..."
                questionLinkQueue.task_done()
                continue
            try:
                questionInfo = re.findall(REGEX, raw_data)[0]
            except IndexError:
                questionInfo = ""
            try:
                answerContent = re.findall(REGEX, raw_data)[0]
            except IndexError:
                answerContent = ""
            result = {
                'questionId': questionId,
                'answerId': answerId,
                'title': title,
                'questionInfo': questionInfo,
                'answerContent': answerContent
            }
            answerContentList.append(result)
            questionLinkQueue.task_done()
This code sometimes, but not always, raises the following exception at runtime:
UnboundLocalError: local variable 'IndexError' referenced before assignment
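The line number in the traceback points to the second `except IndexError:`.
Thanks to everyone for your suggestions about this error. I would love to give each of you the credit you deserve; too bad I can only mark one answer as correct...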
Typo; I typed it in by hand to strip out some unneeded lines.. edited..
Related: [Catch multiple exceptions in one line (except block)](http://stackoverflow.com/questions/6470428/catch-multiple-exceptions-in-one-line-except-block?rq=1) – thefourtheye
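
For readers following the related link: in Python 2, `except A, B:` is parsed as `except A as B:`, so the first handler in the question treats `IndexError` as a local variable name rather than as a second exception type. That would explain why a later `except IndexError:` in the same function fails with UnboundLocalError whenever the first handler has not run. A minimal sketch of the parenthesized-tuple form is below; `extract_required`, `first_match`, and the `id=` pattern are hypothetical names made up for illustration, not part of the original crawler.

```python
import re


def extract_required(text):
    """Return the numeric id found in `text`, or None if it is missing or malformed."""
    try:
        # Hypothetical pattern; the original post elides its real regexes.
        return int(re.findall(r"id=(\w+)", text)[0])
    except (ValueError, IndexError):
        # The parenthesized tuple catches either type and binds no local name,
        # so IndexError still refers to the builtin elsewhere in this function.
        return None


def first_match(pattern, text, default=""):
    """Return the first regex match, or `default` when nothing matches."""
    try:
        return re.findall(pattern, text)[0]
    except IndexError:
        return default
```

The tuple form is valid in both Python 2 and Python 3. A small helper like `first_match` is one possible way to avoid repeating a try-except for every optional field.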