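Python BeautifulSoup web scraping: the same code works on and off
I run the same code to fetch text from a web page, but most of the time it prints "WARNING:root:Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER." Surprisingly, it does work occasionally: out of 12 runs, 1 succeeded.
Same code, same URL. Why does this happen?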
from bs4 import BeautifulSoup
import re
import urllib2

url = "http://nz.sports.search.yahoo.com/search?p=basketball&fr=sports-nz-ss&age=1w&focuslim=age"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())

# Print every <span class="url"> element found in the results page
web_p = soup.find_all('span', class_='url')
for web in web_p:
    print web
The details of the error are as follows:
Traceback (most recent call last):
  File "C:\Python27\lib\idlelib\run.py", line 112, in main
    seq, request = rpc.request_queue.get(block=True, timeout=0.05)
  File "C:\Python27\lib\Queue.py", line 176, in get
    raise Empty
Empty
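One way to narrow this down might be to decode the response body explicitly instead of handing raw bytes to BeautifulSoup, and to check whether the result contains U+FFFD (the REPLACEMENT CHARACTER named in the warning). The sketch below is only a diagnostic under assumptions: `decode_body` is a hypothetical helper, not part of bs4 or urllib2, and the idea that the server's responses differ between runs (for example in compression or declared charset) is a guess, not something confirmed here. It is written to run on both Python 2 and Python 3.

```python
import gzip
import io


def decode_body(raw, content_encoding=None, charset=None):
    """Hypothetical helper: return (text, had_replacements) for a raw HTTP body.

    raw              -- the bytes returned by page.read()
    content_encoding -- value of the Content-Encoding header, if any
    charset          -- charset declared in the Content-Type header, if any
    """
    # Undo gzip compression if the header says so or the gzip magic bytes are present.
    if content_encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.GzipFile(fileobj=io.BytesIO(raw)).read()

    # Decode with the declared charset; fall back to UTF-8 (an assumption).
    # Undecodable bytes become U+FFFD rather than raising an exception.
    text = raw.decode(charset or "utf-8", "replace")

    # Report whether any byte had to be replaced.
    return text, u"\ufffd" in text


# Possible use with the code from the question (Python 2, untested against this URL):
#   page = urllib2.urlopen(url)
#   info = page.info()
#   text, lossy = decode_body(page.read(),
#                             info.getheader("Content-Encoding"),
#                             info.getparam("charset"))
#   soup = BeautifulSoup(text)
```

Logging `lossy` together with the two header values on each run would show whether the failing runs receive a different kind of response than the run that succeeds.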
Please post the traceback that appears when the error is raised. – tsroten
Possible duplicate of [Beautiful Soup gets warning and then error halfway through code](http://stackoverflow.com/questions/17688063/beautiful-soup-gets-warning-and-then-error-halfway-through-code) – isedev