BeautifulSoup script for parsing Google search results stopped working

I want to parse Google search results with Python. Everything worked perfectly, but now I keep getting an empty list. Here is the code that used to work:

import urllib
import BeautifulSoup

query = urllib.urlencode({'q': self.Tagsinput.GetValue()+footprint,'ie': 'utf-8', 'num':searchresults, 'start': '100'})
result = url + query
myopener = MyOpener()
page = myopener.open(result)
xss = page.read()
soup = BeautifulSoup.BeautifulSoup(xss)
contents = [x['href'] for x in soup.findAll('a', attrs={'class':'l'})]

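This script ran perfectly in December and has now stopped working.

As far as I can tell, the problem is in this line:

contents = [x['href'] for x in soup.findAll('a', attrs={'class':'l'})]

When I print contents, the program returns an empty list: []

Please, can anyone help?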

2011-02-03 Slava

Are you trying to make automated requests to the normal Google search web interface? You shouldn't be surprised if they block you; use their API. – geoffspear 2011-02-03 14:26:32
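
An empty list from findAll can mean either that the request was blocked or that the result links no longer carry class 'l'. One way to tell the two apart is to tally the class names on the anchors of the page that actually came back. The following is a minimal sketch in Python 3 using only the standard library (the question's code is Python 2 with BeautifulSoup 3); the names AnchorClassCounter and anchor_class_counts and the sample markup are made up for illustration and are not real Google HTML.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorClassCounter(HTMLParser):
    """Tally the class names that appear on <a href=...> tags."""

    def __init__(self):
        super().__init__()
        self.class_counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        attr_map = dict(attrs)
        if 'href' not in attr_map:
            return
        # An anchor may carry several space-separated classes, or none.
        classes = (attr_map.get('class') or '').split() or ['(no class)']
        self.class_counts.update(classes)


def anchor_class_counts(html):
    """Return a Counter mapping class name -> number of linked anchors."""
    parser = AnchorClassCounter()
    parser.feed(html)
    parser.close()
    return parser.class_counts


# Hypothetical page fragment for illustration, not real Google markup.
sample_html = (
    '<a class="l" href="http://example.com/a">A</a>'
    '<a class="l" href="http://example.com/b">B</a>'
    '<a class="nav" href="/next">Next</a>'
    '<a href="/plain">Plain</a>'
)
counts = anchor_class_counts(sample_html)
print(counts.most_common())
```

Feeding it the decoded page text in place of sample_html would show whether 'l' is still present. If it is missing while plenty of other anchors are listed, the markup has probably changed; if the page has almost no anchors, it may be a block or error page.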

The API also gives better results. It returns simple JSON that you can easily parse and manipulate.

import urllib, json
BASE_URL = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'
# SearchTerm is presumably the query as a unicode string, defined elsewhere
url = BASE_URL + urllib.urlencode({'q' : SearchTerm.encode('utf-8')})
raw_res = urllib.urlopen(url).read()
results = json.loads(raw_res)
# Take the first hit and format it as "url - title"
hit1 = results['responseData']['results'][0]
prettyresult = ' - '.join((urllib.unquote(hit1['url']), hit1['titleNoFormatting']))
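
The snippet above reads only the first hit and would raise an exception if the response held no results. As a rough sketch, the same parsing can be written to walk every result and to return an empty list when nothing usable comes back. This version is Python 3 and runs offline on made-up payloads that only mimic the keys used above ('responseData', 'results', 'url', 'titleNoFormatting'); the helper name extract_hits is hypothetical, and treating a null 'responseData' as "no results" is an assumption, not documented behaviour.

```python
import json
from urllib.parse import unquote


def extract_hits(raw_res):
    """Return (url, title) pairs from a response shaped like the one above."""
    payload = json.loads(raw_res)
    # Assumption: a failed request may carry null instead of a results object.
    data = payload.get('responseData') or {}
    return [
        (unquote(hit['url']), hit['titleNoFormatting'])
        for hit in data.get('results', [])
    ]


# Made-up payloads for illustration; they are not captured API output.
sample_ok = json.dumps({
    'responseData': {'results': [
        {'url': 'http://example.com/a%20b', 'titleNoFormatting': 'First'},
        {'url': 'http://example.com/c', 'titleNoFormatting': 'Second'},
    ]},
})
sample_empty = json.dumps({'responseData': None})

for url, title in extract_hits(sample_ok):
    print(' - '.join((url, title)))
print(extract_hits(sample_empty))
```

Returning a list of pairs keeps formatting separate from parsing, so the caller decides how to display or store each hit.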

2011-02-03 15:04:53 chmullig
