2015-08-28 47 views

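Web scraping Google search results with BeautifulSoup

My goal is to scrape Google search results with BeautifulSoup. I am using Anaconda Python, with IPython as the IDE console. Why do I get no output when I run the following code?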

import urllib
import urllib2
from bs4 import BeautifulSoup  # assumed import; the original post does not show it

def google_scrape(query):
    address = "http://www.google.com/search?q=%s&num=100&hl=en&start=0" % (urllib.quote_plus(query))
    request = urllib2.Request(address, None, {'User-Agent':'Mosilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11'})
    urlfile = urllib2.urlopen(request)
    page = urlfile.read()
    soup = BeautifulSoup(page)

    linkdictionary = {}

    # Print the first link and the snippet of every <li class="g"> element
    for li in soup.findAll('li', attrs={'class':'g'}):
        sLink = li.find('a')
        print sLink['href']
        sSpan = li.find('span', attrs={'class':'st'})
        print sSpan

    # Nothing was stored above, so this is always an empty dict
    return linkdictionary

if __name__ == '__main__':
    links = google_scrape('english')

Answers


You never add anything to linkdictionary, so the function always returns an empty dict.

import urllib
import urllib2
from bs4 import BeautifulSoup  # assumed import; the original post does not show it

def google_scrape(query):
    address = "http://www.google.com/search?q=%s&num=100&hl=en&start=0" % (urllib.quote_plus(query))
    request = urllib2.Request(address, None, {'User-Agent':'Mosilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11'})
    urlfile = urllib2.urlopen(request)
    page = urlfile.read()
    soup = BeautifulSoup(page)

    linkdictionary = {}

    for li in soup.findAll('li', attrs={'class':'g'}):
        sLink = li.find('a')
        sSpan = li.find('span', attrs={'class':'st'})

        # Store what was found; the name must match the dict created above.
        # Note: these fixed keys are overwritten on every pass, so only the
        # last result is kept.
        linkdictionary['href'] = sLink['href']
        linkdictionary['sSpan'] = sSpan

    return linkdictionary

if __name__ == '__main__':
    links = google_scrape('english')
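
The idea of adding something to the dictionary on each pass can be sketched with one entry per result, keyed by its link, so that earlier results are not overwritten. The sketch below is only an illustration under several assumptions: it is written for Python 3, it uses the standard library's HTMLParser in place of BeautifulSoup so that it runs without extra packages, and SAMPLE_HTML, ResultCollector and collect_results are hypothetical names and made-up markup, not anything from the thread. Real result pages may not use li class="g" or span class="st" at all.

```python
from html.parser import HTMLParser

# Made-up sample markup that mimics the structure the thread's code looks for.
SAMPLE_HTML = """
<ol>
  <li class="g"><a href="http://example.com/a">First</a>
      <span class="st">Snippet for the first result</span></li>
  <li class="g"><a href="http://example.com/b">Second</a>
      <span class="st">Snippet for the second result</span></li>
  <li class="other"><a href="http://example.com/ad">Ignored</a></li>
</ol>
"""


class ResultCollector(HTMLParser):
    """Collects {href: snippet text} for every <li class="g"> element."""

    def __init__(self):
        super().__init__()
        self.results = {}          # one entry per result, keyed by its link
        self._in_result = False    # currently inside <li class="g">?
        self._in_snippet = False   # currently inside <span class="st">?
        self._href = None
        self._snippet = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "li" and "g" in classes:
            self._in_result, self._href, self._snippet = True, None, []
        elif self._in_result and tag == "a" and self._href is None:
            self._href = attrs.get("href")  # first link that has an href
        elif self._in_result and tag == "span" and "st" in classes:
            self._in_snippet = True

    def handle_data(self, data):
        if self._in_snippet:
            self._snippet.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._in_snippet:
            self._in_snippet = False
        elif tag == "li" and self._in_result:
            if self._href is not None:
                # Add a new key per result instead of reusing fixed keys.
                self.results[self._href] = "".join(self._snippet).strip()
            self._in_result = False


def collect_results(html):
    """Parse an HTML string and return the collected {href: snippet} dict."""
    parser = ResultCollector()
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    for href, snippet in collect_results(SAMPLE_HTML).items():
        print(href, "->", snippet)
```

With BeautifulSoup the same accumulation would presumably be a single line inside the existing loop, for example storing sSpan under the key sLink['href'].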

What should I add to it? – Umayangani


Whatever you want to return. –


I want to scrape the Google results I get for the word 'english'. Could you show me how to edit this code to do that? – Umayangani
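
The code in this thread is Python 2 (urllib.quote_plus, urllib2.Request). For readers on Python 3, the request for a given search word could be built roughly as below. This is a sketch, not a tested scraper: build_search_request and USER_AGENT are hypothetical names, the header value is only an example, and the request is constructed but never sent.

```python
from urllib.parse import quote_plus
from urllib.request import Request

# Example header value (an assumption); any browser-like string could be used.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_search_request(query, num=100, start=0):
    """Build (but do not send) a request for a search results page."""
    address = "http://www.google.com/search?q=%s&num=%d&hl=en&start=%d" % (
        quote_plus(query),  # escapes spaces and special characters
        num,
        start,
    )
    return Request(address, None, {"User-Agent": USER_AGENT})


request = build_search_request("english language")
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would fetch the page, and its contents could then be handed to BeautifulSoup as in the code above.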