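Print all occurrences of certain document elements of a web page

I am scraping this particular web page, https://www.zomato.com/srijata, for all the "restaurant reviews" posted by the user "Sri" (not the self-comments on her own reviews).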
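import urllib2
from bs4 import BeautifulSoup  # assuming BeautifulSoup 4 (bs4)
# Download the profile page once and keep a copy on disk
zomato_ind = urllib2.urlopen('https://www.zomato.com/srijata')
zomato_info = zomato_ind.read()
with open('zomato_info.html', 'w') as f:
    f.write(zomato_info)
# Parse the saved copy; find() returns only the first matching element
soup = BeautifulSoup(open('zomato_info.html'), 'html.parser')
soup.find('div', 'mtop0 rev-text').text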
This prints her first restaurant review, i.e. "Sri reviewed Big Straw - Chew On This", as:
u'Rated This is situated right in the heart of the city. The items on the menu are alright and I really had to compromise for bubble tea. The tapioca was not fresh. But the latte and the soda pop my friends tried was good. Another issue which I faced was mosquitos... They almost had me.. Lol..'
I also tried another selector:
My question is:
How do I print the next restaurant review? I tried findNextSiblings and similar methods, but none of them seemed to work.
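One possible approach, sketched here under assumptions: `find` returns only the first match, and sibling navigation probably fails because each review's `div` sits inside its own container rather than next to the other reviews. Collecting every matching `div` sidesteps that. The `SAMPLE_HTML` string and the `extract_reviews` helper below are hypothetical stand-ins for the saved page; the real Zomato markup may differ. The sketch is written for Python 3 with bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for zomato_info.html; the real markup may differ.
SAMPLE_HTML = """
<div class="review"><div class="mtop0 rev-text">Rated First review text.</div></div>
<div class="review"><div class="rev-text mtop0">Rated Second review text.</div></div>
<div class="review"><div class="other">Not a review.</div></div>
"""

def extract_reviews(html):
    """Return the text of every div that carries both review classes."""
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector matches both classes regardless of their order,
    # unlike find('div', 'mtop0 rev-text'), which compares the exact string.
    return [div.get_text(strip=True) for div in soup.select("div.mtop0.rev-text")]

reviews = extract_reviews(SAMPLE_HTML)
for number, text in enumerate(reviews, 1):
    print(number, text)
```

With the page from the question, the same helper could be called on the contents of `zomato_info.html` instead of `SAMPLE_HTML`.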
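Why save the HTML to a file and then read that file into a soup object? – 2014-10-01 12:22:02
It is a measure I took to avoid hitting the website repeatedly, so that I stay within its safeguards against scraping! – shalini 2014-10-02 05:41:56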
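The caching idea in the comment above could be sketched roughly as follows: download only when no saved copy exists, and read from disk otherwise. The `load_cached` helper and the `fake_fetch` stand-in are hypothetical names introduced for illustration; in real use the fetch function would perform the actual download.

```python
import os
import tempfile

def load_cached(path, fetch):
    """Return the page saved at `path`, calling `fetch()` only if it is missing."""
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as f:
            f.write(fetch())
    with open(path, encoding="utf-8") as f:
        return f.read()

calls = []

def fake_fetch():
    # Hypothetical stand-in for the real download; records each invocation.
    calls.append(1)
    return "<html>cached page</html>"

cache_path = os.path.join(tempfile.mkdtemp(), "zomato_info.html")
first = load_cached(cache_path, fake_fetch)   # triggers one "download"
second = load_cached(cache_path, fake_fetch)  # served from the saved file
```

Passing the fetch function in as an argument keeps the network call in one place and makes it easy to confirm that the site is contacted only once.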