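Scraping JavaScript text from a web page with Python
I am trying to extract the latitude and longitude from the TripAdvisor page of each restaurant. I am currently looking at the HTML for this restaurant in Hong Kong.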
Restaurant I am attempting to scrape from
In the HTML, I found this:
HTML Code with the Latitude and Longitude
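I want to scrape the latitude and longitude from here, but I cannot get them out when I try to print them. My code is below; any suggestions would be helpful.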
# import libraries
import requests
from bs4 import BeautifulSoup
import csv

# loop over the listing pages; entries are in increments of 30 per page,
# so raise the stop value of range() when you want more than 30
for offset in range(0, 30, 30):
    # url format offsets the restaurants in increments of 30 after the oa
    url1 = ('https://www.tripadvisor.com/Restaurants-g294217-oa' + str(offset)
            + '-Hong_Kong.html#EATERY_LIST_CONTENTS')
    r1 = requests.get(url1)
    soup1 = BeautifulSoup(r1.text, "html.parser")
    # restaurant links carry the CSS class "property_title"
    for link in soup1.find_all('a', class_='property_title'):
        restaurant_url = 'https://www.tripadvisor.com/Restaurant_Review-g294217-' + link.get('href')
        r2 = requests.get(restaurant_url)
        soup2 = BeautifulSoup(r2.text, "html.parser")
        # attribute filters must be keyword arguments or a dict, not a set;
        # script.string is None for scripts without inline text
        for script in soup2.find_all('script', type='text/javascript'):
            if script.string and 'lat' in script.string:
                print(script.string)
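Once the right script text is in hand, one possible way to pull the two numbers out is a regular expression. The following is only a minimal sketch: the sample script text, the `lat`/`lng` key names, and the helper `extract_lat_lng` are assumptions made for illustration, since the exact markup on the TripAdvisor page is not reproduced here and may differ.

```python
import re

# Hypothetical sample of inline script text; the real page markup may differ.
SAMPLE_SCRIPT = """
var mapData = {
    lat: 22.28552,
    lng: 114.15769,
    zoom: 15
};
"""

# Matches `lat: <number>` or `"lat": <number>` (likewise for lng).
# The word boundaries keep it from matching names such as "latitude" or "flat".
COORD_RE = re.compile(r'\b(lat|lng)\b"?\s*[:=]\s*"?(-?\d+(?:\.\d+)?)')


def extract_lat_lng(script_text):
    """Return (lat, lng) as floats, or None if either key is missing."""
    found = {}
    for key, value in COORD_RE.findall(script_text or ""):
        # keep the first occurrence of each key
        found.setdefault(key, float(value))
    if "lat" in found and "lng" in found:
        return found["lat"], found["lng"]
    return None


print(extract_lat_lng(SAMPLE_SCRIPT))
```

Inside the scraping loop, `extract_lat_lng(script.string)` could be called for each script tag, skipping the ones that return `None`. If the coordinates are only written into the page by JavaScript after it loads, they will not appear in the HTML that `requests` returns, and this approach will find nothing.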
Is it correct that Selenium requires Python 3.4 or later? – dtrinh
No, it is available in Python 2.7 – amirouche