Scraping JavaScript text from a webpage with Python

I am currently trying to extract the latitude and longitude from the TripAdvisor pages of individual restaurants. I am looking through the HTML of this restaurant in Hong Kong.

Restaurant I am attempting to scrape from

In the HTML, I found this:

HTML Code with the Latitude and Longitude

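I want to scrape the latitude and longitude from here, but I can't seem to get them out when I try to print them. My code is below; any suggestions would help.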

# import libraries
import requests
from bs4 import BeautifulSoup
import csv

# loop over the listing pages; entries are in increments of 30 per page
# (raise the stop value of range() when you want more than 30)
for offset in range(0, 1, 30):
    # the URL format offsets the restaurants in increments of 30 after the "oa"
    url1 = 'https://www.tripadvisor.com/Restaurants-g294217-oa' + str(offset) + '-Hong_Kong.html#EATERY_LIST_CONTENTS'
    r1 = requests.get(url1)
    soup1 = BeautifulSoup(r1.text, "html.parser")
    # restaurant links carry the CSS class "property_title"
    for link in soup1.find_all('a', class_='property_title'):
        # assumes each href is a site-relative path such as /Restaurant_Review-g294217-...
        restaurant_url = 'https://www.tripadvisor.com' + link.get('href')
        r2 = requests.get(restaurant_url)
        soup2 = BeautifulSoup(r2.text, "html.parser")
        # print every inline script whose text mentions "lat"
        for script in soup2.find_all('script'):
            if script.string and 'lat' in script.string:
                print(script.string)
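For illustration, here is a minimal sketch of how the two values might be pulled out of a script's text once it is available. It assumes the script contains keys named `lat` and `lng`, each followed by `:` or `=` and a decimal number; the real markup is only shown in the screenshot, so the pattern and the helper name `extract_coordinates` are hypothetical.

```python
import re

# Assumed pattern: a key named lat or lng, an optional quote, ':' or '=',
# an optional quote, then a (possibly negative) decimal number.
COORD_RE = re.compile(r'\b(lat|lng)\b["\']?\s*[:=]\s*["\']?(-?\d+(?:\.\d+)?)')


def extract_coordinates(script_text):
    """Return (lat, lng) as floats, or None if either value is missing."""
    found = {}
    for key, value in COORD_RE.findall(script_text):
        # keep only the first occurrence of each key
        found.setdefault(key, float(value))
    if 'lat' in found and 'lng' in found:
        return found['lat'], found['lng']
    return None
```

Calling `extract_coordinates(script.string)` inside the innermost loop would then give a tuple to print or write to the CSV file, provided the assumed key names match the page.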


2016-11-10 dtrinh

To scrape a JavaScript-powered page, you need to use Selenium.
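A rough sketch of what that could look like, assuming Selenium is installed along with Firefox and its driver. The helper names `rendered_html` and `fetch_with_firefox` are made up for this example; only `webdriver.Firefox()`, `get()`, `page_source` and `quit()` come from Selenium itself.

```python
def rendered_html(driver, url):
    """Load url in the given WebDriver and return the page source as the browser sees it."""
    driver.get(url)
    return driver.page_source


def fetch_with_firefox(url):
    """Open a Firefox session, fetch the page, and always close the browser."""
    # Imported here so rendered_html() stays usable without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumes Firefox and its driver are installed
    try:
        return rendered_html(driver, url)
    finally:
        driver.quit()
```

The string returned by `fetch_with_firefox(restaurant_url)` can be handed to `BeautifulSoup(..., "html.parser")` in place of `r2.text`. Scripts that finish after the page's load event may still need an explicit wait, which this sketch leaves out.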


2016-11-10 19:12:04 amirouche

Selenium requires Python 3.4 or later, correct? – dtrinh

No, it is available for Python 2.7 – amirouche
