Instead of the main crime container, this is all that urlopen receives:
<div id="table_container" class="list-group crime-list" style="margin-top: -30px;">
<h3>Loading Crime Data...</h3>
<p>City and county crime map showing crime incident data down to neighborhood crime</p>
</div>
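This is because the main container is built with the help of an additional API call to the http://api.spotcrime.com/crimes.json endpoint, combined with JavaScript logic that runs in the browser.
What you can do is simulate that API call in your own code with requests. Working example: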
    import requests

    url = "http://spotcrime.com/#77801"
    crimes_url = "http://api.spotcrime.com/crimes.json"
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36'}

    with requests.Session() as session:
        session.headers = headers
        # visit the main page first so the session picks up any cookies
        session.get(url)

        # query-string parameters the browser sends to the API endpoint
        params = {
            "lat": "30.6423514",
            "lon": "-96.3704778",
            "radius": "0.02",
            "key": "spotcrime-private-api-key",
            "_": "1435453242689"
        }

        # a GET request carries these in the URL, so pass them as params, not data
        response = session.get(crimes_url, params=params)
        response = response.json()

        for item in response["crimes"]:
            print(item)
It prints one dictionary for each row of the crime table:
{u'cdid': 64482204, u'lon': -96.3661035, u'lat': 30.6507387, u'link': u'http://spotcrime.com/crime/64482204-6737a0085bd9aff31548993910efa35a', u'address': u'2000 BLOCK OF S COLLEGE AV', u'date': u'06/24/15 08:47 PM', u'type': u'Theft'}
{u'cdid': 64482189, u'lon': -96.3594859, u'lat': 30.6299681, u'link': u'http://spotcrime.com/crime/64482189-345f4eca1c977f43e97ea4981f73d4de', u'address': u'3600 BLOCK OF WELLBORN RD', u'date': u'06/24/15 07:32 PM', u'type': u'Vandalism'}
...
{u'cdid': 64370976, u'lon': -96.361556, u'lat': 30.631685, u'link': u'http://spotcrime.com/crime/64370976-dc6e6dbb29fc7376c2b82356c45d281d', u'address': u'3600 BLOCK OF WELLBORN RD #802', u'date': u'06/18/15 12:37 PM', u'type': u'Arrest'}
{u'cdid': 64371003, u'lon': -96.3539954, u'lat': 30.6434707, u'link': u'http://spotcrime.com/crime/64371003-d9934d9b9d83c1867871701874c45523', u'address': u'2900 BLOCK OF S TEXAS AVENUE', u'date': u'06/18/15 09:56 AM', u'type': u'Vandalism'}
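Since each row is a plain dictionary, it can be processed with ordinary Python tools. As a minimal sketch (not part of the original example), the snippet below tallies incidents by their type field. The names sample_crimes and count_by_type are hypothetical, and the three rows are copied from the output above with some fields left out; in real use you would pass response["crimes"] instead.

```python
from collections import Counter

# Rows copied from the printed output above (a subset of the fields);
# in practice this list would be response["crimes"].
sample_crimes = [
    {"cdid": 64482204, "address": "2000 BLOCK OF S COLLEGE AV",
     "date": "06/24/15 08:47 PM", "type": "Theft"},
    {"cdid": 64482189, "address": "3600 BLOCK OF WELLBORN RD",
     "date": "06/24/15 07:32 PM", "type": "Vandalism"},
    {"cdid": 64371003, "address": "2900 BLOCK OF S TEXAS AVENUE",
     "date": "06/18/15 09:56 AM", "type": "Vandalism"},
]


def count_by_type(crimes):
    """Return a Counter mapping each crime type to its number of incidents."""
    return Counter(crime["type"] for crime in crimes)


counts = count_by_type(sample_crimes)

# most_common() yields (type, count) pairs, highest count first
for crime_type, total in counts.most_common():
    print("{}: {}".format(crime_type, total))
```

The same pattern works for grouping by address or date; only the key passed to the Counter changes.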