How do I scrape this particular jQuery website with Python?

I want to scrape this website: https://resultadoselecciones2016.onpe.gob.pe/PRP2V2016/Actas-por-Ubigeo.html

They are using jQuery, so the data is not in the "normal" HTML source. I see this in the Chrome developer console:

So I tried this in Python 2.7:

import urllib 
import urllib2 

url = 'https://resultadoselecciones2016.onpe.gob.pe/PRP2V2016/Actas-por-Ubigeo.html' 

data = "pid=844399127479680.2&_clase=mesas&_accion=displayMesas&ubigeo=140107&nroMesa=034915&tipoElec=10&page=1&pornumero=1" 

req = urllib2.Request(url, data) 
response = urllib2.urlopen(req) 
print response.read()

But it doesn't work: it just prints the normal HTML, not the response you see above.

How can I get this data?

Source

2016-06-08 Kevin Castro

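You need to run a headless browser on your server – charlietfl

You can use Selenium or RoboBrowser for this kind of task. –

I just solved this problem. I used the requests module instead of urllib and simply copied and pasted the entire header, like this: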

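import requests
from bs4 import BeautifulSoup

# The AJAX endpoint the page calls, not the HTML page itself.
url2 = "https://resultadoselecciones2016.onpe.gob.pe/PRP2V2016/ajax.php"
# Placeholder: requests expects a dict here, built from the headers copied from the browser.
head = "[my entire header]"
# Form payload copied from the same request in the browser.
data_get_departamentos = "pid=1037937475037058.5&_clase=ubigeo&_accion=getDepartamentos&dep_id=&tipoElec=&tipoC=acta&modElec=&ambito=E&pantalla="

r = requests.post(url2, data=data_get_departamentos, headers=head)
departamentos = r.text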
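
The copied payload string can be awkward to edit when you need other values. Here is a minimal sketch, assuming Python 3 (the question used 2.7), that splits it into named fields and rebuilds a form-encoded POST. It uses only the standard library, so it swaps `urllib.request` in for requests. The helper names `form_to_dict` and `build_request` are hypothetical, and the `X-Requested-With` header is my assumption about what the server might check, not something confirmed above.

```python
from urllib.parse import parse_qsl, urlencode
from urllib.request import Request

AJAX_URL = "https://resultadoselecciones2016.onpe.gob.pe/PRP2V2016/ajax.php"

# Payload string as copied from the browser (taken from the snippet above).
RAW_FORM = (
    "pid=1037937475037058.5&_clase=ubigeo&_accion=getDepartamentos"
    "&dep_id=&tipoElec=&tipoC=acta&modElec=&ambito=E&pantalla="
)


def form_to_dict(raw):
    """Split a copied form string into a dict, keeping empty fields."""
    return dict(parse_qsl(raw, keep_blank_values=True))


def build_request(url, fields, extra_headers=None):
    """Build (but do not send) a form-encoded POST request."""
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Assumption: mimic the header jQuery adds to its AJAX calls.
        "X-Requested-With": "XMLHttpRequest",
    }
    headers.update(extra_headers or {})
    body = urlencode(fields).encode("ascii")
    return Request(url, data=body, headers=headers, method="POST")


fields = form_to_dict(RAW_FORM)
request = build_request(AJAX_URL, fields)
# To actually send it: urllib.request.urlopen(request).read()
```

With requests, the same dict can be passed directly as `data=fields`; requests form-encodes a dict and sets the Content-Type header itself.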

I then parsed the HTML response with BeautifulSoup. That's it.

Source
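
The parsing step is not shown, and neither is the shape of the response. As a rough sketch only, suppose the endpoint returns an HTML fragment of `<option>` elements. That is an assumption, and `SAMPLE` below is made-up data, not real output from the site. To stay runnable without extra packages, this version uses the standard library's `html.parser` rather than BeautifulSoup, and `OptionCollector` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Hypothetical fragment standing in for the real response text.
SAMPLE = '<option value="01">EXAMPLE A</option><option value="02">EXAMPLE B</option>'


class OptionCollector(HTMLParser):
    """Collect (value, label) pairs from <option> elements."""

    def __init__(self):
        super().__init__()
        self.options = []
        self._in_option = False
        self._value = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._in_option = True
            self._value = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <option> element.
        if self._in_option:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._in_option:
            self.options.append((self._value, "".join(self._text).strip()))
            self._in_option = False


collector = OptionCollector()
collector.feed(SAMPLE)
collector.close()
options = collector.options
```

With BeautifulSoup, as used in the answer, the equivalent would be along the lines of `BeautifulSoup(r.text, "html.parser").find_all("option")`, adjusted to whatever tags the real response contains.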

2016-06-08 17:00:24
