Cloudflare scraping: finding an element

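I have been playing around with the cfscrape module, which lets you get past the Cloudflare captcha protection on sites... I can access the page's content, but I can't get my code to work; the entire HTML is printed instead. I only want the contents of the <span class="availability"> element.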

import urllib2 
import cfscrape 
from bs4 import BeautifulSoup 
import requests 
from lxml import etree 
import smtplib 
import urllib2, sys 
scraper = cfscrape.CloudflareScraper() 
url = "http://www.sneakersnstuff.com/en/product/25698/adidas-stan-smith-gtx" 
req = scraper.get(url).content 


try: 
    page = urllib2.urlopen(req) 
except urllib2.HTTPError, e: 
    print("hi") 
    content = e.fp.read() 


soup = BeautifulSoup(content, "lxml") 
result = soup.find_all("span", {"class":"availability"})

I left out the code that searches for the keywords.


2016-12-31 ColeWorld

try: 
    page = urllib2.urlopen(req) 
    content = page.read() 
except urllib2.HTTPError, e: 
    print("hi")

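Leaving aside the parts that don't matter, you should read the object returned by urlopen, which contains the HTML code.

Assign the content variable inside the try block, before the except clause.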
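One more thing may be worth checking: in the question, req is already the page body, because scraper.get(url).content returns the downloaded HTML, so it can probably be handed to BeautifulSoup directly without a second request through urllib2. The following is a minimal sketch under some assumptions: it targets Python 3, the helper names fetch_html and find_availability are made up for illustration, and it uses the built-in "html.parser" backend instead of "lxml" so that it runs without extra dependencies.

```python
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page through cfscrape and return the raw HTML bytes."""
    # Imported lazily so the parsing helper below also works on machines
    # where cfscrape is not installed.
    import cfscrape

    scraper = cfscrape.create_scraper()
    # scraper.get() performs the HTTP request itself; .content is the body,
    # so nothing needs to be passed to urllib2.urlopen() afterwards.
    return scraper.get(url).content


def find_availability(html):
    """Return the text of every <span class="availability"> found in html."""
    soup = BeautifulSoup(html, "html.parser")
    spans = soup.find_all("span", {"class": "availability"})
    # get_text() gives only the text inside each tag, not the tag markup.
    return [span.get_text(strip=True) for span in spans]


# Offline check with a hand-written snippet (not taken from the real page).
sample = '<div><span class="availability">In stock</span><span class="price">99</span></div>'
print(find_availability(sample))  # ['In stock']
```

For the real page, the call would be find_availability(fetch_html(url)) with the url from the question. Printing the find_all result directly shows whole tags, which may be why the output looked like raw HTML; get_text() narrows it down to the text.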


2016-12-31 09:08:41

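Are you familiar with the ConnectionError: ('Connection aborted.', BadStatusLine' error? Not sure why I'm getting this.. – ColeWorld

@ColeWorld You should post that as a separate question instead of asking a new question in the comments. Accept this answer to close this one. –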
