2017-01-08 77 views
2

I'm trying to scrape this XML page for links by keyword, but urllib2 keeps throwing an error that I can't fix in Python 3... SyntaxError: invalid syntax: except urllib2.HTTPError, e:

from bs4 import BeautifulSoup 
import requests 
import smtplib 
import urllib2 
from lxml import etree 
url = 'https://store.fabspy.com/sitemap_products_1.xml?from=5619742598&to=9172987078' 
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11', 
     'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 
     'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3', 
     'Accept-Encoding': 'none', 
     'Accept-Language': 'en-US,en;q=0.8', 
     'Connection': 'keep-alive'} 
proxies = {'https': '209.212.253.44'} 
req = urllib2.Request(url, headers=hdr, proxies=proxies) 
try: 
    page = urllib2.urlopen(req) 
except urllib2.HTTPError as e: 
    print(e.fp.read()) 
content = page.read() 
def parse(self, response):
    try:
        print(response.status)
        print('???????????????????????????????????')
        if response.status == 200:
            self.driver.implicitly_wait(5)
            self.driver.get(response.url)
            print(response.url)
            print('!!!!!!!!!!!!!!!!!!!!')

            # DO STUFF
    except httplib.BadStatusLine:
        pass
while True:
    soup = BeautifulSoup(a.context, 'lxml')
    links = soup.find_all('loc')
    for link in links:
        if 'notonesite' and 'winter' in link.text:
            print(link.text)
            jake = link.text

I'm just trying to send a urllib request through a proxy to see whether a link is on the sitemap...

+0

Since you are on Python 3, you should be getting "No module named urllib2" instead (http://stackoverflow.com/questions/2792650/python3-error-import-error-no-module-name-urllib2). – alecxe

Answers

4

urllib2 is not available in Python 3. You should use urllib.request and urllib.error instead:

import urllib.request 
import urllib.error 
... 
req = urllib.request.Request(url, headers=hdr) # doesn't take a proxies argument though...
... 
try: 
    page = urllib.request.urlopen(req) 
except urllib.error.HTTPError as e: 
... 

...and so on. Note, however, that urllib.request.Request() does not accept a proxies argument. For proxy handling, see the documentation.
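
As a rough illustration of the proxy handling mentioned above, here is a minimal Python 3 sketch that routes requests through a proxy with urllib.request.ProxyHandler and urllib.request.build_opener. The helper names build_proxy_opener and fetch are made up for this example, and the proxy value is meant to be whatever string the question uses (it gives only an IP address; a real proxy will often need a host:port form, which is an assumption on my part).

```python
import urllib.error
import urllib.request


def build_proxy_opener(proxy):
    """Return an opener that sends HTTPS requests through `proxy`.

    `proxy` is a string such as the IP address from the question;
    many proxies need 'host:port' instead (an assumption, not from the question).
    """
    handler = urllib.request.ProxyHandler({'https': proxy})
    return urllib.request.build_opener(handler)


def fetch(url, opener, headers=None, timeout=10):
    """Fetch `url` through `opener`; return the body as bytes, or None on failure."""
    # Request() takes the headers; the proxy lives on the opener, not the request.
    req = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(req, timeout=timeout) as page:
            return page.read()
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status (403, 404, ...).
        print('HTTP error:', e.code, e.reason)
    except urllib.error.URLError as e:
        # No usable answer at all, e.g. the proxy could not be reached.
        print('Connection problem:', e.reason)
    return None
```

With the names from the question, this would be called as content = fetch(url, build_proxy_opener(proxies['https']), hdr). Keeping the proxy on the opener rather than on the request is what replaces the proxies argument that Request() does not have.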

+0

Hmm... it looks like Python 2.7 is easier to use for what I'm after... what is the equivalent in that version? – ColeWorld

+0

It looks like you are already using the Python 2.7 equivalent. Try running your existing code with Python 2 and see whether you get any new error messages. – elethan
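
One more thing that bites under either Python version: the check 'notonesite' and 'winter' in link.text from the question only tests for 'winter', because a non-empty string literal is always truthy. Below is a small sketch of a check that requires every keyword. It uses the standard library's xml.etree.ElementTree instead of BeautifulSoup to stay dependency-free, the function name matching_links is hypothetical, and it assumes the sitemap uses the standard sitemaps.org namespace (not confirmed for the site in the question).

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemap declares the standard sitemaps.org namespace.
SITEMAP_NS = '{http://www.sitemaps.org/schemas/sitemap/0.9}'


def matching_links(sitemap_xml, keywords):
    """Return the <loc> URLs from `sitemap_xml` that contain every keyword."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + 'loc') if el.text]
    # all(...) tests each keyword separately, unlike "'a' and 'b' in text".
    return [loc for loc in locs if all(word in loc for word in keywords)]
```

Here sitemap_xml would be the bytes read from the sitemap response, and keywords would be ['notonesite', 'winter'].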