Python urllib2 + BeautifulSoup

I'm struggling to integrate BeautifulSoup into my current Python project. To keep this plain and simple, I've reduced the complexity of my current script.

The script without BeautifulSoup:

import urllib2 

    def check(self, name, proxy): 
     urllib2.install_opener(
      urllib2.build_opener(
       urllib2.ProxyHandler({'http': 'http://%s' % proxy}), 
       urllib2.HTTPHandler() 
       ) 
      ) 

     req = urllib2.Request('http://example.com' ,"param=1") 
     try: 
      resp = urllib2.urlopen(req) 
     except: 
      self.insert() 
     try: 
      if 'example text' in resp.read() 
       print 'success'

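Of course the indentation is wrong; this is just a sketch of what I have going on. Put simply, I send a POST request to "example.com" and then, if resp.read() from example.com contains "example text", print success.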

But what I actually want is to check

if ' example ' in resp.read()

and then output the text inside a right-aligned td from the example.com response, using

soup.find_all('td', {'align':'right'})[4]

The way I'm implementing BeautifulSoup isn't working. Here is an example:

import urllib2 
from bs4 import BeautifulSoup as soup 

main_div = soup.find_all('td', {'align':'right'})[4] 

    def check(self, name, proxy): 
     urllib2.install_opener(
      urllib2.build_opener(
       urllib2.ProxyHandler({'http': 'http://%s' % proxy}), 
       urllib2.HTTPHandler() 
       ) 
      ) 

     req = urllib2.Request('http://example.com' ,"param=1") 
     try: 
      resp = urllib2.urlopen(req) 
      web_soup = soup(urllib2.urlopen(req), 'html.parser') 
     except: 
      self.insert() 
     try: 
      if 'example text' in resp.read() 
       print 'success' + main_div

As you can see, I've added 4 new lines/adjustments:

from bs4 import BeautifulSoup as soup 

web_soup = soup(urllib2.urlopen(url), 'html.parser') 

main_div = soup.find_all('td', {'align':'right'})[4] 

as well as " + main_div " on the print line.

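However, it just doesn't seem to work. I've also hit a few errors while adjusting it, some of which said "local variable referenced before assignment" and "unbound method find_all must be called with BeautifulSoup instance as first argument".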
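The first of these messages is what Python reports when a name inside a function is read before anything has been assigned to it. Below is a minimal sketch of one likely cause in the script above, assuming the urlopen call raised and the except branch swallowed the error. The `fail` flag and the simulated IOError are hypothetical stand-ins and not part of the original script.

```python
def check(fail):
    try:
        if fail:
            # Stand-in for urllib2.urlopen(req) raising, e.g. on a dead proxy.
            raise IOError("simulated connection failure")
        resp = "example text"
    except IOError:
        # The error is swallowed here, so execution continues below.
        pass
    # If the try body failed, resp was never assigned.
    return "example text" in resp


print(check(fail=False))
try:
    check(fail=True)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Returning from the except branch, or moving the code that uses resp inside the try body, would avoid reading the unassigned name.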

Source

2017-08-11 user3255841

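Please try to provide a minimal example. Also try to indent your Python code correctly. Regarding your last code example, you should call 'find_all' on the variable 'web_soup': 'web_soup.find_all('td', {'align': 'right'})'. – johannesmik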

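I did say it's just a rough sketch and that the indentation is wrong. I mainly want to know how to put it together regardless of indentation, since I can fix that myself. I also don't understand what you mean. Do you mean switching main_div = soup to main_div = web_soup? – user3255841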

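If you want your question answered, you should provide a minimal example so that others can understand your problem. You provided 3 snippets, all of which are incomplete or wrongly indented, so it is hard to reconstruct your problem. – johannesmik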

Regarding your last code snippet:

from bs4 import BeautifulSoup as soup 

web_soup = soup(urllib2.urlopen(url), 'html.parser') 
main_div = soup.find_all('td', {'align':'right'})[4]

Call find_all on the web_soup instance instead. Also make sure to define the url variable before using it:

import urllib2
from bs4 import BeautifulSoup as soup

url = "url to be opened"
web_soup = soup(urllib2.urlopen(url), 'html.parser')
# Call find_all on the parsed instance (web_soup), not on the BeautifulSoup class
main_div = web_soup.find_all('td', {'align':'right'})[4]
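To try the same idea without a network connection, here is a minimal sketch that parses a made-up HTML string in place of the response body. The sample markup, the function name `fifth_right_aligned_cell`, and the `marker` parameter are assumptions for illustration only. It is written with the plain `BeautifulSoup` import, and the parsing calls are the same ones used above.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the body returned by resp.read().
SAMPLE_HTML = """
<table>
  <tr><td align="right">a</td><td align="right">b</td><td align="left">skip</td></tr>
  <tr><td align="right">c</td><td align="right">d</td><td align="right">e</td></tr>
</table>
<p>example text</p>
"""


def fifth_right_aligned_cell(html, marker="example text"):
    """Return the text of the fifth right-aligned <td>, or None if unavailable."""
    # Check the marker on the raw string, so the body only has to be read once.
    if marker not in html:
        return None
    # BeautifulSoup(...) creates the instance that find_all must be called on.
    web_soup = BeautifulSoup(html, "html.parser")
    cells = web_soup.find_all("td", {"align": "right"})
    # Guard the index: [4] raises IndexError when fewer than five cells match.
    if len(cells) < 5:
        return None
    return cells[4].get_text(strip=True)


result = fifth_right_aligned_cell(SAMPLE_HTML)
print("success", result)
```

In the original script, the html argument would be the result of a single resp.read() stored in a variable. That also avoids sending the request twice, as the second snippet in the question does by calling urlopen(req) two times.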

Source

2017-08-11 08:40:32 johannesmik
