2016-02-26 374 views
1

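I want to scrape definitions from the Merriam-Webster dictionary, e.g. http://www.merriam-webster.com/dictionary/abandon, where they sit inside a nested class.

This is the snippet I want to scrape.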

<div class="definition-block def-text"> 
     <ul class="definition-list no-count"> 
         <li> 
       <p class="definition-inner-item"> 
       <span><span class="intro-colon">:</span> to leave and never return to (someone who needs protection or help)</span> 
       </p> 
      </li> 
         <li> 
       <p class="definition-inner-item"> 
       <span><span class="intro-colon">:</span> to leave and never return to (something)</span> 
       </p> 
      </li> 
         <li> 
       <p class="definition-inner-item"> 
       <span><span class="intro-colon">:</span> to leave (a place) because of danger</span> 
       </p> 
      </li> 
        </ul> 
     </div> 

Here is my code:

for element in soup.find(class_="definition-list no-count"):
    if soup.find("li"):
        print element

The output is:

<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave and never return to (someone who needs protection or help)</span> 
</p> 
</li> 


<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave and never return to (something)</span> 
</p> 
</li> 


<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave (a place) because of danger</span> 
</p> 
</li> 

But I only want the definition text inside the <span>. If I use the get_text() method, I get a TypeError.

for element in soup.find(class_="definition-list no-count"):
    if soup.find("li"):
        print soup.get_text(element)

Output:

Traceback (most recent call last): 
    File "scrape.py", line 18, in <module> 
    print soup.get_text(element) 
    File "/usr/lib/python2.7/dist-packages/bs4/element.py", line 852, in get_text 
    strip, types=types)]) 
TypeError: 'NoneType' object is not callable 
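
A possible reading of this traceback, offered as a sketch rather than a confirmed diagnosis: get_text() is a method you call on a tag, and its first positional parameter is the separator string. Passing `element` in that position appears to make bs4 call `element.join(...)`, and attribute access on a tag falls back to a child-tag lookup that returns None, which would explain "'NoneType' object is not callable". The example below calls get_text() on each <li> instead. It runs against a trimmed copy of the snippet from the question, not the live page, and the variable names `html`, `definition_list` and `definitions` are illustrative.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML snippet from the question (not fetched from the live site).
html = """
<div class="definition-block def-text">
  <ul class="definition-list no-count">
    <li><p class="definition-inner-item">
      <span><span class="intro-colon">:</span> to leave and never return to (something)</span>
    </p></li>
    <li><p class="definition-inner-item">
      <span><span class="intro-colon">:</span> to leave (a place) because of danger</span>
    </p></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the single <ul> tag. find_all("li") yields only <li> tags,
# skipping the whitespace strings that iterating over the <ul> directly includes.
definition_list = soup.find(class_="definition-list no-count")

# get_text() is called on each tag; " " is the separator, strip=True trims whitespace.
definitions = [li.get_text(" ", strip=True) for li in definition_list.find_all("li")]

for text in definitions:
    print(text)
```

Iterating over find_all("li") also avoids the blank lines in the earlier output, which come from the newline strings between the <li> tags.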
+0

And your code? – dnit13

Answers

0

Have you considered using BeautifulSoup for this task? I'm sure there are other ways to do it, but with BeautifulSoup it is trivial:

from bs4 import BeautifulSoup 
import urllib 
r = urllib.urlopen('http://www.merriam-webster.com/dictionary/abandon').read() 
soup = BeautifulSoup(r) 
definitions = soup.find_all("p", class_="definition-inner-statement") 

and then you can do whatever you need with the definitions.
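
To illustrate that last step: the snippet in the question uses the class `definition-inner-item` on its <p> tags, so a sketch written against that snippet (not against the live page, whose markup may differ) could look like the following. `extract_definitions` and `sample` are hypothetical names introduced only for this example.

```python
from bs4 import BeautifulSoup


def extract_definitions(html):
    """Return definition texts from markup shaped like the question's snippet."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Assumption: the class name comes from the question's snippet, not the live page.
    for item in soup.find_all("p", class_="definition-inner-item"):
        # Remove the leading ":" marker so only the definition text remains.
        for colon in item.find_all("span", class_="intro-colon"):
            colon.decompose()
        results.append(item.get_text(" ", strip=True))
    return results


# Trimmed copy of the HTML from the question, used as test input.
sample = """
<ul class="definition-list no-count">
  <li><p class="definition-inner-item">
    <span><span class="intro-colon">:</span> to leave and never return to (someone who needs protection or help)</span>
  </p></li>
  <li><p class="definition-inner-item">
    <span><span class="intro-colon">:</span> to leave (a place) because of danger</span>
  </p></li>
</ul>
"""

for definition in extract_definitions(sample):
    print(definition)
```

If find_all() returns an empty list on a real page, a reasonable first check is whether the class name being searched for matches the class in the HTML that was actually downloaded.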

+0

That didn't work in my case. It returns an empty list. – dhiraj

+0

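I wasn't trying to give you exact code, so there may be a mistake in it somewhere, but my general point is to use BeautifulSoup. It's too presumptuous to expect people on the internet to just write your whole program for you. – ubadub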