2014-10-30

BeautifulSoup: cannot extract all the elements in one loop

Code:

from bs4 import BeautifulSoup
soup = BeautifulSoup('<div><p>p_string</p><div>div_string</div></div>', 'html.parser')
for m in soup.div:
    print("extract(first loop):", m.extract())
print("current soup.div(first loop):", soup.div)  # it still contains another div block
print('___________________________________________________________')

# I have to do another for loop to purge the remaining div block, why?
for m in soup.div:
    print("extract(second loop):", m.extract())

print("current soup.div(second loop):", soup.div)  # removed

Result:

extract(first loop): <p>p_string</p>
current soup.div(first loop): <div><div>div_string</div></div>
___________________________________________________________
extract(second loop): <div>div_string</div>
current soup.div(second loop): <div></div>

Why doesn't the first for loop extract all of the elements (both the p and the div)?

Answers
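
A likely explanation, offered as a sketch rather than a definitive answer: iterating over a tag walks its list of children, and extract() removes the current child from that same list while the loop is still running. After the p is removed, the inner div shifts into the position the iterator has already passed, so the loop ends without visiting it. Assuming that is what happens here, iterating over a snapshot of the children should extract everything in one pass. The variable names html and extracted below are illustrative, and 'html.parser' is chosen only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

html = '<div><p>p_string</p><div>div_string</div></div>'
soup = BeautifulSoup(html, 'html.parser')

# list(...) copies the children first, so extract() no longer changes
# the sequence that the loop is iterating over.
extracted = [child.extract() for child in list(soup.div.children)]

print([str(e) for e in extracted])  # ['<p>p_string</p>', '<div>div_string</div>']
print(soup.div)                     # <div></div>
```

If the extracted elements are not needed afterwards, calling soup.div.clear() empties the tag without an explicit loop.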