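Web scraping doesn't work properly?

So I had been looking for my favorite software. Then I found out about web scraping and thought it was really amazing, so with my Python experience I got some practice with Beautiful Soup and requests and wrote the code below.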
import html5lib
import requests
from bs4 import BeautifulSoup as BS

# Get all the a strings , next siblings and next siblings
def makeSoup(urls):
    url = requests.get(urls).text
    return BS(url,"html5lib")

def something(soup):
    for anchor in soup.findAll("a",{"data-type":"externalLink"}):
        print(anchor.string)
        next_sibling = anchor.nextSibling
        water = str(next_sibling.string)
        water = water[0:5]
        while water != "(202)":
            next_sibling = next_sibling.nextSibling
            if next_sibling == None:
                continue
            if next_sibling.string != None:
                print(next_sibling.string)
                water = str(next_sibling.string)
                water = water[0:5]

soup = makeSoup("http://dc.about.com/od/communities/a/EmbassyGuide.htm")
something(soup)
soup = makeSoup("http://dc.about.com/od/communities/a/EmbassyGuide_2.htm")
something(soup)
soup = makeSoup("http://dc.about.com/od/communities/a/EmbassyGuide_3.htm")
something(soup)
But unfortunately I ran into every programmer's nightmare: an error.
Traceback (most recent call last):
  File "C:\Users\Raj\Desktop\kunal projects\Python\listing_out_all_embassies.py", line 26, in <module>
    something(soup)
  File "C:\Users\Raj\Desktop\kunal projects\Python\listing_out_all_embassies.py", line 17, in something
    next_sibling = next_sibling.nextSibling
AttributeError: 'NoneType' object has no attribute 'nextSibling'
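What am I doing wrong? I am a newbie to programming as well as to web scraping. Also, are there any good practices that I am not following? Anyway, thanks for reading until the end.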
That 'continue' doesn't look right. – user2357112
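To expand on that comment: when `next_sibling` becomes `None`, `continue` jumps back to the `while water != "(202)"` test. `water` has not changed, so the loop body runs again and calls `.nextSibling` on `None`, which raises the `AttributeError` in the traceback. Below is a minimal sketch of one way to walk the siblings so the loop ends when they run out. The sample HTML, the function names `texts_after` and `list_entries`, and the use of the built-in `"html.parser"` are assumptions made for illustration; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for two embassy entries (not the real page).
SAMPLE_HTML = (
    '<p><a data-type="externalLink" href="#">Embassy of Examplestan</a>'
    '<br>123 Sample St NW<br>(202) 555-0100</p>'
    '<p><a data-type="externalLink" href="#">Embassy without phone</a>'
    '<br>456 Sample Ave NW</p>'
)

def texts_after(anchor, stop_prefix="(202)"):
    """Collect the strings that follow anchor, stopping at the phone
    line or when there are no more siblings."""
    found = []
    sibling = anchor.next_sibling
    while sibling is not None:          # ends the walk instead of crashing
        text = sibling.string           # None for empty tags such as <br>
        if text is not None:
            found.append(str(text))
            if str(text).startswith(stop_prefix):
                break                   # phone number reached, entry is done
        sibling = sibling.next_sibling
    return found

def list_entries(soup):
    """Return (link text, following lines) for every external link."""
    entries = []
    for anchor in soup.find_all("a", {"data-type": "externalLink"}):
        entries.append((anchor.string, texts_after(anchor)))
    return entries

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
for name, lines in list_entries(soup):
    print(name, lines)
```

The key change is that the `None` check is the loop condition itself, so an entry without a "(202)" line simply ends when its siblings do. Replacing `continue` with `break` in the original loop would have a similar effect.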