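Python, web scraping: nested loop doesn't work
The nested loop over the variable j doesn't work. The debugger skips right over it, even though the variables it needs beforehand appear to be initialized correctly.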
from urllib.request import Request, urlopen
# Get beautifulsoup4 with: pip install beautifulsoup4
import bs4
import pdb
import sys
import json
site = "http://bgp.he.net/report/world"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = Request(site,headers=hdr)
page = urlopen(req)
soup = bs4.BeautifulSoup(page, 'html.parser')
for t in soup.find_all('td', class_='centeralign'):
    s = str(t.string)
    if s != "None":
        print (s.strip())
        site2 = "http://bgp.he.net/country/" + s.strip()
        req = Request(site2,headers=hdr)
        soup2 = bs4.BeautifulSoup(page, 'html.parser')
        for j in soup2.find_all('td'):
            s2 = str(j.string)
            print (j.strip())
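A likely reason the debugger steps over the inner loop: `page` is the response object from the first `urlopen` call, and the first `BeautifulSoup(page, ...)` has already read it to the end. The second `Request` is built but never opened, so `soup2` is parsed from an exhausted stream and `find_all('td')` has nothing to return. The sketch below reproduces only that read-once behaviour; `io.BytesIO` and the sample HTML are stand-ins I am assuming for illustration, not the real response.

```python
import io

# io.BytesIO stands in for the object returned by urlopen(): both are
# file-like, and a second read() without rewinding returns nothing.
page = io.BytesIO(b"<table><tr><td>US</td></tr></table>")

first = page.read()   # what the first BeautifulSoup(page, ...) call consumes
second = page.read()  # what the second BeautifulSoup(page, ...) call receives

print(len(first))  # non-zero: the whole document
print(second)      # empty bytes: nothing left to parse
```

With an empty document there are no `td` tags, so the body of `for j in ...` never runs.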
What output do you want? – Gahan
You are also trying to parse the same page over and over again. – Gahan
Possible duplicate of [Extracting information from a table except header of the table using bs4](https://stackoverflow.com/questions/37635847/extracting-information-from-a-table-except-header-of-the-table-using-bs4) – stovfl
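Following up on the comment about parsing the same page repeatedly, here is a minimal sketch of one way to restructure the script so that every country URL is actually downloaded before it is parsed. The helper names `fetch`, `cell_texts` and `main` are my own, not from the question, and the sketch assumes (as the question's code does) that the `centeralign` cells hold the country codes. It also calls `strip()` on the cell's string rather than on the tag `j`, which is not a string.

```python
from urllib.request import Request, urlopen

import bs4  # pip install beautifulsoup4

BASE = "http://bgp.he.net"
HDR = {"User-Agent": "Mozilla/5.0"}


def fetch(url):
    """Download url and return the raw response body as bytes."""
    with urlopen(Request(url, headers=HDR)) as page:
        return page.read()


def cell_texts(html, **filters):
    """Return the stripped text of every <td> that holds a single string."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    return [td.string.strip()
            for td in soup.find_all("td", **filters)
            if td.string is not None]


def main():
    world = fetch(BASE + "/report/world")
    for code in cell_texts(world, class_="centeralign"):
        print(code)
        # Fetch the country page itself instead of reusing the old response.
        country = fetch(BASE + "/country/" + code)
        for text in cell_texts(country):
            print(text)
```

Calling `main()` starts the crawl, one request per country. `cell_texts` can be tried on any HTML string without touching the network.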