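Iterating over a list in a loop
I want to iterate through a list to get the links to the multiple pages of each subcategory of a website. The first link in the subcategories goes with the first number in the list (8), the second link goes with 6, and so on. I want my final result to look like this: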
sublinks:
0 https://messageboards.webmd.com/family-pregnancy/f/relationships/
1 https://messageboards.webmd.com/family-pregnancy/f/parenting/
2 https://messageboards.webmd.com/family-pregnancy/f/pets/
3 https://messageboards.webmd.com/family-pregnancy/f/pregnancy/
The list I am trying to iterate over in the for loop: [8, 6, 5, 13, 10, 16, 13, 15, 4, 4, 5, 7, 2, 6, 6, 8, 9, 8, 3, 8, 8, 1, 6, 3, 2, 15, 5, 4, 2, 12, 18, 5, 2]
import bs4 as bs
import urllib.request
import pandas as pd
import urllib.parse
import re

# Collect the subcategory links from the front page
source = urllib.request.urlopen('https://messageboards.webmd.com').read()
soup = bs.BeautifulSoup(source, 'lxml')
df = pd.DataFrame(columns=['link'], data=[url.a.get('href') for url in soup.find_all('div', class_="link")])

lists = []
lists2 = []
lists3 = []
page_links = []

# Visit each subcategory and work out how many pages it has
for i in range(0, 33):
    link = (df.link.iloc[i])
    req = urllib.request.Request(link)
    resp = urllib.request.urlopen(req)
    respData = resp.read()
    temp1 = re.findall(r'Filter by</span>(.*?)data-pagedcontenturl', str(respData))
    temp1 = re.findall(r'data-totalitems=(.*?)data-pagekey', str(temp1))[0]
    pageunm = round(int(re.sub("[^0-9]", "", temp1)) / 10)
    lists.append(pageunm)

# Try to build one URL per page from the page counts
for j in lists:
    for x in range(1, j+1):
        url_pages = link + '#pi157388622=' + str(j)
        page_links.append(url_pages)
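One detail in the page-count step may be worth checking: `round(total / 10)` rounds to the nearest integer, so a partially filled last page can be dropped (84 items gives 8, and 4 items gives 0). Below is a minimal sketch of a ceiling-based count. It assumes 10 items per page, which is only inferred from the `/10` in the code and not confirmed for the site, and the helper name `page_count` is hypothetical.

```python
import math

ITEMS_PER_PAGE = 10  # assumption: inferred from the "/10" in the question's code


def page_count(total_items: int, per_page: int = ITEMS_PER_PAGE) -> int:
    """Return how many pages are needed for total_items (hypothetical helper)."""
    if total_items <= 0:
        return 0
    # Round up: a partially filled last page still counts as a page.
    return math.ceil(total_items / per_page)
```

If the site really does show 10 items per page, `page_count(int(re.sub("[^0-9]", "", temp1)))` could stand in for the `round(...)` expression.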
For the first iteration, I want the final result to look like this:
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=1
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=2
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=3
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=4
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=5
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=6
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=7
https://messageboards.webmd.com/family-pregnancy/f/relationships/#pi157388622=8
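The output above can be sketched roughly as follows: pair each sublink with its own page count using `zip`, then number the pages from 1 up to that count. The function name `build_page_links` is hypothetical, the fragment `#pi157388622=` is copied from the code in the question, and the sample values are the first two sublinks and the first two counts from the list. This step needs no network access.

```python
def build_page_links(links, page_counts, fragment="#pi157388622="):
    """Return one URL per page for every link (hypothetical helper)."""
    page_links = []
    # zip pairs the first link with the first count, the second with the second, ...
    for link, count in zip(links, page_counts):
        for page in range(1, count + 1):
            # Put the running page number in the URL, not the total count.
            page_links.append(f"{link}{fragment}{page}")
    return page_links


links = [
    "https://messageboards.webmd.com/family-pregnancy/f/relationships/",
    "https://messageboards.webmd.com/family-pregnancy/f/parenting/",
]
counts = [8, 6]  # first two numbers from the list in the question

result = build_page_links(links, counts)
for url in result:
    print(url)
```

With the full data, `build_page_links(df.link, lists)` would be the equivalent call, assuming `df.link` and `lists` are in the same order and have the same length.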
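What is your question? What is currently not working? Your list ("the list I am trying to iterate over") has different numbers (8, 6, 5, 14 ...) from your expected ("want it to look like") example (1, 2, 3, 4 ...). What do you want? –
I want to make a loop that iterates over the range given by each number in the list. For example, the first one would run over the range 1-8. The second would take the second link from the subcategories and run over 1-6 – Data1234
So what is your question? Which part is not working? What is it doing? What should it be doing? – wwii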