
Python: Page Navigator Maximum Value Scraper - only getting the output for the last value

This is a program I wrote to extract the maximum page number from the page-navigation section of each category in a list. I am not getting all of the values; I only get the value for the last item in the list. What do I need to change to get the output for every category?

import bs4 
from urllib.request import urlopen as uReq 
from bs4 import BeautifulSoup as soup 

#List for extended links to the base url 

links = ['Link_1/','Link_2/','Link_3/'] 
#Function to find the biggest page number in the page-navigation 
#section. The element just before 'Next →' holds the upper limit 

def page_no(): 
    bs = soup(page_html, "html.parser") 
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text 
    print(max_page) 

#url loop 
for url in links: 
    my_urls ='http://example.com/category/{}/'.format(url) 

# opening up connection,grabbing the page 
uClient = uReq(my_urls) 
page_html = uClient.read() 
uClient.close() 
page_no() 

Page navigation example: 1 2 3 … 15 Next →
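
For reference, a small self-contained sketch of that lookup; the markup below is made up to mimic the navigation shown above, since the real URL is not given:

from bs4 import BeautifulSoup

sample_html = '''
<nav>
  <a class="page-numbers" href="/page/1/">1</a>
  <a class="page-numbers" href="/page/2/">2</a>
  <a class="page-numbers" href="/page/3/">3</a>
  <span class="dots">…</span>
  <a class="page-numbers" href="/page/15/">15</a>
  <a class="next page-numbers" href="/page/2/">Next →</a>
</nav>
'''

bs = BeautifulSoup(sample_html, "html.parser")
# The tag immediately before the 'Next →' link holds the highest page number.
next_link = bs.find('a', {'class': 'next page-numbers'})
print(next_link.findPrevious().text)   # prints 15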

Thanks in advance.


Please provide the real URL you are parsing.

Answers


You need to pass page_html into the function and indent the last four lines so they run inside the loop. It is also better to return the max_page value, so you can use it outside the function.

def page_no(page_html): 
    bs = soup(page_html, "html.parser") 
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text 
    return max_page 

#url loop 
for url in links: 
    my_urls='http://example.com/category/{}/'.format(url) 
    # opening up connection,grabbing the page 
    uClient = uReq(my_urls) 
    page_html = uClient.read() 
    uClient.close() 
    max_page = page_no(page_html) 
    print(max_page) 
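
If a category has only a single page there is no 'next page-numbers' link, find() returns None, and findPrevious() raises an AttributeError. A minimal sketch with that guard, collecting the result for every category into a dict (example.com and the Link_ slugs are the placeholders from the question):

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

links = ['Link_1/','Link_2/','Link_3/']

def page_no(page_html):
    bs = soup(page_html, "html.parser")
    next_link = bs.find('a', {'class': 'next page-numbers'})
    if next_link is None:
        # No 'Next →' link: the category fits on a single page.
        return '1'
    # The tag just before 'Next →' holds the highest page number.
    return next_link.findPrevious().text

max_pages = {}
for url in links:
    my_url = 'http://example.com/category/{}/'.format(url)
    # opening up connection, grabbing the page
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    max_pages[url] = page_no(page_html)

print(max_pages)  # one max-page value per category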