Why doesn't my scraper using BeautifulSoup show any results?


import requests
from bs4 import BeautifulSoup

def code_search(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://kindai.ndl.go.jp/search/searchResult?searchWord=朝鮮&facetOpenedNodeIds=&featureCode=&viewRestrictedList=&pageNo=' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        for link in soup.findAll('a', {'class': 'item-link'}):
            href = link.get('href')
            page += 1

code_search(2)

I'm using PyCharm Community Edition 5.0.3 on a Mac.

It just says:

"Process finished with exit code 0"

But there should be some results if I've written the code correctly...

Please help me with this!

Source

2016-01-03 spitfire

First of all, you forgot `print`. And what are you trying to get?

There is no `return` and no `print`... how do you expect to see any output?

What output do you expect?

You have no print statement, so the program outputs nothing.

Add some print statements. For example, to output the links, do the following:

for link in soup.findAll('a', {'class': 'item-link'}):
    href = link.get('href')
    print(href)
# Advance once per page; inside the for loop this would run once per link.
page += 1
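If the loop still prints nothing after adding `print`, the selector may simply match no elements on the page. One way to check this, offered only as a rough sketch, is to try the same lookup on a small HTML string first. The markup in `SAMPLE_HTML` and the helper name `extract_links` are made up for illustration; the real search page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page may look different.
SAMPLE_HTML = """
<ul>
  <li><a class="item-link" href="/info/1">First</a></li>
  <li><a class="other" href="/skip">Ignored</a></li>
  <li><a class="item-link" href="/info/2">Second</a></li>
</ul>
"""


def extract_links(html):
    """Return the href of every <a class="item-link"> found in html."""
    soup = BeautifulSoup(html, 'html.parser')
    return [a.get('href') for a in soup.find_all('a', class_='item-link')]


links = extract_links(SAMPLE_HTML)
for href in links:
    print(href)
if not links:
    # An empty result is reported explicitly instead of printing nothing.
    print("No matching links found")
```

If the same function returns an empty list for the real page's HTML, the class name or tag in the selector is the likely thing to check next.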

Source

2016-01-03 13:52:59 masnun

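The answer depends on what you want to achieve with your web crawler. The first observation is that nothing is printed.

The following code prints the current URL and all the links found at that URL.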

import requests
from bs4 import BeautifulSoup

def code_search(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://kindai.ndl.go.jp/search/searchResult?searchWord=朝鮮&facetOpenedNodeIds=&featureCode=&viewRestrictedList=&pageNo=' + str(page)
        print("Current URL:", url)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        for link in soup.findAll('a', {'class': 'item-link'}):
            href = link.get('href')
            print("Found URL:", href)
        # Advance once per page; inside the for loop this would run once per link.
        page += 1

code_search(2)

You could also have the function return all the URLs it finds and then print the result:

import requests
from bs4 import BeautifulSoup

def code_search(max_pages):
    page = 1
    urls = []  # collects every href found across all pages
    while page <= max_pages:
        url = 'http://kindai.ndl.go.jp/search/searchResult?searchWord=朝鮮&facetOpenedNodeIds=&featureCode=&viewRestrictedList=&pageNo=' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        for link in soup.findAll('a', {'class': 'item-link'}):
            href = link.get('href')
            urls.append(href)
        # Advance once per page; inside the for loop this would run once per link.
        page += 1
    return urls

print("Found URLs:", code_search(2))
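The search URL in these examples is built by string concatenation, with the non-ASCII search word embedded directly. A possible alternative, shown here only as a sketch, is to build the query string with the standard library's `urlencode`. The names `BASE_URL` and `build_search_url` are hypothetical, and it is an assumption that the server accepts a UTF-8 percent-encoded `searchWord`.

```python
from urllib.parse import urlencode

# Hypothetical constant: the search endpoint taken from the URL in the question.
BASE_URL = 'http://kindai.ndl.go.jp/search/searchResult'


def build_search_url(search_word, page):
    """Build the search URL for one result page (illustrative helper)."""
    params = {
        'searchWord': search_word,   # percent-encoded as UTF-8 by urlencode
        'facetOpenedNodeIds': '',
        'featureCode': '',
        'viewRestrictedList': '',
        'pageNo': page,
    }
    return BASE_URL + '?' + urlencode(params)


for page in range(1, 3):
    print(build_search_url('朝鮮', page))
```

`requests.get` also accepts a `params` argument, so the same dictionary could be passed as `requests.get(BASE_URL, params=params)` instead of building the string by hand.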

Source

2016-01-03 13:53:58
