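Why does BeautifulSoup show unnecessary characters when scraping URLs?

I have tried other kinds of CSS selectors and XPaths, so I assume I may be using the library incorrectly, but nothing in the documentation tells me otherwise. I have also tried other bs4 functions such as find_all, but most of them return the same results. Any kind of help would be greatly appreciated, cheers!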
Code:

    import bs4 as bs
    from requests import get

    query = input('Please Enter Your Topic of interest: ')
    # Two encodings of the query: %20 for the q= parameter, + for the #q= part
    first_part = query.replace(" ", "%20")
    second_part = query.replace(" ", "+")
    results = 0
    num_of_pages = int(input('How many pages do you want scraped? '))
    for i in range(num_of_pages):
        # Advance the result offset by 10 for each page
        results += 10
        gsearch_url = "https://www.google.com/search?q={}#q={}%3F&start={}&*".format(first_part, second_part, results)
        sauce = get(gsearch_url)
        soup = bs.BeautifulSoup(sauce.text, 'lxml')
        # Print the href of every link nested inside an element with class "r"
        for url in soup.select('.r a'):
            print(url.get('href'))
Returns:
/url?q=http://www.codingdojo.com/blog/9-most-in-demand-programming-languages-of-2016/&sa=U&ved=0ahUKEwja3a21w7fSAhWSZiYKHdLGA9gQFggdMAI&usg=AFQjCNFmDl_1epVQRmDfc4y5MWFeNvrPQg
/url?q=https://fossbytes.com/best-popular-programming-languages-2017/&sa=U&ved=0ahUKEwja3a21w7fSAhWSZiYKHdLGA9gQFgghMAM&usg=AFQjCNEKhYqx1FbKl_Wu-9EoMYd3e9i_Dw
/url?q=http://www.bestprogramminglanguagefor.me/&sa=U&ved=0ahUKEwja3a21w7fSAhWSZiYKHdLGA9gQFggnMAQ&usg=AFQjCNHmbzuLwFo_egaWnbXSOW4p-Fva3g
/url?q=http://www.codingdojo.com/blog/9-most-in-demand-programming-languages-of-2016/&sa=U&ved=0ahUKEwja3a21w7fSAhWSZiYKHdLGA9gQFggyMAU&usg=AFQjCNFmDl_1epVQRmDfc4y5MWFeNvrPQg
etc....
I don't understand your question. Please state the output you want and present your code properly.
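
The hrefs in the output above look like Google redirect links: the destination address sits in the q parameter of /url, and the remaining parameters (sa, ved, usg) are what appear as the extra characters. Below is a minimal sketch of one way to recover the destination with the standard library, assuming every wanted link has the /url?q=... shape shown in the output. The name extract_target is a hypothetical helper and is not part of bs4.

```python
from urllib.parse import urlparse, parse_qs


def extract_target(href):
    """Return the destination of a '/url?q=...' redirect href.

    Hypothetical helper: any href that does not match that shape
    is returned unchanged.
    """
    parsed = urlparse(href)
    if parsed.path == "/url":
        # parse_qs maps each parameter name to a list of decoded values
        targets = parse_qs(parsed.query).get("q")
        if targets:
            return targets[0]
    return href


if __name__ == "__main__":
    sample = ("/url?q=http://www.bestprogramminglanguagefor.me/&sa=U"
              "&ved=0ahUKEwja3a21w7fSAhWSZiYKHdLGA9gQFggnMAQ"
              "&usg=AFQjCNHmbzuLwFo_egaWnbXSOW4p-Fva3g")
    print(extract_target(sample))
```

In the loop from the question, this would probably be used as print(extract_target(url.get('href'))) in place of the bare print call.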