All I want is to scrape all the products. Why can't I use containers.div either? I'm confused when the page has <div><\div><div> while my tutorial only has <div></div>. Why can't I call container.findAll("h3", {"class":"name"})?
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://hbx.com/categories/sneakers'
# open the connection and fetch the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parsing
page_soup = soup(page_html, "html.parser")
# grab each product container
containers = page_soup.findAll("div",{"class":"product-wrapper col-xs-6 col-sm-4"})
filename = "kontol.csv"
f = open(filename, "w")
headers = "judul, brand, harga\n"
f.write(headers)
for container in containers:
    title_container = container.findAll("h3", {"class":"name"})
    judul = title_container[0].text
    brand_container = container.findAll("h4", {"class":"brand"})
    brand = brand_container[0].text
    price_container = container.findAll("span", {"class":"regular-price"})
    harga = price_container[0].text
    print("judul: " + judul)
    print("brand: " + brand)
    print("harga: " + harga)
    f.write(judul + "," + brand + "," + harga + "\n")
f.close()
When I try to call container.findAll("h3", {"class":"name"}) I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\site-packages\bs4\element.py", line 1807, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'findAll'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
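After running this code on my own machine, it looks like you are going to have some trouble scraping this data from this site with urllib. A lot of the content appears to be rendered with JavaScript, which will prevent you from scraping it with urllib. I would suggest looking at Selenium to get around this: http://selenium-python.readthedocs.io/.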
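As for the AttributeError in the traceback, it says the method was called on a ResultSet (the list that findAll/find_all returns) rather than on a single tag. The same applies to containers.div: attribute access such as .div works on one Tag, not on the whole list. Here is a minimal sketch of the difference, assuming a small hand-written HTML snippet (hypothetical, not the real hbx.com markup) and matching on the single class "product-wrapper" for brevity:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the real page structure may differ.
html = """
<div class="product-wrapper col-xs-6 col-sm-4">
  <h4 class="brand">Brand A</h4><h3 class="name">Shoe One</h3>
</div>
<div class="product-wrapper col-xs-6 col-sm-4">
  <h4 class="brand">Brand B</h4><h3 class="name">Shoe Two</h3>
</div>
"""

page_soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet: a list of Tag objects, not a Tag.
containers = page_soup.find_all("div", {"class": "product-wrapper"})

# Calling a Tag method on the list itself reproduces the error.
try:
    containers.find_all("h3", {"class": "name"})
    raised = False
except AttributeError:
    raised = True

# Index into the list (or loop over it) to get a single Tag first.
first_name = containers[0].h3.text
names = [c.find("h3", {"class": "name"}).text for c in containers]
```

So containers[0].div or a loop variable such as container is what supports .div and findAll; the plural containers does not. None of this helps if the products are injected by JavaScript, since the list would then simply be empty.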