2016-04-24 112 views
2

我正試圖在此網站上獲取包包的名稱 - http://www.barneys.com/barneys-new-york/women/bags。 到目前爲止,我有這樣的代碼:網頁抓取從網頁上提取產品名稱

from urllib.request import urlopen 
    from bs4 import BeautifulSoup 
    url="http://www.barneys.com/barneys-new-york/women/bags" 
    html = urlopen(url) 
    bsObj = BeautifulSoup(html.read(),"html.parser") 
    product_name = bsObj.findAll("a",{"class":"name-link"}) 
    print(product_name) 

我試過renderContents()和get_text(),但他們給我的錯誤(AttributeError的)。

+2

可不可以給法正確一個格式化[mcve]? – jonrsharpe

回答

1

的名稱是在產品名稱的div:

from bs4 import BeautifulSoup 
import requests 

soup = BeautifulSoup(requests.get("http://www.barneys.com/barneys-new-york/women/bags").content) 

print([prod.text.strip() for prod in soup.select("div.product-name")]) 

它給你:

['Lizard iPhone® 6 Plus Case', 'Lizard iPhone® 6 Case', 'Peekaboo Large Satchel', 'Embellished Shoulder Bag', 'Rockstud Reversible Tote', 'Hava Shoulder Bag', 'Rockstud Crossbody', 'PS1 Tiny Shoulder Bag', 'Faye Medium Shoulder Bag', 'Rockstud Crossbody', 'Large Shopper Tote', 'Flat Clutch', 'Wicker Small Crossbody', 'Jotty Duffel', 'P.Y.T. Shoulder Bag', 'Hadley Baby Satchel', 'Beckett Small Crossbody', 'Squarit PM Satchel', 'Double Baguette Micro', 'City Victoria Small Satchel', 'Large Zip Pouch', 'Jotty Duffel', 'Jen Small Crossbody', 'Mini Trouble Shoulder Bag', 'Midi Clutch', 'Midi Clutch', 'Two For One Pouch 10', 'Guitar Rockstud Medium Backpack', 'Embellished Large Messenger', 'Papier A4 Side-Zip Tote', 'Nightingale Micro-Satchel', 'Hand-Carved Atlas Clutch', 'Emerald-Cut Minaudière', 'Trouble II Shoulder Bag', 'Intrecciato Olimpia Small Shoulder Bag', 'Rockstud Large Tote', 'Baguette Micro', 'Bindu Small Clutch', 'Emerald-Cut Minaudière', 'Gotham City Hobo', 'Brillant Sellier PM Satchel', 'Flight Weekender Duffel', 'Sac Mesh Bucket Bag', 'Seema Small Satchel', 'Madison Shoulder Bag', 'Sporty Smiley Crossbody', 'Monogram Large Wallet', 'Monogram Card Case'] 

如果你想要的所有信息,你可以用thumb-link從錨標記得到它類內部編號主要

print(soup.select("#primary a.thumb-link")) 

,讓你喜歡的輸出:

<a class="thumb-link" href="http://www.barneys.com/vianel-lizard-iphone%C2%AE-6-plus-case-504475332.html" title="Lizard iPhone® 6 Plus Case"> 
<img alt="Vianel Lizard iPhone® 6 Plus Case" class="gridImg" data-image-alter="http://product-images.barneys.com/is/image/Barneys/504475332_2_detail?$grid_new_fixed$" data-original="http://product-images.barneys.com/is/image/Barneys/504475332_1_tabletop?$grid_new_fixed$" height="370" onerror="this.src='http://demandware.edgesuite.net/aasv_prd/on/demandware.static/Sites-BNY-Site/-/default/dwd89468c5/images/browse_placeholder_image.jpg'" title="Lizard iPhone® 6 Plus Case" width="231"/> 
<noscript> 
<img alt="Vianel Lizard iPhone® 6 Plus Case" src="http://product-images.barneys.com/is/image/Barneys/504475332_1_tabletop?$grid_new_fixed$" title="Lizard iPhone® 6 Plus Case?$grid_new_fixed$"/> 
</noscript> 

您可以分析圖像,標題等。從每一個返回。

使用你自己的代碼,你需要訪問的.text屬性如上:

product_name = [a.text.strip() for a in bsObj.findAll("a",{"class":"name-link"})] 
print(product_name) 

,這將給你一樣的第一選擇:

['Lizard iPhone® 6 Plus Case', 'Lizard iPhone® 6 Case', 'Peekaboo Large Satchel', 'Embellished Shoulder Bag', 'Rockstud Reversible Tote', 'Hava Shoulder Bag', 'Rockstud Crossbody', 'PS1 Tiny Shoulder Bag', 'Faye Medium Shoulder Bag', 'Rockstud Crossbody', 'Large Shopper Tote', 'Flat Clutch', 'Wicker Small Crossbody', 'Jotty Duffel', 'P.Y.T. Shoulder Bag', 'Hadley Baby Satchel', 'Beckett Small Crossbody', 'Squarit PM Satchel', 'Double Baguette Micro', 'City Victoria Small Satchel', 'Large Zip Pouch', 'Jotty Duffel', 'Jen Small Crossbody', 'Mini Trouble Shoulder Bag', 'Midi Clutch', 'Midi Clutch', 'Two For One Pouch 10', 'Guitar Rockstud Medium Backpack', 'Embellished Large Messenger', 'Papier A4 Side-Zip Tote', 'Nightingale Micro-Satchel', 'Hand-Carved Atlas Clutch', 'Emerald-Cut Minaudière', 'Trouble II Shoulder Bag', 'Intrecciato Olimpia Small Shoulder Bag', 'Rockstud Large Tote', 'Baguette Micro', 'Bindu Small Clutch', 'Emerald-Cut Minaudière', 'Gotham City Hobo', 'Brillant Sellier PM Satchel', 'Flight Weekender Duffel', 'Sac Mesh Bucket Bag', 'Seema Small Satchel', 'Madison Shoulder Bag', 'Sporty Smiley Crossbody', 'Monogram Large Wallet', 'Monogram Card Case']