I want to pull all the links from a specific web page using Python

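I want to be able to pull all the URLs from the following web page using Python: https://yeezysupply.com/pages/all. I tried some other suggestions I found, but they don't seem to work with this particular site. I end up finding no URLs at all.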

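import urllib.request
import lxml.html

# Fetch the raw HTML of the page
connection = urllib.request.urlopen('https://yeezysupply.com/pages/all')

# Parse the response into an lxml DOM tree
dom = lxml.html.fromstring(connection.read())

# Print the href attribute of every anchor tag
for link in dom.xpath('//a/@href'):
    print(link)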


2017-06-06 Josh Bijari

There are no links in the page's source code; they are inserted with JavaScript after the page has loaded in the browser.
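To illustrate the point, here is a minimal sketch using only the standard library's `html.parser`. The `SAMPLE_HTML` string, the `LinkCollector` class, and the `extract_links` helper are all made up for this example and are not taken from the real site; the sample simply mimics a page where one link sits in the markup and another is added by a script. A static parser only ever sees the first kind.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag present in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all hrefs found in the raw HTML text (no JavaScript is run)."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page: one link in the markup, one created later by a script.
SAMPLE_HTML = """
<html><body>
<a href="/cart">Cart</a>
<div id="products"></div>
<script>
  var a = document.createElement("a");
  a.href = "/products/example";
  document.getElementById("products").appendChild(a);
</script>
</body></html>
"""

static_links = extract_links(SAMPLE_HTML)
print(static_links)  # only the link that exists in the markup itself
```

The script-created link never shows up because nothing executes the script. If a page really does build its links this way, fetching the HTML alone will not be enough; one would need a tool that runs the JavaScript, such as a browser driven from Python.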


2017-06-06 00:32:50

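Perhaps you could make use of modules designed specifically for this. Here's a quick and dirty script that gets the relative links on the page:

#!/usr/bin/python3

import requests
import bs4

# Download the page
res = requests.get('https://yeezysupply.com/pages/all')

# Parse the HTML and keep only the anchors that actually carry an href
soup = bs4.BeautifulSoup(res.text, 'html.parser')
links = soup.find_all('a', href=True)

for link in links:
    print(link['href'])

It produces output like this: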

/pages/jewelry 
/pages/clothing 
/pages/footwear 
/pages/all 
/cart 
/products/womens-boucle-dress-bleach/?back=%2Fpages%2Fall 
/products/double-sleeve-sweatshirt-bleach/?back=%2Fpages%2Fall 
/products/boxy-fit-zip-up-hoodie-light-sand/?back=%2Fpages%2Fall 
/products/womens-boucle-skirt-cream/?back=%2Fpages%2Fall 
etc...

Is this what you're looking for? Requests and Beautiful Soup are amazing scraping tools.
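Since the output consists of relative paths, here is a small sketch of how they could be turned into full URLs with the standard library's `urllib.parse.urljoin`. The `to_absolute` helper and the `BASE_URL` constant are names introduced for this example only, and the sample paths are copied from the output above.

```python
from urllib.parse import urljoin

# The page the links were scraped from; relative paths are resolved against it.
BASE_URL = "https://yeezysupply.com/pages/all"


def to_absolute(base_url, hrefs):
    """Resolve each (possibly relative) href against base_url."""
    return [urljoin(base_url, href) for href in hrefs]


# A few of the relative paths from the output above.
relative_links = [
    "/pages/jewelry",
    "/cart",
    "/products/womens-boucle-dress-bleach/?back=%2Fpages%2Fall",
]

absolute_links = to_absolute(BASE_URL, relative_links)
for url in absolute_links:
    print(url)
```

`urljoin` leaves hrefs that are already absolute untouched, so the same helper should work on a mixed list.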


2017-06-06 00:45:01 Nalaurien

Yes, thank you, this is exactly what I was looking for.
