2017-03-06

I can get the elements matched by an XPath query with this code:

from lxml import html 
import requests 

page = requests.get('http://monument.pl/pol_m_DESKOROLKA_Deski-162.html') 
tree = html.fromstring(page.content) 
VENDORLISTn = tree.xpath('//a[@class="firm_name"]/text()') 
print(VENDORLISTn)

Run against the product list on that page, it gives me the following result:

['Almost', 'Almost', 'Almost', 'Enjoi', 'Real', 'Boulevard', 'Almost', 'Almost', 'Enjoi', 'Enjoi', 'Enjoi', 'Blind', 'Blind', 'Blind', 'Blind', 'Blind', 'Blind', 'Blind', 'Cliche', 'Blind', 'Blind', 'Blind', 'Enjoi', 'Enjoi', 'Enjoi', 'Enjoi', 'Enjoi', 'Enjoi', 'Enjoi', 'Antihero'] 

How can I get a list of the paths of these elements? It might look like this:

['//*[@id="search"]/table/tbody/tr[1]/td[1]/div/div[3]/div/a','//*[@id="search"]/table/tbody/tr[1]/td[2]/div/div[3]/div/a',etc.... 

Answer

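In the question, VENDORLISTn is just a list of text strings, not elements. Rather than trying to derive XPaths from those strings, select the <a> elements themselves and get the absolute XPath of each link as follows: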

from lxml import html
import requests

page = requests.get('http://monument.pl/pol_m_DESKOROLKA_Deski-162.html')
tree = html.fromstring(page.content)
# Select the <a> elements themselves, not their text() nodes
VENDORLISTn = tree.xpath('//a[@class="firm_name"]')
# getpath() lives on the ElementTree that owns the parsed document
root_tree = tree.getroottree()
for link in VENDORLISTn:
    print(root_tree.getpath(link))

Output:

/html/body/div[1]/div/div[2]/div/div[2]/div/div[7]/table/tr[1]/td[1]/div/div[3]/div/a
/html/body/div[1]/div/div[2]/div/div[2]/div/div[7]/table/tr[1]/td[2]/div/div[3]/div/a
/html/body/div[1]/div/div[2]/div/div[2]/div/div[7]/table/tr[1]/td[3]/div/div[3]/div/a
....
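If you want each vendor name together with its path, a minimal sketch along these lines may help. It assumes lxml's default "smart strings", where each text() result offers getparent() to reach the element that owns it. The SAMPLE_HTML markup and the vendor_paths helper are made up for illustration, so the example runs without fetching the real page.

```python
from lxml import html

# Hypothetical stand-in for the product table; the real page is not fetched here.
SAMPLE_HTML = """
<html><body>
  <div id="search">
    <table>
      <tr>
        <td><div><a class="firm_name" href="#">Almost</a></div></td>
        <td><div><a class="firm_name" href="#">Enjoi</a></div></td>
      </tr>
      <tr>
        <td><div><a class="firm_name" href="#">Blind</a></div></td>
      </tr>
    </table>
  </div>
</body></html>
"""


def vendor_paths(document):
    """Return (vendor name, absolute XPath) pairs for every firm_name link."""
    root_tree = document.getroottree()
    pairs = []
    for text in document.xpath('//a[@class="firm_name"]/text()'):
        # A smart string remembers its owner: getparent() is the <a> element.
        link = text.getparent()
        pairs.append((str(text), root_tree.getpath(link)))
    return pairs


document = html.fromstring(SAMPLE_HTML)
for name, path in vendor_paths(document):
    print(name, path)
```

Each returned path can be passed back to document.xpath(path) to reach the same link again. Paths copied from a browser's developer tools often contain tbody because browsers insert that element while parsing tables, which is likely why the paths lxml reports differ from the ones in the question.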