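Getting rid of duplicates in the CSV from a spider written in Python

I've written a spider that collects the data I expect. The only problem I'm facing now is that the results contain a lot of duplicates. How can I drop the duplicates before the results are written to the CSV file?
Here is the code: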
import csv
import requests
from lxml import html

def Startpoint():
    global writer
    # Keep the file open for the whole crawl; the with block closes it at the end.
    with open('Data.csv', 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["Name", "Price"])
        address = "https://www.sephora.ae/en/stores/"
        page = requests.get(address)
        tree = html.fromstring(page.text)
        # Top-level category links
        titles = tree.xpath('//li[contains(@class,"level0")]')
        for title in titles:
            href = title.xpath('.//a[contains(@class,"level0")]/@href')[0]
            Layer2(href)

def Layer2(address):
    global writer
    page = requests.get(address)
    tree = html.fromstring(page.text)
    # Sub-category links
    titles = tree.xpath('//li[contains(@class,"amshopby-cat")]')
    for title in titles:
        href = title.xpath('.//a/@href')[0]
        Endpoint(href)

def Endpoint(address):
    global writer
    page = requests.get(address)
    tree = html.fromstring(page.text)
    # One block per product on the listing page
    titles = tree.xpath('//div[@class="product-info"]')
    for title in titles:
        Name = title.xpath('.//div[contains(@class,"h3")]/a[@title]/text()')[0]
        Price = title.xpath('.//span[@class="price"]/text()')[0]
        metco = (Name, Price)
        print(metco)
        writer.writerow(metco)

Startpoint()
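
One possible way to drop the duplicates is to remember every row that has already been written in a `set` and skip any row that shows up again. The sketch below assumes that two rows count as duplicates when both the name and the price match after stripping whitespace. The helper `write_unique_rows` and the sample rows are hypothetical, made up for illustration, and are not part of the spider above.

```python
import csv
import io

def write_unique_rows(rows, outfile):
    """Write (name, price) rows to outfile as CSV, skipping repeats.

    Hypothetical helper: a row is treated as a duplicate when its
    stripped name and price have both been seen before.
    """
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Price"])
    seen = set()      # keys of the rows written so far
    written = 0
    for name, price in rows:
        key = (name.strip(), price.strip())
        if key in seen:
            continue  # already written once, so skip it
        seen.add(key)
        writer.writerow(key)
        written += 1
    return written

# Made-up sample rows standing in for what the spider scrapes.
sample = [
    ("Lipstick", "AED 95"),
    ("Mascara", "AED 120"),
    ("Lipstick", "AED 95"),
]
buffer = io.StringIO()
count = write_unique_rows(sample, buffer)
print(count)
print(buffer.getvalue())
```

In the spider itself, the same idea would presumably amount to keeping a module-level `seen` set and checking `metco` against it in `Endpoint` just before calling `writer.writerow(metco)`.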
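Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error, and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers. See: How to create a Minimal, Complete, and Verifiable example. – DyZ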