2017-08-07 55 views
1

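VBA Selenium FindElementByXPath cannot find element

I have written a VBA macro that uses the Selenium Chrome driver to open a web page and scrape data. I have run into several problems and would appreciate your advice.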

Code example and result 1: On Error Resume Next enabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition?page=4" 
On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot]

Code example and result 2: On Error Resume Next disabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition?page=4" 
'On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot of the error]

Code example and result 3: On Error Resume Next enabled

Sub test_supplements_store() 
    Dim driver As New ChromeDriver 
    Dim post As Object 

    i = 1 

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition" 
On Error Resume Next 
    For Each post In driver.FindElementsByClass("desc") 
     Cells(i, 1) = post.FindElementByTag("a").Attribute("title") 
     Cells(i, 2) = Trim(Split(post.FindElementByClass("size").Text, ":")(1)) 
     Cells(i, 3) = post.FindElementByXPath(".//span[@class='now']//span[@class='pricetype-purchase-unit multi-price']//span[@class='blu-price blu-price-initialised']").Text 
     Cells(i, 4) = post.FindElementByTag("a").Attribute("href") 
     i = i + 1 
    Next post 
End Sub 

[Screenshot]

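The first example returns all 74 items from the site, but without prices, and it takes a long time: about two minutes.

The second example writes only the title into the first cell of the worksheet and then stops with an error.

The third example returns only 21 items, and it misses the price of items that have no "now" tag. The script runs very quickly, in under 10 seconds.

Please advise how to return all 74 items with title, size, price, and href.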

+0

What exact error do you get? StaleElement? –

+0

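I'm not sure what you mean, because the error screenshot is attached to the second example. The first and third examples do not raise any error. – Martin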

+1

OK, thanks. I haven't worked with VB, but this is the approach I use to get around stale elements in Java. https://stackoverflow.com/questions/45434381/stale-object-reference-while-navigation-using-selenium/45435158#45435158 –

Answers

1

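The page you are dealing with uses lazy loading. That is, the items do not all load at once; the rest load as you scroll down. I used a small JavaScript snippet in the code to scroll to the bottom until nothing more loads, which solves that problem. I hope this is the result you were looking for.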

Sub test_supplements_store()
    Dim driver As New ChromeDriver
    Dim post As Object
    Dim i As Long
    Dim PrevPageHeight As Variant, CurrentPageHeight As Variant
    Dim EndofPage As Boolean

    driver.Get "https://www.thesupplementstore.co.uk/brands/optimum_nutrition"
    ' Skip any field that is missing for an item instead of stopping the macro
    On Error Resume Next

    ' Scroll to the bottom repeatedly until the page height stops growing,
    ' so that every lazily loaded item is present in the DOM
    Do While EndofPage = False
        PrevPageHeight = CurrentPageHeight
        CurrentPageHeight = driver.ExecuteScript("window.scrollTo(0, document.body.scrollHeight);var CurrentPageHeight=document.body.scrollHeight;return CurrentPageHeight;")
        driver.Wait 3000   ' give the next batch of items time to load
        If PrevPageHeight = CurrentPageHeight Then
            EndofPage = True
        End If
    Loop

    ' One row per product: title, size, price, link
    For Each post In driver.FindElementsByXPath("//li[contains(@class,'prod')]")
        i = i + 1
        Cells(i, 1) = post.FindElementByXPath(".//a").Attribute("title")
        Cells(i, 2) = Split(post.FindElementByXPath(".//p[@class='size']").Text, ": ")(1)
        Cells(i, 3) = post.FindElementByXPath(".//p[@class='price']//span[@class='now']//span|.//p[@class='price']//span[@class='dynamictype-single']").Text
        Cells(i, 4) = post.FindElementByXPath(".//a").Attribute("href")
    Next post
End Sub
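
The code above relies on On Error Resume Next to skip fields that an item does not have, which also hides every other error. One possible alternative, shown here only as a sketch, is to look up optional fields with FindElementsByXPath and take the first match if there is one. TextOrEmpty is a hypothetical helper name, column 5 is an arbitrary choice, and the two XPath expressions are simply the two halves of the combined expression above; whether they match every item on the live page is an assumption.

```vba
' Returns the text of the first element under parent that matches xpath,
' or an empty string when nothing matches.
' TextOrEmpty is a hypothetical helper, not part of SeleniumBasic.
Function TextOrEmpty(ByVal parent As Object, ByVal xpath As String) As String
    Dim el As Object
    For Each el In parent.FindElementsByXPath(xpath)
        TextOrEmpty = el.Text
        Exit Function          ' only the first match is needed
    Next el
    ' No match: the function returns its default value, ""
End Function

' Possible use inside the For Each loop (post is one product item):
'     Cells(i, 3) = TextOrEmpty(post, ".//p[@class='price']//span[@class='now']//span")
'     Cells(i, 5) = TextOrEmpty(post, ".//p[@class='price']//span[@class='dynamictype-single']")
```

Writing the two price variants to separate columns keeps them apart, so an item that carries both kinds of markup would show both values, and an item with neither leaves both cells empty.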
+0

You had another requirement that I didn't notice. Using XPath will solve the price problem. – SIM

+0

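Unfortunately, your code returns prices only for the first 21 items on the page. Also, I'm not sure how to return an item's normal price and its "now" price together. – Martin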

+0

I didn't adjust your price part. I was trying to get all 74 items. – SIM
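
To narrow down why only some items get a price, it may help to count how many product items actually contain each kind of price markup once the page has been scrolled to the end. The following is a diagnostic sketch only: CountPriceMarkup is a hypothetical name, the class names are taken from the XPath expressions used in this thread, and the use of contains() for the second class is a guess in case that element carries more than one class.

```vba
' Diagnostic sketch: prints how many product items contain each price markup.
' Call it after the scrolling loop, passing the same driver object.
Sub CountPriceMarkup(ByVal driver As Object)
    Const ITEM_XPATH As String = "//li[contains(@class,'prod')]"

    ' Total number of product items currently in the DOM
    Debug.Print "items: " & driver.FindElementsByXPath(ITEM_XPATH).Count

    ' Items that contain a span with class 'now'
    Debug.Print "with 'now' price: " & _
        driver.FindElementsByXPath(ITEM_XPATH & "[.//span[@class='now']]").Count

    ' Items that contain a span whose class includes 'dynamictype-single'
    Debug.Print "with 'dynamictype-single' price: " & _
        driver.FindElementsByXPath(ITEM_XPATH & "[.//span[contains(@class,'dynamictype-single')]]").Count
End Sub
```

If the two price counts together fall short of the item count, the remaining items presumably use some other markup, which would have to be inspected in the browser's developer tools.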