2017-09-14

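HTML parsing in VBA

I'm trying to update fund sizes from Morningstar's website. My earlier attempt with IE automation was unsuccessful, so I switched to an XML HTTP request (which is also a lot faster). Now I can't print the right line from the document I get back from the site. I want the code to give me the third "td" tag inside the "tr" tag whose first "td" tag is called "Fund Size (Mil)". So the code loops through the headings of all "td" tags and, if it finds "{line heading}" = "Fund Size (Mil)", jumps into action. Here is the problem: I don't know how to refer to those headings. I tried setting each "td" tag to a variable (there are 3 "td" tags in a "tr" tag, so I had variables row1, row2, row3, one per "td" tag), but when I do this I get run-time error 438, "Object doesn't support this property or method", on this line: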

Debug.Print TDElements.getElementsByTagName("tr")(0).Cells(0).innerHTML 

Also, when I Debug.Print TDElement.innerHTML, I don't see the "td" tags I need. When I put .innerText at the end instead, I see all the "td" tags.

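So the questions are: 1) How can I refer to the headings directly? (See the commented-out lines inside the For Each loop below.) 2) Why don't I see all the td tags with .innerHTML when I do see them with .innerText?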

URL: http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW. Excel 2010, IE version 11.

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

Dim myurl As String 
Dim TDElement As Object 
Dim TDElements As IHTMLElementCollection 
Dim IE As MSXML2.XMLHTTP60 

Dim HTMLDoc As MSHTML.HTMLDocument 
Dim HTMLBody As MSHTML.HTMLBody 

Set IE = New MSXML2.XMLHTTP60 
Set HTMLDoc = New MSHTML.HTMLDocument 
Set HTMLBody = HTMLDoc.body 


myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
IE.Open "GET", myurl, False 
IE.send 

HTMLBody.innerHTML = IE.responseText 

Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     Debug.Print TDElement.innerText '.innerText/.innerHTML. Can't see the fund size with .innerHTML?? 
'  If "{line heading}" = "Fund Size (Mil)" Then 'How can I refer to headings in the html document? 
'   Worksheets("Sheet3").Range("B3") = Split("{line text}", ";")(1) 'reference to line text? 
    Next 


End Sub 

h2so4's answer fixed the problem above. What follows is an extension of the original question.

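Now TDElement is assigned several lines of text (I changed h2so4's value of 10 to 3, so the code shows the next 3 lines once it finds the fund string). How can I parse this further? The current line Worksheets("helper").Cells(x, 6).Value = Split(TDElement.innerText, " ")(1) returns the value I need (769.28), but it would help in future if I really understood what is happening here.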

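So, just as an exercise, how would I get all 3 lines printed in their own cells? The output would then be: Fund Size (Mil), 31/08/2017, 769.28 in Cells(x, 6), (x, 7), (x, 8). When I try to apply the Split or Left function to TDElement, the function only targets the last line and not the lines above it. Yet when I Debug.Print TDElement.innerText/.innerHTML, I see the other lines too. So how can I "access" the lines above the last one?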

Output of Debug.Print TDElement.innerText:

Fund Size (Mil)

31/08/2017

EUR 769.28

Answers


The code below will get you the "Fund Size" lines.

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

    Dim myurl As String 
    Dim TDElement As Object 
    Dim TDElements As IHTMLElementCollection 
    Dim IE As MSXML2.XMLHTTP60 
    Dim Flag As Boolean 
    Dim HTMLDoc As MSHTML.HTMLDocument 
    Dim HTMLBody As MSHTML.HTMLBody 
    Dim k As Long 
    Set IE = New MSXML2.XMLHTTP60 
    Set HTMLDoc = New MSHTML.HTMLDocument 
    Set HTMLBody = HTMLDoc.body 


    myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
    IE.Open "GET", myurl, False 
    IE.send 

    HTMLDoc.body.innerHTML = IE.responseText 
    Flag = False 
    k = 0 
    Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     If InStr(TDElement.innerText, "Fund Size") <> 0 Or Flag Then 
      'Once the "Fund Size" string is found, print this cell and the cells that follow it (up to 10 in total)
      Debug.Print ":" & TDElement.innerText
      k = k + 1 
      If k < 10 Then Flag = True Else Flag = False 
     End If 
    Next 


End Sub 
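
On the original question of referring to a row's heading directly: one option is to loop over the "tr" elements rather than the "td" elements and read each row's cells by index. The line that raised error 438 called getElementsByTagName on TDElements, which is a collection of "td" cells and not a document or element, and that is most likely why the call was rejected. The following is a minimal sketch, not a tested solution for the live page. It assumes the heading, date and value sit in three "td" cells of the same "tr"; the markup in the html variable is a hypothetical stand-in for IE.responseText, and the procedure name is made up for illustration.

```vba
Sub FundSizeByRowHeading()
    'Late-bound HTML document, so no library reference is required
    Dim doc As Object, row As Object
    Dim html As String

    'Hypothetical markup standing in for IE.responseText (assumed structure)
    html = "<table>" & _
           "<tr><td>Fund Size (Mil)</td><td>31/08/2017</td><td>EUR 769.28</td></tr>" & _
           "<tr><td>Some other heading</td><td>-</td><td>-</td></tr>" & _
           "</table>"

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = html

    For Each row In doc.getElementsByTagName("tr")
        'Only rows with at least three cells can have a "third td"
        If row.Cells.Length >= 3 Then
            'Cells(0) is the heading cell, Cells(2) is the third cell of the same row
            If Trim$(row.Cells(0).innerText) = "Fund Size (Mil)" Then
                Debug.Print row.Cells(2).innerText
                Exit For
            End If
        End If
    Next row
End Sub
```

If the real page nests tables inside cells, an outer row's innerText will also contain the heading text, so comparing the first cell's text exactly (as above) is usually safer than using InStr on the whole row.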

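Thank you very much! This works. I modified your code by declaring a new integer variable n and setting it to 3 instead of 10 (just the lines I need). However, I'd like to know why I can only parse the last line the code returns. Please see the extension to the original question above. – Samppa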


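To answer your extension: how to split the answer depends on how the page is designed. Here is one possible solution that puts your data in 3 different cells.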

Sub XMLhttpRequestTest2() 

'Microsoft XML, v 6.0 
'Microsoft HTML object library, used in parsing HTML 

    Dim myurl As String 
    Dim TDElement As Object 
    Dim TDElements As IHTMLElementCollection 
    Dim IE As MSXML2.XMLHTTP60 
    Dim Flag As Boolean 
    Dim HTMLDoc As MSHTML.HTMLDocument 
    Dim HTMLBody As MSHTML.HTMLBody 
    Dim k As Long, text 
    Set IE = New MSXML2.XMLHTTP60 
    Set HTMLDoc = New MSHTML.HTMLDocument 
    Set HTMLBody = HTMLDoc.body 


    myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW" 
    IE.Open "GET", myurl, False 
    IE.send 

    HTMLDoc.body.innerHTML = IE.responseText 
    Flag = False 
    k = 0 
    Set TDElements = HTMLDoc.getElementsByTagName("td") 
    For Each TDElement In TDElements 
     If InStr(TDElement.innerText, "Fund Size") <> 0 Or Flag Then 
      'Once the "Fund Size" string is found, process this cell and the next 2 cells
      text = Split(TDElement.innerText, vbLf)
      'Split returns an empty array (UBound = -1) for an empty string, so check before indexing
      If UBound(text) >= 0 Then
       If text(0) <> "" Then
        'Write each line of the cell text into its own column on row 3
        Worksheets("Sheet3").Cells(3, k + 2).Resize(, UBound(text) + 1) = text
       End If
      End If
      k = k + 1 
      If k < 3 Then Flag = True Else Flag = False 
     End If 
    Next 
End Sub
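
Whether splitting on vbLf alone is enough depends on which line breaks innerText actually contains; if they turn out to be vbCrLf, each piece may keep a trailing vbCr, and blank lines produce empty pieces. As a hedged sketch, the helper below normalises the line breaks first and keeps only the non-empty lines. NonEmptyLines and DemoNonEmptyLines are hypothetical names introduced here, and the sample string is modelled on the Debug.Print output quoted in the question, not fetched from the site.

```vba
'Hypothetical helper: return the non-empty lines of a string, whatever the line-break style
Function NonEmptyLines(ByVal s As String) As Collection
    Dim result As New Collection
    Dim part As Variant

    If Len(s) > 0 Then
        'Normalise CR+LF and lone CR to LF before splitting
        s = Replace(s, vbCrLf, vbLf)
        s = Replace(s, vbCr, vbLf)

        For Each part In Split(s, vbLf)
            'Skip blank lines and strip surrounding spaces
            If Len(Trim$(part)) > 0 Then result.Add Trim$(part)
        Next part
    End If

    Set NonEmptyLines = result
End Function

Sub DemoNonEmptyLines()
    Dim parts As Collection, i As Long
    Dim sample As String

    'Sample text modelled on the output shown in the question (assumed line breaks)
    sample = "Fund Size (Mil)" & vbCrLf & vbCrLf & "31/08/2017" & vbCrLf & vbCrLf & "EUR 769.28"

    Set parts = NonEmptyLines(sample)
    For i = 1 To parts.Count
        'In a workbook this could become Worksheets("helper").Cells(x, 5 + i).Value = parts(i)
        Debug.Print i, parts(i)
    Next i
End Sub
```

With the sample text, the third item still carries the currency prefix; applying Split(parts(3), " ")(1), as in the question, would leave just the number, assuming the separator really is an ordinary space.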