2017-04-22 68 views
1

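I'm trying to parse the phone numbers from the contact details on a web page, but when I run my script it only grabs the first part of each category and ignores the rest because of the <br> tags. For example, from the Contact Details category it grabs only the name, not the phone number or the fax. I hope someone can give me an idea of how to get the rest. Here is what I've tried: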

Sub RestData() 
Dim http As New MSXML2.XMLHTTP60 
Dim html As New HTMLDocument 
Dim ele As Object, post As Object 

With CreateObject("MSXML2.serverXMLHTTP") 
    .Open "GET", "http://www.austrade.gov.au/SupplierDetails.aspx?ORGID=ORG0120000508&folderid=1736", False 
    .send 
    html.body.innerHTML = .responseText 
End With 
Set ele = html.getElementsByClassName("contact-details block dark")(0).getElementsByTagName("p") 
    For Each post In ele 
     x = x + 1 
     Cells(x, 1) = post.innerText 
    Next post 

Set html = Nothing: Set ele = Nothing: Set docs = Nothing 
End Sub 

The HTML element:

<p>Company Name: Vaucraft Braford Stud<br>Phone: +61 7 4942 4859<br>Fax: +61 7 4942 0618<br>Email: <a href="mailto:[email protected]">[email protected]</a><br>Web: <a target="_blank" href="http://www.vaucraftbrafords.com.au">http://www.vaucraftbrafords.com.au</a></p> 

Answers

1

You could try something like this...

Sub RestData()
    ' Requires a reference to "Microsoft HTML Object Library" for HTMLDocument.
    Dim html As New HTMLDocument
    Dim ele As Object
    Dim TypeDetails() As String
    Dim TypeDetail() As String
    Dim i As Long, r As Long

    With CreateObject("MSXML2.serverXMLHTTP")
        .Open "GET", "http://www.austrade.gov.au/SupplierDetails.aspx?ORGID=ORG0120000508&folderid=1736", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' The third <p> inside the contact-details block holds the <br>-separated lines.
    Set ele = html.getElementsByClassName("contact-details block dark")(0).getElementsByTagName("p")(2)
    r = 2
    ' innerText turns each <br> into a line break; drop any CR and split on LF.
    TypeDetails() = Split(Replace(ele.innerText, vbCr, ""), vbLf)

    For i = 0 To UBound(TypeDetails)
        ' Limit the split to 2 parts so the colon in "http://..." stays in the value.
        TypeDetail() = Split(TypeDetails(i), ":", 2)
        ' Skip lines that have no "Label: value" shape.
        If UBound(TypeDetail) = 1 Then
            Cells(r, 1) = VBA.Trim(TypeDetail(0))
            Cells(r, 2) = VBA.Trim(TypeDetail(1))
            r = r + 1
        End If
    Next i

    Set html = Nothing: Set ele = Nothing
End Sub
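
The splitting step can be tried on its own, without any network request. The following is only a minimal sketch: the sample text is copied from the <p> element in the question with each <br> written as a line feed (an assumption about how innerText renders it), the email line is left out, and the Sub name `DemoSplitContactLines` is made up for illustration. Passing 2 as the limit to `Split` keeps the colon inside the web address from being treated as a separator.

```vba
Option Explicit

' Hypothetical demo: splits "Label: value" lines and prints them
' to the Immediate window (Ctrl+G in the VBA editor).
Sub DemoSplitContactLines()
    Dim sample As String
    Dim entries() As String
    Dim parts() As String
    Dim i As Long

    ' Sample text taken from the question's HTML; vbLf stands in for each <br>.
    sample = "Company Name: Vaucraft Braford Stud" & vbLf & _
             "Phone: +61 7 4942 4859" & vbLf & _
             "Fax: +61 7 4942 0618" & vbLf & _
             "Web: http://www.vaucraftbrafords.com.au"

    ' Remove any carriage returns, then split into one entry per line.
    entries = Split(Replace(sample, vbCr, ""), vbLf)

    For i = LBound(entries) To UBound(entries)
        ' Split at the first colon only (limit = 2).
        parts = Split(entries(i), ":", 2)
        ' Print only lines that actually contain a label and a value.
        If UBound(parts) = 1 Then
            Debug.Print Trim$(parts(0)) & " -> " & Trim$(parts(1))
        End If
    Next i
End Sub
```

Run it from the VBA editor and check the Immediate window; each label should appear next to its full value, including the complete web address.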
+0

Oh my god, you're a gem of a man. Thank you, sir, for such a robust and wonderful solution. This is very new to me, I mean the style you used here. Thanks again. – SIM

+0

You're welcome! Glad it worked. Thanks for the compliment. :) – sktneer

+1

@SMth80 Two things to note: you can call `createDocumentFromUrl()` to get an `HTMLDocument` directly from the URL (see this question: http://stackoverflow.com/questions/9995257), which gets rid of all the MSXML2.serverXmlHttp stuff. You can also use `.querySelectorAll(".contact-details.block.dark p")` to simplify the DOM traversal. – Tomalak
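
A rough sketch of what that last comment describes might look like the following. It is untested against the live page and rests on several assumptions: the project references "Microsoft HTML Object Library", the document loaded by `createDocumentFromUrl` runs in a mode that supports `querySelectorAll`, and the page still uses the `contact-details block dark` classes on a single element. The Sub name `RestDataViaSelector` and the constant `PAGE_URL` are made up for illustration.

```vba
Option Explicit

' Hypothetical sketch: load the page with createDocumentFromUrl and
' select the <p> elements with one CSS selector.
Sub RestDataViaSelector()
    Const PAGE_URL As String = "http://www.austrade.gov.au/SupplierDetails.aspx?ORGID=ORG0120000508&folderid=1736"
    Dim loader As New HTMLDocument
    Dim page As HTMLDocument
    Dim nodes As Object
    Dim i As Long

    ' createDocumentFromUrl loads asynchronously, so wait until it finishes.
    ' Note: this loop has no timeout; a production version would want one.
    Set page = loader.createDocumentFromUrl(PAGE_URL, vbNullString)
    Do While page.readyState <> "complete"
        DoEvents
    Loop

    ' Compound class selector (no spaces) because all three classes
    ' sit on the same element; " p" then selects its <p> descendants.
    Set nodes = page.querySelectorAll(".contact-details.block.dark p")

    ' Index-based loop over the returned node list.
    For i = 0 To nodes.Length - 1
        Cells(i + 1, 1) = nodes.Item(i).innerText
    Next i
End Sub
```

This writes each paragraph's full text into column A; the line-by-line splitting from the answer would still be needed to separate phone, fax and the other fields.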