2016-11-25

Reading and manipulating HTML with Excel VBA

Say I have a page like the one below, saved at C:\TEMP\html_page.html:

<html> 
    <head> 
     <link rel="stylesheet" href="styles.css"> 
    </head> 
    <body> 
     <div id="xxx1"> 
     <img src="test.png"> 
     </div> 
    </body> 
</html> 

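I would like to adjust the img src attribute programmatically, based on Excel data and VBA. Basically, find the div via XPath and adjust the (single) img tag it contains.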

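I found an example of manipulating XML from VBA through the XML library, but I have been struggling to get this working with the HTML object library; I can't find any examples or documentation.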

Dim XDoc As Object, root As Object 
Dim html_path As String 

html_path = "C:\TEMP\html_page.html" 

Set XDoc = CreateObject("MSXML2.DOMDocument") 
XDoc.async = False: XDoc.validateOnParse = False 

If XDoc.Load(html_path) Then 
    Debug.Print "Document loaded" 
Else 
    Dim strErrText As String 
    ' Late bound; this is MSXML2.IXMLDOMParseError when the library is referenced 
    Dim xPE As Object 
    ' Obtain the ParseError object 
    Set xPE = XDoc.parseError 
    With xPE 
        strErrText = "Your XML Document failed to load " & _ 
            "due to the following error." & vbCrLf & _ 
            "Error #: " & .errorCode & ": " & .reason & _ 
            "Line #: " & .Line & vbCrLf & _ 
            "Line Position: " & .linepos & vbCrLf & _ 
            "Position In File: " & .filepos & vbCrLf & _ 
            "Source Text: " & .srcText & vbCrLf & _ 
            "Document URL: " & .URL 
    End With 
    MsgBox strErrText, vbExclamation 
End If 

All I want to do is:

' ... 
Set outer_div = XDoc.selectSingleNode("//div[@id='xxx1']") 
' ... edit the img attribute 

But I can't load the HTML page, because it is not well-formed XML (the img tag is not closed).

Any help is greatly appreciated. Oh, and I can't use another language such as Python, bummer.

Answers

This isn't exactly what you asked for, but it may be close enough. Instead of using the XML library, use the HTML library:

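Sub changeImg() 

    Dim dom As Object 
    Dim img As Object 
    Dim src As String 

    ' Late-bound HTML document; no project reference needed 
    Set dom = CreateObject("htmlfile") 

    ' Read the whole file into a string 
    Open "C:\temp\test.html" For Input As #1 
        src = Input$(LOF(1), 1) 
    Close #1 

    ' Let the HTML parser (not the XML parser) build the DOM 
    dom.body.innerHTML = src 

    ' First img element in the document 
    Set img = dom.getElementsByTagName("img")(0) 

    img.src = "..." 

    ' Write the modified markup back to the same file 
    Open "C:\temp\test.html" For Output As #1 
        Print #1, dom.documentElement.outerHTML 
    Close #1 

End Sub 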

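The catch is that the resulting file will have a head node added, and the tag names will be uppercase. If you can live with that, this solution will work for you.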

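As an aside, if you want to do anything more in depth, with better selectors, consider early binding. The HTML interface exposed when early bound differs from the late-bound one and supports more features. You will need to add a reference to the HTML Object Library: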

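Sub changeImg() 

    ' Early bound: requires a reference to Microsoft HTML Object Library 
    Dim dom As HTMLDocument 
    Dim img As Object 
    Dim src As String 

    Set dom = CreateObject("htmlfile") 

    ' Read the whole file into a string 
    Open "C:\temp\test.html" For Input As #1 
        src = Input$(LOF(1), 1) 
    Close #1 

    ' Let the HTML parser build the DOM 
    dom.body.innerHTML = src 

    ' First img element in the document 
    Set img = dom.getElementsByTagName("img")(0) 

    img.src = "..." 

    ' Write the modified markup back to the same file 
    Open "C:\temp\test.html" For Output As #1 
        Print #1, dom.documentElement.outerHTML 
    Close #1 

End Sub 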
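
To illustrate what the early-bound interface makes possible, here is a minimal sketch that targets the img inside a specific div with a CSS selector instead of taking the first img in the document. It assumes a reference to the Microsoft HTML Object Library; the names SetImgSrc and DemoSetImgSrc are hypothetical and do not come from the thread.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library" (Tools > References).

' Hypothetical helper: parses html, sets the src of the first img inside
' the div whose id is divId, and returns the resulting markup.
Function SetImgSrc(ByVal html As String, ByVal divId As String, _
                   ByVal newSrc As String) As String
    Dim dom As MSHTML.HTMLDocument
    Dim img As Object

    Set dom = New MSHTML.HTMLDocument
    dom.body.innerHTML = html

    ' CSS selector: an img that is a descendant of the div with this id
    Set img = dom.querySelector("div[id='" & divId & "'] img")
    If Not img Is Nothing Then
        img.setAttribute "src", newSrc
    End If

    SetImgSrc = dom.documentElement.outerHTML
End Function

Sub DemoSetImgSrc()
    ' Prints the rewritten markup to the Immediate window
    Debug.Print SetImgSrc("<div id=""xxx1""><img src=""test.png""></div>", _
                          "xxx1", "new.png")
End Sub
```

The same caveat as above applies: the returned markup is what the HTML parser serializes, so it may differ in layout and tag casing from the input.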

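Thanks a lot! It seems I'm almost there: the question wasn't 100% accurate. I'm looking for a solution that works for multi-line HTML files. I tried to work out how to adapt the code, but haven't succeeded yet. Would you mind adding that to the answer? – MattV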

@MattV, sorry, I must be missing something: why doesn't this work for multi-line files? Let me know and I'll update. – SWa
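
The thread does not say what actually failed with multi-line files. If the failure is in reading the file (an assumption, not something the commenters confirm), one thing to try is reading it in binary mode, which returns the content with its line breaks exactly as stored and avoids the character-count mismatch that Input$ with LOF can hit on some files. The name ReadAllText is hypothetical.

```vba
Option Explicit

' Hypothetical helper: reads a whole file into a string with a single
' binary read, so line breaks come back exactly as they are stored.
Function ReadAllText(ByVal path As String) As String
    Dim f As Integer
    Dim buf As String

    f = FreeFile                        ' next free file number
    Open path For Binary Access Read As #f
    buf = Space$(LOF(f))                ' size the buffer to the file length in bytes
    Get #f, , buf                       ' fill the buffer with the file content
    Close #f

    ReadAllText = buf
End Function
```

With this in place, the read step in changeImg would become src = ReadAllText("C:\temp\test.html"). The content is interpreted in the system ANSI code page, so this sketch is not suitable as-is for UTF-8 files containing non-ASCII text.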

For this you can use doc.querySelector("div[id='xxx1'] img"). To change the src attribute, use img.setAttribute "src", "new.png". HTH

Option Explicit 

' Add reference to Microsoft Internet Controls (SHDocVw) 
' Add reference to Microsoft HTML Object Library 

Sub Demo() 
    Dim ie As SHDocVw.InternetExplorer 
    Dim doc As MSHTML.HTMLDocument 
    Dim url As String 

    url = "file:///C:/Temp/StackOverflow/html/html_page.html" 
    Set ie = New SHDocVw.InternetExplorer 
    ie.Visible = True 
    ie.navigate url 
    While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend 
    Set doc = ie.document 

    Dim img As HTMLImg 
    Set img = doc.querySelector("div[id='xxx1'] img") 
    If Not img Is Nothing Then 
     img.setAttribute "src", "new.png" 
    End If 
    ie.Quit 
End Sub
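

Note that setAttribute only changes the document loaded in Internet Explorer; the file on disk is untouched. Assuming the goal is still to update the file, as in the question, a minimal sketch for writing the markup back could look like the following. The name WriteAllText is hypothetical, and the approach is borrowed from the first answer rather than being part of this one.

```vba
Option Explicit

' Hypothetical helper: overwrites the file at path with the given markup.
Sub WriteAllText(ByVal path As String, ByVal markup As String)
    Dim f As Integer

    f = FreeFile                 ' next free file number
    Open path For Output As #f   ' creates or truncates the file
    Print #f, markup
    Close #f
End Sub
```

It would be called before ie.Quit, for example WriteAllText "C:\Temp\StackOverflow\html\html_page.html", doc.documentElement.outerHTML. As with the first answer, the saved markup is the parser's serialization, so its layout may differ from the original file.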