2015-09-26 26 views
0

我想從鏈接列表中下載多個文件。我找到鏈接的網站受到保護。這就是爲什麼我想使用IE(使用當前會話/ cookie)。每個鏈接的目標是一個xml文件。這些文件太大而無法打開然後保存。所以我需要直接保存它們(右鍵單擊,保存目標爲)。VBA宏從IE中的鏈接下載多個文件

的鏈接列表如下:

<html> 
<body> 
<p> <a href="https://example.com/report?_hhhh=XML"Link A</a><br>> </p> 
<p> <a href="https://example.com/report?_aaaa=XML"Link B</a><br>> </p> 
... 
</body> 
</html> 

我想通過各個環節循環,節省每一個目標。目前我遇到了「另存爲」的問題。我真的不知道該怎麼做。這是我的代碼到目前爲止:

Sub DownloadAllLinks() 

Dim IE As Object 
Dim Document As Object 
Dim List As Object 
Dim Link As Object 

' Before I logged in to the website 
Set IE = CreateObject("InternetExplorer.Application") 
IE.Visible = True 
IE.Navigate ("https:\\......\links.html") 

Do While IE.Busy 
    DoEvents 
Loop 

' Detect all links on website 
Set Document = IE.Document 
Set List = Document.getElementsByTagName("a") 

' Loop through all links to download them 

For Each Link In List 

' Now I need to automate "save target as"/right-click and then "save as" 
... 

Next Link 
End Sub 

你有任何想法自動化「另存爲」每個鏈接?

任何幫助表示讚賞。非常感謝, 烏利

+0

這是一個兔子洞,我已經倒了很多次。簡短的回答是停止嘗試讓IE充當代理來下載文件。使用xmlHttp對象通過GetResponseHeader登錄並收集/返回認證,然後使用ADO流保存文件。 – Jeeped

+0

[This](http://stackoverflow.com/a/32429348/2165759)可能會有所幫助。 – omegastripes

回答

0

下面是我適合你的情況很常見的例子,它顯示了XHR和正則表達式的使用檢索網頁的HTML內容,從中提取的所有鏈接,並下載各個環節的目標文件:

Option Explicit 

Sub Test() 
    ' declare vars 
    Dim sUrl As String 
    Dim sReqProt As String 
    Dim sReqAddr As String 
    Dim sReqPath As String 
    Dim sContent As String 
    Dim oLinks As Object 
    Dim oMatch As Object 
    Dim sHref As String 
    Dim sHrefProt As String 
    Dim sHrefAddr As String 
    Dim sHrefPath As String 
    Dim sHrefFull As String 
    Dim n As Long 
    Dim aContent() As Byte 
    ' set source URL 
    sUrl = "https:\\......\links.html" 
    ' process source URL 
    SplitUrl sUrl, sReqProt, sReqAddr, sReqPath 
    If sReqProt = "" Then sReqProt = "http:" 
    sUrl = sReqProt & "//" & sReqAddr & "/" & sReqPath 
    ' retrieve source page HTML content 
    With CreateObject("Microsoft.XMLHTTP") 
     .Open "GET", sUrl, False 
     .Send 
     sContent = .ResponseText 
    End With 
    ' parse source page HTML content to extract all links 
    Set oLinks = CreateObject("Scripting.Dictionary") 
    With CreateObject("VBScript.RegExp") 
     .Global = True 
     .MultiLine = True 
     .IgnoreCase = True 
     .Pattern = "<a.*?href *= *(?:'|"")(.*?)(?:'|"").*?>" 
     For Each oMatch In .Execute(sContent) 
      sHref = oMatch.subMatches(0) 
      SplitUrl sHref, sHrefProt, sHrefAddr, sHrefPath 
      If sHrefProt = "" Then sHrefProt = sReqProt 
      If sHrefAddr = "" Then sHrefAddr = sReqAddr 
      sHrefFull = sHrefProt & "//" & sHrefAddr & "/" & sHrefPath 
      oLinks(oLinks.Count) = sHrefFull 
     Next 
    End With 
    ' save each link target into file 
    For Each n In oLinks 
     sHref = oLinks(n) 
     With CreateObject("Microsoft.XMLHTTP") 
      .Open "GET", sHref, False 
      .Send 
      aContent = .ResponseBody 
     End With 
     With CreateObject("ADODB.Stream") 
      .Type = 1 ' adTypeBinary 
      .Open 
      .Write aContent 
      .SaveToFile "C:\Test\" & n & ".xml", 2 ' adSaveCreateOverWrite 
      .Close 
     End With 
    Next 
End Sub 

Sub SplitUrl(sUrl, sProt, sAddr, sPath) 
    ' extract protocol, address and path from URL 
    Dim aSplit 
    aSplit = Split(sUrl, "//") 
    If UBound(aSplit) = 0 Then 
     sProt = "" 
     sAddr = sUrl 
    Else 
     sProt = aSplit(0) 
     sAddr = aSplit(1) 
    End If 
    aSplit = Split(sAddr, "/") 
    If UBound(aSplit) = 0 Then 
     sPath = sAddr 
     sAddr = "" 
    Else 
     sPath = Mid(sAddr, Len(aSplit(0)) + 2) 
     sAddr = aSplit(0) 
    End If 
End Sub 

此方法不採用IE自動化。通常情況下,IE的Cookie的過程足以引用當前會話,因此如果您的網站不使用附加程序進行身份驗證並生成鏈接列表,那麼該方法應該適用於您。