Programmatically downloading a file served through a PHP page

Some PHP sites use a page that acts as a middleman for handling file downloads.

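With a browser this works transparently. There seems to be a short pause while the PHP page processes the request.

However, attempting the same download from Java with URL or HttpURLConnection returns a plain HTML page. How can I get the file download to work the way it does in a browser?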

Edit: here is an example link:

http://depot.eice.be/index.php?annee_g=jour&cours=poo

Edit: here is some of the code I have been testing:

// This returns an HTML page

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

private void downloadURL(String theURL) {
    HttpURLConnection conn = null;
    try {
        URL url = new URL(theURL);
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.connect();

        if (conn.getResponseCode() != HttpURLConnection.HTTP_OK)
            return;

        // Copy the response body to a file named after the URL
        try (InputStream in = conn.getInputStream();
             FileOutputStream fos = new FileOutputStream(getNameFromUrl(theURL))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                fos.write(buffer, 0, n);
            }
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Release the connection even if the copy failed
        if (conn != null)
            conn.disconnect();
    }
}

// Returns everything after the last '/' in the URL, query string included.
// If the URL contains no '/', the whole string is returned.

public String getNameFromUrl(String url) {

    int slashIndex = url.lastIndexOf('/');

    System.out.println("url:" + url + "," + slashIndex);

    // lastIndexOf returns -1 when there is no '/', so this becomes substring(0)
    return url.substring(slashIndex + 1);
}
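
One thing worth noting about the name derivation above: for the example link it yields "index.php?annee_g=jour&cours=poo", and the '?' is not a valid file name character on some file systems. A minimal sketch of one way around that, assuming it is acceptable to drop the query string; the class name, method name, and fallback parameter are hypothetical and not part of the original code:

```java
import java.net.URI;

public class UrlFileNames {

    // Hypothetical helper: returns the last path segment of a URL,
    // ignoring any query string or fragment. Returns `fallback` when
    // the URL has no usable path segment.
    // Note: URI.create throws IllegalArgumentException for strings that
    // are not valid URIs (for example, ones containing spaces).
    public static String fileNameFromUrl(String url, String fallback) {
        String path = URI.create(url).getPath(); // excludes "?query" and "#fragment"
        if (path == null || path.isEmpty()) {
            return fallback;
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? fallback : name;
    }

    public static void main(String[] args) {
        // prints "index.php"
        System.out.println(fileNameFromUrl(
                "http://depot.eice.be/index.php?annee_g=jour&cours=poo", "download.bin"));
    }
}
```

This only fixes the local file name. It does not change what the server sends back.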


2012-04-23 James P.

Not enough information. Does the page require authentication? Does it use cookies? Did you try following redirects? – 2012-04-23 19:05:21

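Do you mean something like the download link for the Project of the Month on 'Sourceforge.net'? In Sourceforge's case, the download button on the start page has a title attribute ' ...' that is shown as 'project.iso' when you move the mouse over the button, but the button actually links to an HTML page. In that case you just need to follow the link and search the download page for the right link. The download page itself contains a '' that redirects the browser to the download. – andih 2012-04-23 19:33:20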
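
The "follow the link and search the download page for the right link" idea from the comment above could be sketched roughly as below. This is only an illustration: the class and method names are hypothetical, the regular expression handles simple quoted href attributes only, and a real HTML parser would be more robust.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {

    // Matches href="..." or href='...' inside an <a ...> tag (case-insensitive).
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Returns the raw href values found in the HTML, in document order.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    // Resolves a possibly relative link against the URL of the page it came from.
    // Throws IllegalArgumentException if either string is not a valid URI.
    public static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"download.php?f=1\">one</a> <A class='b' HREF='files/a.zip'>two</A>";
        for (String link : extractLinks(html)) {
            System.out.println(resolve("http://example.com/dir/page.html", link));
        }
    }
}
```

Each resolved link could then be requested in turn, checking whether the response is the file or yet another HTML page.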

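@EugeneRetunsky No authentication, no cookies. The link exposes a PHP page (e.g. download.php?f=...) that acts as a middleman. What I would like to know is how to reproduce what a browser does when it encounters such a link. – 2012-04-25 10:13:23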
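
Part of what a browser does with such a link is look at the response headers to decide whether it received a page or a file. A minimal sketch of that check follows, assuming the server sets Content-Type and, for downloads, a Content-Disposition header; whether the site in question does so is not known from this thread. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResponseHeaders {

    // Matches filename="quoted value" or filename=token in a Content-Disposition
    // header. The extended form (filename*=...) is deliberately not handled.
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))", Pattern.CASE_INSENSITIVE);

    // True when the Content-Type header indicates an HTML page rather than a file.
    public static boolean looksLikeHtml(String contentType) {
        return contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith("text/html");
    }

    // Returns the file name suggested by a Content-Disposition header,
    // or null when the header is absent or carries no filename parameter.
    public static String fileNameFrom(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHtml("text/html; charset=UTF-8"));
        System.out.println(fileNameFrom("attachment; filename=\"poo.zip\""));
    }
}
```

With HttpURLConnection, the two header values would come from conn.getContentType() and conn.getHeaderField("Content-Disposition"). If the response looks like HTML, the file is probably behind a link or redirect inside that page.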

I think I have found a solution using HttpUnit. The framework's source is available if you want to see how it handles this.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import com.meterware.httpunit.GetMethodWebRequest;
import com.meterware.httpunit.WebConversation;
import com.meterware.httpunit.WebLink;
import com.meterware.httpunit.WebResponse;

public void downloadURL(String url) throws IOException {

    WebConversation wc = new WebConversation();
    WebResponse indexResp = wc.getResource(new GetMethodWebRequest(url));
    WebLink[] links = new WebLink[0]; // stays empty if the page cannot be parsed
    try {
        links = indexResp.getLinks();
    } catch (SAXException ex) {
        // Log
    }

    // Follow every link on the index page and save whatever it leads to
    for (WebLink link : links) {
        try {
            link.click();
        } catch (SAXException ex) {
            // Log, then skip this link
            continue;
        }
        WebResponse resp = wc.getCurrentPage();
        String fileName = resp.getURL().getFile();
        fileName = fileName.substring(fileName.lastIndexOf("/") + 1);
        System.out.println("filename:" + fileName);
        try (BufferedInputStream bis = new BufferedInputStream(resp.getInputStream());
             BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(fileName))) {
            int i;
            while ((i = bis.read()) != -1) {
                bos.write(i);
            }
        }
    }
    System.out.println("Done downloading.");
}
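
As a side note, the copy loop in the method above could probably be replaced by the standard library's Files.copy, which streams the input to a file in one call. A minimal sketch follows, with a hypothetical helper name and an in-memory stream standing in for the response body:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamSaver {

    // Hypothetical helper: writes everything from `in` to `target`,
    // replacing any existing file, closes the stream, and returns
    // the number of bytes written.
    public static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("download", ".bin");
        long written = save(new ByteArrayInputStream(new byte[] {1, 2, 3}), tmp);
        System.out.println(written + " bytes written to " + tmp);
        Files.delete(tmp);
    }
}
```

In the HttpUnit version, the stream passed in would be resp.getInputStream().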


2012-04-25 12:31:50