Unable to fetch a PDF file as binary data

I am trying to fetch a PDF file from the following URL:

URL: https://domain_name/xyz/_id/download/

The URL does not point directly to a PDF file; each unique file is downloaded by interpreting a particular _id field.

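When I put this link in the browser's address bar, the PDF file is downloaded immediately. But when I try to fetch it through HttpsURLConnection, its Content-Type is 'text/html', whereas it should be 'application/pdf'.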

I also tried calling setRequestProperty with 'application/pdf' before connecting, but the file is always downloaded as 'text/html'.

The request method I am using is "GET".

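1) Do I need to use HttpClient instead of HttpsURLConnection?

2) Are these kinds of links used to increase security?

3) Please point out my mistakes.

4) How can I find out the name of the file as it exists on the server?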

Below is the main code I have implemented:

    URL url = new URL(sb.toString()); 

    //created new connection 
    HttpsURLConnection urlConnection = (HttpsURLConnection) url.openConnection(); 

    //have set the request method and property 
    urlConnection.setRequestMethod("GET"); 
    urlConnection.setDoOutput(true); 
    urlConnection.setRequestProperty("Content-Type", "application/pdf"); 

    Log.e("Content Type--->", urlConnection.getContentType()+" "+ urlConnection.getResponseCode()+" "+ urlConnection.getResponseMessage()+"    "+urlConnection.getHeaderField("Content-Type")); 

    //and connecting! 
    urlConnection.connect(); 

    //setting the path where we want to save the file 
    //in this case, going to save it on the root directory of the 
    //sd card. 
    File SDCardRoot = Environment.getExternalStorageDirectory(); 

    //created a new file, specifying the path, and the filename 

    File file = new File(SDCardRoot,"example.pdf"); 

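    //external storage that is mounted read-only cannot be written to, so stop early 
    if (Environment.MEDIA_MOUNTED_READ_ONLY.equals(Environment.getExternalStorageState())) { 
        throw new IOException("External storage is mounted read-only"); 
    } 

    //writing the downloaded data into the file we created 
    FileOutputStream fileOutput = new FileOutputStream(file); 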

    //this will be used in reading the data from the internet 
    InputStream inputStream = urlConnection.getInputStream(); 

    //this is the total size of the file 
    int totalSize = urlConnection.getContentLength(); 

    //variable to store total downloaded bytes 
    Log.e("Total File Size ---->", ""+totalSize); 
    int downloadedSize = 0; 

    //create a buffer... 
    byte[] buffer = new byte[1024]; 
    int bufferLength = 0; //used to store a temporary size of the buffer 

    //Reading through the input buffer and write the contents to the file 
    while ((bufferLength = inputStream.read(buffer)) > 0) { 

     //add the data in the buffer to the file in the file output stream (the file on the sd card) 
     fileOutput.write(buffer, 0, bufferLength); 


     //adding up the size 
     downloadedSize += bufferLength; 

     //reporting the progress: 
     Log.e("This much downloaded---->",""+ downloadedSize); 

    } 
    //closed the output stream 
    fileOutput.close();

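I have searched a lot and could not find an answer. If possible, please explain my mistakes in detail, since this is the first time I am implementing something like this.

**I also tried fetching direct PDF links, such as http://labs.google.com/papers/bigtable-osdi06.pdf. They download without any trouble, and their 'Content-Type' is 'application/pdf'.**

Thanks.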

Source

2011-03-10 iabhi

Have you checked the MIME type of the server response? – 2011-03-10 08:05:41

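This thread led me to the solution to my problem! When you try to download a streamed PDF from a WebView and you use HttpURLConnection, you also need to pass along the cookies from the WebView.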

String cookie = CookieManager.getInstance().getCookie(url.toString()); 
if (cookie != null) connection.setRequestProperty("cookie", cookie);
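
A minimal sketch of how the cookie could fit into the request setup, written in plain Java so it runs outside Android. The class name PdfRequest, the method open, and the parameter cookieHeader are hypothetical; on Android the cookie string would come from CookieManager as shown above. The sketch also sends Accept rather than Content-Type, because Content-Type on a request describes the request body and not the desired response, and it leaves setDoOutput alone because the Android documentation states that HttpURLConnection uses POST once setDoOutput(true) has been called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class PdfRequest {
    /**
     * Prepares (but does not send) a GET request for a PDF download.
     * cookieHeader may be null; it is a hypothetical parameter standing in
     * for the value obtained from the WebView's cookie store.
     */
    public static HttpURLConnection open(String url, String cookieHeader) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            // Accept says what we would like back; it is only a preference,
            // the server may still answer with another type.
            conn.setRequestProperty("Accept", "application/pdf");
            if (cookieHeader != null) {
                // Without the session cookie the server may answer with an HTML page.
                conn.setRequestProperty("Cookie", cookieHeader);
            }
            // setDoOutput(true) is deliberately not called: it is meant for
            // requests that carry a body.
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned connection can then be read with getInputStream() exactly as in the question's download loop.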

Source

2012-11-23 14:00:16 Predders

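Theory 1: The server is responding with an incorrect content type. If the server code was written and deployed by you, check that code.

Theory 2: The URL returns an HTML page containing some JavaScript that redirects to the URL of the actual PDF file.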
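
One way to tell the two theories apart is to look at the first bytes of the downloaded body instead of trusting the Content-Type header: a PDF file starts with the characters %PDF-, while an HTML page normally starts with markup. The following is a rough sketch only; the class name BodySniffer and its methods are hypothetical, and the HTML check is a simple heuristic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class BodySniffer {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** True if the given leading bytes of a response body carry the PDF header. */
    public static boolean looksLikePdf(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Heuristic: true if the body appears to start with an HTML document. */
    public static boolean looksLikeHtml(byte[] head) {
        if (head == null) {
            return false;
        }
        String start = new String(head, StandardCharsets.ISO_8859_1)
                .trim()
                .toLowerCase(Locale.ROOT);
        return start.startsWith("<!doctype html") || start.startsWith("<html");
    }
}
```

If looksLikePdf returns true for the first buffer read from the stream, the body is a PDF and only the header is mislabeled (Theory 1). If looksLikeHtml returns true, the server really did send a page (Theory 2, or a login page caused by a missing cookie).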

Source

2011-03-10 08:18:50 Nishan

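The URL I am trying to open has some inline PDF rendering, which shows the PDF file embedded in a web page. Do you think that could be the problem? Firefox renders it inside the web page, but when I open the link in Chrome it downloads the file. So, is there anything I can do to fetch the PDF directly as binary instead of receiving 'text/html', or do changes need to be made on the server side? I did not deploy the server code. – iabhi 2011-03-10 14:39:02

@al-sutton @nishan I have checked it with Firebug, which shows it as an application/pdf object. So, do I need to change something to access the PDF embedded in the web page? – iabhi 2011-03-11 05:50:16

Also, I can download the exact file size of the PDF, but as 'text/html' instead of 'application/pdf', so it shows "Cannot open file type text/html". – iabhi 2011-03-11 05:58:02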
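
On question 4 of the original post (how to learn the file name used on the server): download endpoints often suggest a name in the Content-Disposition response header. Whether this particular server sends that header is an assumption. The sketch below, with the hypothetical class DispositionName, parses only the plain filename= form and falls back to a caller-supplied name otherwise.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DispositionName {
    // Matches filename="report.pdf" or filename=report.pdf.
    // The encoded filename*= form is intentionally not handled here.
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the suggested file name, or fallback if none can be found. */
    public static String fileName(String contentDisposition, String fallback) {
        if (contentDisposition == null) {
            return fallback;
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return fallback;
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Keep only the last path segment so a header value cannot
        // steer the file outside the target directory.
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(cut + 1);
        return name.isEmpty() ? fallback : name;
    }
}
```

With the connection from the question, a call could look like DispositionName.fileName(urlConnection.getHeaderField("Content-Disposition"), "example.pdf"). Saving the file under a .pdf name also gives a viewer something other than the reported 'text/html' type to go on.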
