2013-03-19 123 views
1

Handling URLs with umlauts and other special characters

I have the following URL:

https://mantis.server.company/download/test/0022450-umlauts_öä_üüü_and_special_chars_%&$#.pdf 

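There is no way to encode the string beforehand. I simply have to process this string (I know it is not a valid URL string) so that the file located behind that path can be opened.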

String url = "https://mantis-daun.server.company/download/test/0022450-umlauts_öä_üüü_and_special_chars_%&$#.pdf"; 

try {
    url = URLDecoder.decode(url, "UTF-8");
    URL myConnection = new URL(url);
    URLConnection connectMe = myConnection.openConnection();
    // Only for error processing
    HttpURLConnection httpConn = (HttpURLConnection) connectMe;
    InputStream is;
    if (httpConn.getResponseCode() >= 400) {
        is = httpConn.getErrorStream();
    } else {
        is = httpConn.getInputStream();
    }
    BufferedReader rd = new BufferedReader(new InputStreamReader(is));
    String line;
    while ((line = rd.readLine()) != null) {
        System.out.println("-----" + line);
    }
    rd.close();
    InputStream in = connectMe.getInputStream();
    BufferedInputStream bin = new BufferedInputStream(in);
    byte[] buffer = new byte[(int) connectMe.getContentLength()];
    int fi = 0;
    while (fi < buffer.length) {
        fi = fi + bin.read(buffer, fi, buffer.length - fi);
    }
    bin.close();
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

With this approach I get:

Exception in thread "main" java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: "&$" 
    at java.net.URLDecoder.decode(URLDecoder.java:173) 
    at org.mssql.main.MSSQLAccess.main(MSSQLAccess.java:34) 

With url = url.replaceAll("%", "%25"); I get:

-----<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> 
-----<html><head> 
-----<title>400 Bad Request</title> 
-----</head><body> 
-----<h1>Bad Request</h1> 
-----<p>Your browser sent a request that this server could not understand.<br /> 
-----</p> 
-----<hr> 
java.io.IOException: Server returned HTTP response code: 400 for URL: https://mantis-daun.server.company/download/test/0022450-umlauts_öä_üüü_and_special_chars_%&$#.pdf 
-----<address>Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny16 with Suhosin-Patch mod_ssl/2.2.9 OpenSSL/0.9.8o Server at mantis-daun.server.company Port 443</address> 
-----</body></html> 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513) 
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485) 
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139) 
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234) 
    at org.mssql.main.MSSQLAccess.main(MSSQLAccess.java:51) 
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: https://mantis-daun.server.company/download/test/0022450-umlauts_öä_üüü_and_special_chars_%&$#.pdf 
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436) 
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379) 
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318) 
at org.mssql.main.MSSQLAccess.main(MSSQLAccess.java:39) 

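If I try to open the "URL" in a normal browser, I also get a "400: Bad Request".

So, is there a way to handle a string with umlauts and special characters so that it can be used as a "URL"?

Or could there also be a problem with the server settings?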

Answers

0

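First, as Xavjer pointed out, the URL needs to be encoded. Next, it makes sense to split the URL and encode only the "text" parts of the path. The domain name is not encoded (if you have a non-Latin domain name, it must be encoded according to Punycode), and the path separators must be preserved as well (which is not the case when you encode the whole URL). So you encode only "download", "test" and the file name + extension part.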
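A minimal sketch of the splitting approach described above. The class name PathSegmentEncoder and the method encodePath are hypothetical, not taken from the answer. The sketch leaves the scheme and host untouched, splits the path on '/', and runs each segment through URLEncoder. It assumes the path carries no query string, and it swaps '+' for '%20' because URLEncoder implements form encoding rather than path encoding. The charset is a parameter because it is the server that decides which one it expects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the original answer.
final class PathSegmentEncoder {

    /** Percent-encodes every segment of rawPath and keeps the '/' separators. */
    static String encodePath(String rawPath, String charset) {
        StringBuilder out = new StringBuilder();
        // Limit -1 keeps trailing empty segments, so a trailing '/' survives.
        String[] segments = rawPath.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            try {
                // URLEncoder writes a space as '+'; in a path it has to be %20.
                out.append(URLEncoder.encode(segments[i], charset).replace("+", "%20"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unsupported charset: " + charset, e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Scheme and host stay unencoded; only the path is processed.
        String base = "https://mantis-daun.server.company";
        // The umlauts are written as Unicode escapes so the source compiles
        // regardless of the file encoding.
        String rawPath = "/download/test/0022450-umlauts_\u00f6\u00e4_\u00fc\u00fc\u00fc"
                + "_and_special_chars_%&$#.pdf";
        System.out.println(base + encodePath(rawPath, "UTF-8"));
    }
}
```

For the path from the question and "UTF-8", encodePath produces /download/test/0022450-umlauts_%C3%B6%C3%A4_%C3%BC%C3%BC%C3%BC_and_special_chars_%25%26%24%23.pdf, with all three path separators intact.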

With url2 = "0022450-umlauts_öä_üüü_and_special_chars_%&$#.pdf"; url2 = URLEncoder.encode(url2, "UTF-8"); url = url + "/download/test/" + url2; I get: java.io.FileNotFoundException: https://mantis-daun.server.company/download/test/0022450-umlauts_%C3%B6%C3%A4_%C3%BC%C3%BC%C3%BC_and_special_chars_%25%26%24%23.pdf. But the file definitely exists on the file system. – sk2212 2013-03-19 08:54:47

Well, you could try "ISO-8859-1" instead of "UTF-8", but if that does not work either, your path is not correct. – Xavjer 2013-03-19 09:06:11

That did the trick! Using "ISO-8859-1", together with the answer's advice to "encode" only parts of the URL, it works fine now. – sk2212 2013-03-19 09:12:33
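Why the charset makes a difference can be seen by encoding the same umlauts twice. The sketch below is illustrative only (the class CharsetComparison and its encode method are made-up names). UTF-8 turns each umlaut into two bytes, ISO-8859-1 into one, so the two percent-encoded file names differ and a server that expects one will not find a file requested with the other. That this server wants ISO-8859-1 is what the comments above report; the code cannot detect it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative comparison; the class and method names are assumptions.
final class CharsetComparison {

    /** Wraps URLEncoder.encode so callers need not handle the checked exception. */
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // o-umlaut and a-umlaut as Unicode escapes, independent of file encoding.
        String umlauts = "\u00f6\u00e4";
        System.out.println("UTF-8:      " + encode(umlauts, "UTF-8"));      // %C3%B6%C3%A4
        System.out.println("ISO-8859-1: " + encode(umlauts, "ISO-8859-1")); // %F6%E4
    }
}
```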

1

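Well, you are trying to decode the URL, but you actually need to encode it to get what you want. It crashes because it tries to decode %&$, which is not a valid hex sequence...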

Encoding would result in: https%3A%2F%2Fmantis-daun.server.company%2Fdownload%2Ftest%2F0022450-umlauts_%C3%B6%C3%A4_%C3%BC%C3%BC%C3%BC_and_special_chars_%25%26%24%23.pdf
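The difference between the two directions can be checked in isolation. In this small sketch (DecodeVersusEncode, canDecode and roundTrip are hypothetical names), URLDecoder rejects the raw file-name tail because "%" must be followed by two hex digits, while encoding first and decoding afterwards returns the original text.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative only; the class and method names are assumptions.
final class DecodeVersusEncode {

    /** Returns false when URLDecoder rejects the text as malformed. */
    static boolean canDecode(String text) {
        try {
            URLDecoder.decode(text, "UTF-8");
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown for a '%' that is not followed by two hex digits.
            return false;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Encodes the text and decodes the result again. */
    static String roundTrip(String text) {
        try {
            return URLDecoder.decode(URLEncoder.encode(text, "UTF-8"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String raw = "special_chars_%&$#.pdf";
        System.out.println("raw decodable:     " + canDecode(raw));
        System.out.println("round trip intact: " + raw.equals(roundTrip(raw)));
    }
}
```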

Sure, but this URL is not usable either -> java.net.MalformedURLException: no protocol: https%3A%2F%2Fmantis-daun.server.company%2Fdownload%2Ftest%2F0022450-umlauts_%C3%B6%C3%A4_%C3%BC%C3%BC%C3%BC_and_special_chars_%25%26%24%23.pdf – sk2212 2013-03-19 08:44:46