2013-07-01
1

Is it possible to detect and remove any type of URL in a sentence? How can I detect and remove a URL from a sentence?

For example:

 
Today,wheather is cold.But I want to out. http://weathers.com..... And I will take a cup of tea... 

should become

 
Today,wheather is cold.But I want to out. And I will take a cup of tea... 
+0

Use a regular expression. It is answered here: http://stackoverflow.com/questions/833469/regular-expression-for-url#answer-8234912 – BackSlash

+0

Please define **any type of URL**. 'https://? file:///? ftp://? scp://? smb:// ...?' – Kent

+0

https://, file:///, ftp://, scp://, smb://, ... and also the shortened URLs commonly used on Twitter – reigeki

Answers

3

It depends on how comprehensive you want the matching to be. You could try something simple, such as

str.replaceAll("http://[^\\s]+", "") 

For example,

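// Parenthesize the concatenation so replaceAll runs on the whole sentence,
// not only on the second string literal.
System.out.println(("Today,wheather is cold.But I want to out. " 
     + "http://weathers.com..... And I will take a cup of tea...") 
     .replaceAll("http://[^\\s]+", "")); 
 
Today,wheather is cold.But I want to out.  And I will take a cup of tea... 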
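The simple pattern above only covers `http://`, while the comments also ask about `https://`, `ftp://`, `file:///`, `scp://` and `smb://`. A minimal sketch of one way to widen it, assuming that "URL" means any token of the form scheme, `://`, then non-whitespace. The class name `UrlStripper`, the method `stripUrls` and the pattern itself are illustrative and not part of the original answer:

```java
import java.util.regex.Pattern;

final class UrlStripper {
    // Assumed pattern: a scheme (letter, then letters/digits/+/./-), "://",
    // a run of non-whitespace, and any whitespace that follows the URL,
    // so that no double space is left behind.
    private static final Pattern URL_WITH_SCHEME =
            Pattern.compile("\\b[a-zA-Z][a-zA-Z0-9+.-]*://\\S+\\s*");

    static String stripUrls(String text) {
        return URL_WITH_SCHEME.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripUrls("Today,wheather is cold.But I want to out. "
                + "http://weathers.com..... And I will take a cup of tea..."));
        System.out.println(stripUrls(
                "Logs are at ftp://example.org/logs and smb://host/share today."));
    }
}
```

This still requires an explicit scheme, so links written without one (for example a bare `example.com/page`) are left alone, and punctuation glued to the end of a URL is removed along with it.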

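If you want something more robust that matches valid URLs, use a more comprehensive URL regular expression. Note that the one below is written as a regex literal and is anchored with ^ and $, so as it stands it validates a whole string rather than finding a URL inside a sentence: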

 
/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/ 

For even more thorough matching, refer to this answer.

+0

Real URLs (IRIs) seem to be more complicated: http://stackoverflow.com/a/190405/2040040 – johnchen902

+0

@johnchen902 Yes, I have referenced an answer to that question, thanks. – arshajii

1

Try the regular expression below:

((http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])? 

It matches valid URLs, and the code below should do what you want:

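// The scheme is optional here, so dotted tokens such as "cold.But" would match as well.
String str = "Today,wheather is cold. But I want to out. http://weathers.com..... And I will take a cup of tea"; 
String regularExpression = "(((http|ftp|https):\\/\\/)?[\\w\\-_]+(\\.[\\w\\-_]+)+([\\w\\-\\.,@?^=%&:/~\\+#]*[\\w\\-\\@?^=%&/~\\+#])?)"; 
// The match ends at "weathers.com"; the dots that follow it stay in the sentence.
str = str.replaceAll(regularExpression, ""); 
System.out.println(str); 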
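The question asks about detecting URLs as well as removing them. A small sketch of the detection half, reusing the expression from this answer unchanged with `Matcher.find()`. The class name `UrlFinder` and the method `findUrls` are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlFinder {
    // Same expression as in the answer, compiled once so it can be reused.
    private static final Pattern URL = Pattern.compile(
            "(((http|ftp|https):\\/\\/)?[\\w\\-_]+(\\.[\\w\\-_]+)+"
            + "([\\w\\-\\.,@?^=%&:/~\\+#]*[\\w\\-\\@?^=%&/~\\+#])?)");

    // Collects every substring the pattern matches, in order of appearance.
    static List<String> findUrls(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findUrls("Today,wheather is cold. But I want to out. "
                + "http://weathers.com..... And I will take a cup of tea"));
        // Without a space after the full stop, "cold.But" is reported as well.
        System.out.println(findUrls("wheather is cold.But I want to out."));
    }
}
```

Listing the matches before deleting anything is a cheap way to see false positives: because the scheme is optional, any two words joined by a dot count as a URL.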

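Edit:

However, this regular expression does not work for every type of URL: URL syntax is complex, and it is very hard to find a perfect regular expression that matches all types of URLs.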
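Since a perfect regular expression is hard to come by, a different approach (not the one used in this answer) is to split the sentence on whitespace and let `java.net.URI` parse each token. A rough sketch, assuming a token counts as a URL when it parses, has a scheme, and its scheme-specific part starts with `//`. The names `UriTokenFilter`, `looksLikeUrl` and `stripUrls` are hypothetical:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.StringJoiner;

final class UriTokenFilter {
    // Assumption: "URL" means a parseable URI with a scheme followed by "//".
    static boolean looksLikeUrl(String token) {
        try {
            URI uri = new URI(token);
            return uri.getScheme() != null
                    && uri.getRawSchemeSpecificPart().startsWith("//");
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI, so keep it as plain text
        }
    }

    // Rebuilds the sentence from the tokens that do not look like URLs.
    static String stripUrls(String text) {
        StringJoiner kept = new StringJoiner(" ");
        for (String token : text.trim().split("\\s+")) {
            if (!looksLikeUrl(token)) {
                kept.add(token);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripUrls("Today,wheather is cold. But I want to out. "
                + "http://weathers.com..... And I will take a cup of tea"));
    }
}
```

The trade-off is that runs of whitespace collapse to single spaces and scheme-less links are still missed, but the scheme list no longer has to be spelled out by hand.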