Cutting URLs out of a string without regular expressions

For example:

string1 = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol more text bla bla" 
(string2, urls) = wild_magic_appears(string1) 
string2 = "bla bla bla more blah blah more text bla bla" 
urls = ["http://bla.domain.com", "nohttp.domain.with.no.protocol"]

I know that regular expressions are the best tool for this, but I'm interested in a solution that does not use them.

Source

2013-12-17 ov7a

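You could split the string on spaces (' ') and consider each word separately. How wild the magic gets depends on what you want to match; the simplest requirement would be "any word that starts with http:// or https://, or that contains more than one dot". – CompuChip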
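
A minimal Ruby sketch of the rule described in that comment might look like the following. The name wild_magic_appears is only the question's placeholder, and the "more than one dot" test is a heuristic assumption rather than a real URL check, so a word such as "wait..." would be misclassified. Rejoining the remaining words with single spaces also collapses any runs of whitespace.

```ruby
# Hypothetical helper named after the question's placeholder function.
# Assumption: a "URL" is any whitespace-separated word that starts with
# http:// or https://, or that contains more than one dot.
def wild_magic_appears(text)
  # partition returns [words matching the block, all other words]
  urls, words = text.split.partition do |word|
    word.start_with?("http://", "https://") || word.count(".") > 1
  end
  [words.join(" "), urls]
end

string1 = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol more text bla bla"
string2, urls = wild_magic_appears(string1)
puts string2
p urls
```

On the sample string from the question this yields the string2 and urls values the question asks for.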

In C#, you can do this for URLs that start with "http://":

using System.Collections.Generic;

string str1 = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol";
string[] array = str1.Split(' ');          // split the text into words
List<string> urls = new List<string>();

foreach (var s in array)
{
    // Add other conditions here to match more kinds of URLs.
    if (s.StartsWith("http://"))
    {
        urls.Add(s);
    }
}

Source

2013-12-17 08:24:34

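Very simple. For anyone who finds this question while looking for a solution, I suggest detecting URLs by protocol names, dots, and a list of top-level domains (as I did). – ov7a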
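
The commenter's actual code is not shown, so the following is only a guess at what such a check could look like in Ruby. The constants PROTOCOLS and TLDS and the helpers looks_like_url? and extract_urls are hypothetical names, and both lists are deliberately tiny. A real list of top-level domains would be much longer.

```ruby
# Illustrative lists only; both are assumptions, not taken from the thread.
PROTOCOLS = ["http://", "https://", "ftp://"].freeze
TLDS = ["com", "org", "net", "io"].freeze

# Treat a word as a URL if it starts with a known protocol, or if its host
# part contains a dot and ends in a known top-level domain.
def looks_like_url?(word)
  return true if word.start_with?(*PROTOCOLS)
  host = word.split("/").first.to_s   # drop any path after the host
  labels = host.split(".")
  labels.length > 1 && TLDS.include?(labels.last.downcase)
end

# Returns [text without URLs, list of URLs], splitting on whitespace.
def extract_urls(text)
  urls, words = text.split.partition { |w| looks_like_url?(w) }
  [words.join(" "), urls]
end

rest, urls = extract_urls("see example.org and https://bla.domain.com or file.txt today")
puts rest
p urls
```

Requiring a known top-level domain avoids matching file names such as file.txt, at the cost of missing hosts whose suffix is not in the list. Words with trailing punctuation, such as a comma after the domain, are not handled here.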

Ruby, splitting on colons and whitespace.

This only works if the URL starts with http:// and the rest of your string contains no colons.

>a = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol more text bla bla" 
>a.split(":")[0].to_s[-4..-1] + ":" + a.split(":")[1].split()[0].to_s 
=> "http://bla.domain.com"

For URLs that only have dots, I can't think of a good solution.

Source

2013-12-17 08:46:46 oyss

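This is a very narrow solution. It is not a good fit for user text that contains ':'. – ov7a

I thought of a new solution: just split on "http://" or "https://". This one copes better with colons in the user's text.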

>a = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol more text bla bla" 
>("http://"+a.split("http://")[1].to_s).split()[0] 
=>"http://bla.domain.com"

Source

2013-12-18 09:01:28 oyss
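
The split on "http://" in the second Ruby answer returns only the first URL and does not cover "https://". One possible way to collect every prefixed URL, still without regular expressions, is to scan with String#index. The helper name urls_by_prefix is hypothetical, and the sketch assumes a URL ends at the next space character (tabs and newlines are not treated as delimiters).

```ruby
# Collect every URL introduced by one of the given scheme prefixes.
# Assumption: a URL runs from the prefix up to the next space or end of text.
def urls_by_prefix(text, prefixes = ["http://", "https://"])
  urls = []
  pos = 0
  loop do
    # Earliest occurrence of any prefix at or after pos (nil if none is left).
    start = prefixes.map { |prefix| text.index(prefix, pos) }.compact.min
    break if start.nil?
    stop = text.index(" ", start) || text.length
    urls << text[start...stop]
    pos = stop
  end
  urls
end

a = "bla bla bla http://bla.domain.com more blah blah nohttp.domain.with.no.protocol more text bla bla"
p urls_by_prefix(a)
```

Like the answer it builds on, this does not find URLs that have no protocol, such as nohttp.domain.with.no.protocol.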
