2012-05-29 36 views
1

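Regex for links in a string: exclude the dot only at the end of a line

I created a regular expression that reads a string and converts any URLs it finds into HTML links. I want to exclude a dot at the end of a line (right after the link text), but my pattern also excludes dots inside the link itself (as in http://www.website.com/page.html). Here the final dot should be excluded, but the one in .html should not. This is my regex: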

$text = preg_replace("#(^|[\n \"\'\(<;:,\*])((www|ftp)\.+[a-zA-Z0-9\-_]+\.[^ \"\'\t\n\r< \[\]\),>;:.\*]*)#", "\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $text);       

How can I do this?

Thanks! Tom

Answers

4

Change your regular expression to this:

\b((?#protocol)https?|ftp)://((?#domain)[-A-Z0-9.]+)((?#file)/[-A-Z0-9+&@#/%=~_|!:,.;]*)?((?#parameters)\?[A-Z0-9+&@#/%=~_|!:,.;]*)? 
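As a rough illustration of what this first pattern captures, here is a minimal PHP sketch. The `{}` delimiters, the `i` flag (the character classes list only `A-Z`), and the sample strings are assumptions added for the example, not part of the answer. It also shows that this variant keeps a sentence-ending dot, because `.` is allowed anywhere in the file part.

```php
<?php
// First pattern from the answer. {} is used as the delimiter because the
// pattern itself contains #, /, ~ and other common delimiter characters.
// The i flag is an assumption: the classes only list upper-case A-Z.
$pattern = '{\b((?#protocol)https?|ftp)://((?#domain)[-A-Z0-9.]+)'
         . '((?#file)/[-A-Z0-9+&@#/%=~_|!:,.;]*)?'
         . '((?#parameters)\?[A-Z0-9+&@#/%=~_|!:,.;]*)?}i';

// Hypothetical sample input with a query string.
preg_match($pattern, 'Visit http://www.website.com/page.html?id=5 now', $parts);
// $parts[1] = protocol, [2] = domain, [3] = file, [4] = parameters
print_r($parts);

// Hypothetical sample input that ends a sentence with a dot.
preg_match($pattern, 'See http://www.website.com/page.html.', $dotted);
// The file group still ends with the sentence's dot here.
echo $dotted[3], PHP_EOL;
```

So this pattern is handy for splitting a URL into its parts, but on its own it does not appear to solve the trailing-dot problem from the question.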

or this:

\b((?:https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|$!:,.;]*[A-Z0-9+&@#/%=~_|$]*)\b 

Explanation

" 
\b                              # Assert position at a word boundary
(                               # Match the regular expression below and capture its match into backreference number 1
  (?:                           # Match the regular expression below without capturing it
                                # Match either the regular expression below (attempting the next alternative only if this one fails)
      http                      # Match the characters “http” literally
      s                         # Match the character “s” literally
        ?                       # Between zero and one times, as many times as possible, giving back as needed (greedy)
    |                           # Or match regular expression number 2 below (attempting the next alternative only if this one fails)
      ftp                       # Match the characters “ftp” literally
    |                           # Or match regular expression number 3 below (the entire group fails if this one fails to match)
      file                      # Match the characters “file” literally
  )
  ://                           # Match the characters “://” literally
  [-A-Z0-9+&@#/%?=~_|\$!:,.;]   # Match a single character present in the list below
                                # The character “-”
                                # A character in the range between “A” and “Z”
                                # A character in the range between “0” and “9”
                                # One of the characters “+&@#/%?=~_|\$!:,.;”
    *                           # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  [A-Z0-9+&@#/%=~_|\$]          # Match a single character present in the list below
                                # A character in the range between “A” and “Z”
                                # A character in the range between “0” and “9”
                                # One of the characters “+&@#/%=~_|\$”
    *                           # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
\b                              # Assert position at a word boundary
" 
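To see how the second pattern behaves with the trailing dot, here is a minimal PHP sketch. The `{}` delimiters, the `i` flag, the sample text and the replacement string (modelled on the one in the question) are assumptions for illustration only.

```php
<?php
// Second pattern from the answer. The {} delimiters and the i flag are
// assumptions: the pattern contains # and /, and only lists A-Z.
$pattern = '{\b((?:https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|$!:,.;]*[A-Z0-9+&@#/%=~_|$]*)\b}i';

// Hypothetical sample text: the first dot ends the sentence and should
// stay outside the link.
$text = 'See http://www.website.com/page.html. Next sentence.';

// $1 is the whole URL captured by the outer group.
$linked = preg_replace($pattern, '<a href="$1" target="_blank">$1</a>', $text);

echo $linked, PHP_EOL;
// See <a href="http://www.website.com/page.html" target="_blank">http://www.website.com/page.html</a>. Next sentence.
```

The closing `\b` is what forces the match to give back the final dot: there is no word boundary between `.` and a space, so the engine backtracks to just after `html`. The same rule also drops other trailing non-word characters, for example a final `/`, which may or may not be what you want.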

Hope this helps.
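Both patterns above require an explicit scheme (`http://`, `ftp://`), whereas the pattern in the question matches bare `www.` and `ftp.` prefixes. If that behaviour has to stay, one possible adaptation (a sketch, not part of the original answer) is to stop excluding `.` in the last character class and add a negative lookbehind `(?<!\.)` so the match cannot end on a dot. The sample text is hypothetical.

```php
<?php
// The question's pattern with two changes (both are assumptions of this sketch):
//   1. "." is removed from the final negated class, so dots inside the
//      link (such as the one in ".html") are matched;
//   2. (?<!\.) is appended, so the match is not allowed to end on a dot.
$pattern = "#(^|[\n \"\'\(<;:,\*])((www|ftp)\.+[a-zA-Z0-9\-_]+\.[^ \"\'\t\n\r< \[\]\),>;:\*]*(?<!\.))#";

// Hypothetical sample text.
$text = 'Visit www.website.com/page.html. Thanks!';

// Same replacement as in the question: \1 is the leading character,
// \2 is the link text.
$result = preg_replace($pattern, "\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $text);

echo $result, PHP_EOL;
// Visit <a href="http://www.website.com/page.html" target="_blank">www.website.com/page.html</a>. Thanks!
```

Because the final class is greedy, it first swallows the trailing dot; the lookbehind then fails and the engine gives the dot back, which leaves it outside the link.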

+0

Thanks in advance! This looks promising. I'll check it later, because testing it in our setup takes quite a while. I'll definitely come back to it. – tvgemert
