2014-02-05 38 views
0

I have a question about regular expressions in PHP.

$content = "START FIRST AAA SECOND AAA";
$content_first = preg_replace('/START(.*)AAA/', 'REPLACED_STRING', $content); 
//$content_first == "REPLACED_STRING" 
$content_second = preg_replace('/START(.*?)AAA/', 'REPLACED_STRING', $content); 
//$content_second == "REPLACED_STRING SECOND AAA" 

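Why are $content_second and $content_first different? What is the purpose of the '?' in the regular expression? I have the following (really extensive) regular expression, and I want to modify it so that it replaces all URLs in a string instead of stopping at the first one, but I can't get it to work (it only finds the first URL in the string):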

$url_pattern = '/# Rev:20100913_0900 github.com\/jmrware\/LinkifyURL 
    # Match http & ftp URL that is not already linkified. 
     # Alternative 1: URL delimited by (parentheses). 
     (\()      # $1 "(" start delimiter. 
     ((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $2: URL. 
     (\))      # $3: ")" end delimiter. 
    | # Alternative 2: URL delimited by [square brackets]. 
     (\[)      # $4: "[" start delimiter. 
     ((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $5: URL. 
     (\])      # $6: "]" end delimiter. 
    | # Alternative 3: URL delimited by {curly braces}. 
     (\{)      # $7: "{" start delimiter. 
     ((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $8: URL. 
     (\})      # $9: "}" end delimiter. 
    | # Alternative 4: URL delimited by <angle brackets>. 
     (<|&(?:lt|\#60|\#x3c);) # $10: "<" start delimiter (or HTML entity). 
     ((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $11: URL. 
     (>|&(?:gt|\#62|\#x3e);) # $12: ">" end delimiter (or HTML entity). 
    | # Alternative 5: URL not delimited by(), [], {} or <>. 
     (      # $13: Prefix proving URL not already linked. 
     (?:^    # Can be a beginning of line or string, or 
     | [^=\s\'"\]]   # a non-"=", non-quote, non-"]", followed by 
     ) \s*[\'"]?   # optional whitespace and optional quote; 
     | [^=\s]\s+    # or... a non-equals sign followed by whitespace. 
    )      # End $13. Non-prelinkified-proof prefix. 
     (\b      # $14: Other non-delimited URL. 
     (?:ht|f)tps?:\/\/  # Required literal http, https, ftp or ftps prefix. 
     [a-z0-9\-._~!$\'()*+,;=:\/?#[\]@%]+ # All URI chars except "&" (normal*). 
     (?:     # Either on a "&" or at the end of URI. 
      (?!     # Allow a "&" char only if not start of an... 
      &(?:gt|\#0*62|\#x0*3e);     # HTML ">" entity, or 
      | &(?:amp|apos|quot|\#0*3[49]|\#x0*2[27]); # a [&\'"] entity if 
      [.!&\',:?;]?  # followed by optional punctuation then 
      (?:[^a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]|$) # a non-URI char or EOS. 
     ) &     # If neg-assertion true, match "&" (special). 
      [a-z0-9\-._~!$\'()*+,;=:\/?#[\]@%]* # More non-& URI chars (normal*). 
     )*      # Unroll-the-loop (special normal*)*. 
     [a-z0-9\-_~$()*+=\/#[\]@%] # Last char can\'t be [.!&\',;:?] 
    )      # End $14. Other non-delimited URL. 
    /imx'; 

Can anyone help me or point me in the right direction? Thanks a lot!


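OK, I think I understand all your explanations (thank you!). Is there any reason why only my first URL ends up wrapped in "a" tags? Here is the rest of the code: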

$url_replace = '$1$4$7$10$13<a>$2$5$8$11$14</a>$3$6$9$12'; 
return preg_replace($url_pattern, $url_replace, $text); 

If

$text = 'http://www.youtube.com/watch?v=Cy8duEIHEig http://www.youtube.com/watch?v=Cy8duEIHEig';

only the first URL is turned into a link. Does this have something to do with the *?
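A note on the symptom above: preg_replace replaces every match by default (its limit parameter defaults to -1), so the * is unlikely to be the cause. One plausible explanation is that group $13 consumes a character before the URL as "proof" that the URL is not already linked. After the first URL has matched, the character in front of the separating space already belongs to that match, so the prefix cannot match again for the second URL. The sketch below reproduces this with a much-simplified stand-in pattern. Both patterns and the variable names are hypothetical and are not a drop-in replacement for the LinkifyURL pattern; the lookbehind variant only illustrates a prefix check that consumes nothing.

```php
<?php
// Hypothetical, simplified stand-in for the $13/$14 part of the big pattern.
// Group 1 is a prefix that becomes part of the match (it is consumed).
$consuming = '/(^|[^=\s]\s+)(https?:\/\/\S+)/i';

// Variant: the prefix check is a lookbehind, which consumes no characters.
$lookbehind = '/(?<!\S)(https?:\/\/\S+)/i';

$text = 'http://www.youtube.com/watch?v=Cy8duEIHEig http://www.youtube.com/watch?v=Cy8duEIHEig';

// The second URL needs "non-space character + whitespace" in front of it,
// but that non-space character was already used up by the first match.
$withConsuming = preg_replace($consuming, '$1<a>$2</a>', $text);

// With the lookbehind, the space between the URLs is enough for both.
$withLookbehind = preg_replace($lookbehind, '<a>$1</a>', $text);

echo $withConsuming, "\n";
echo $withLookbehind, "\n";
```

With the consuming prefix only the first URL is wrapped; with the lookbehind both are. Note that the simplified lookbehind drops the "not preceded by =" check of the original pattern, so it would need adapting before being used on HTML that already contains links.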

Answers

0

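? does two things. First, it can make an expression optional, e.g.

ab?c 

where b is optional. Second, as in your case,

.*? 

it turns off the greediness of .* so that the engine finds the first and smallest match.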
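The two roles of ? can be checked side by side. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```php
<?php
// Role 1: after an ordinary token, "?" makes that token optional.
$withB = preg_match('/^ab?c$/', 'abc');    // 1: "b" is present
$withoutB = preg_match('/^ab?c$/', 'ac');  // 1: "b" is absent
$twoBs = preg_match('/^ab?c$/', 'abbc');   // 0: at most one "b" is allowed

// Role 2: after a quantifier, "?" makes that quantifier lazy.
preg_match('/a.*c/', 'abcabc', $greedy);   // $greedy[0] is 'abcabc'
preg_match('/a.*?c/', 'abcabc', $lazy);    // $lazy[0] is 'abc'

var_dump($withB, $withoutB, $twoBs, $greedy[0], $lazy[0]);
```

Which role applies depends only on what precedes the ?: a plain token gives "optional", a quantifier such as *, + or {n,m} gives "lazy".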

+0

What do you mean by smallest? – user111671

+0

It stops as soon as it finds whatever comes after the '?' in the pattern –

+0

@JoaoRaposo could you check the update? – user111671

0

The difference between $content_first and $content_second is explained well here: What do lazy and greedy mean in the context of regular expressions?

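The pattern for $content_first is greedy: after matching START, .* takes as many characters as possible, so the match runs up to the last AAA in the string rather than the first. The pattern for $content_second is lazy: PCRE takes as few characters as possible and stops at the first AAA it meets.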
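To see exactly what each variant matches, preg_match can capture the text on the question's own input. A small sketch; only the variable names $greedy and $lazy are new.

```php
<?php
$content = 'START FIRST AAA SECOND AAA';

// Greedy: (.*) grabs everything, then backtracks to the LAST "AAA".
preg_match('/START(.*)AAA/', $content, $greedy);

// Lazy: (.*?) grows one character at a time and stops at the FIRST "AAA".
preg_match('/START(.*?)AAA/', $content, $lazy);

var_dump($greedy[0]); // value: "START FIRST AAA SECOND AAA"
var_dump($greedy[1]); // value: " FIRST AAA SECOND "
var_dump($lazy[0]);   // value: "START FIRST AAA"
var_dump($lazy[1]);   // value: " FIRST "
```

Since preg_replace substitutes the whole match, the greedy pattern leaves only REPLACED_STRING, while the lazy one leaves " SECOND AAA" after it.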

+0

I think I get it now; could you check the update? – user111671