2013-05-21 100 views
2

I have a regex, (?<={% start %}).*?(?={% end %}), that matches everything between two custom tags. How do I match multiple (N) spaces in a regex?

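The problem is that if a tag contains extra spaces (for example, "{%     start %}") and I add a \s+? condition, the regex fails. The following pattern does not work: (?<={%\s+?start\s+?%}).*?(?={%\s+?end\s+?%}), and I get this error in PHP: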

preg_match_all(): Compilation failed: lookbehind assertion is not fixed length at offset 25 

The same regex works if I remove the lookahead/lookbehind: ({%\s+?(start|end)\s+%})

Please advise.

+0

Do you have some sample text? –

+1

Depending on the language, a lookbehind cannot be variable length. – Toto

+1

Here is the sample text: http://pastebin.com/AUX1hd2T I have also updated my question with the error message. I am using PHP. – MarkL

Answers

3

Description

Try this (permalink):

[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}] 

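This matches a start tag, the text that follows it, and the end tag, and it trims the surrounding whitespace from each value before placing it into its group.

Group 0 gets the entire matched string. The capture groups are:

  1. the start tag text
  2. the inner text
  3. the end tag text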
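
To make the group numbering concrete, here is a minimal sketch that runs the pattern above and reads only group 2. The sample string and the variable names ($pattern, $text, $innerTexts) are assumptions chosen for illustration, not part of the original answer.

```php
<?php
// Pattern copied from the answer; everything else here is illustrative.
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';

// Hypothetical input with a varying number of spaces inside the tags.
$text = '{%   start   %} first block {% end %}{% start %} second block {%    end %}';

preg_match_all($pattern, $text, $matches);

// $matches[0] = whole matches, [1] = start tag text,
// [2] = trimmed inner text, [3] = end tag text.
$innerTexts = $matches[2];
print_r($innerTexts);
```

If only the text between the tags is needed, reading $matches[2] avoids lookarounds entirely, so the fixed-length lookbehind restriction never comes into play.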


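Disclaimer

There may be edge cases where this regex fails, for example if your data contains tags nested inside other tags. If that is the case, a regex may not be the best tool for this task.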
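
As a rough illustration of the nesting caveat, the sketch below feeds the same pattern a hypothetical input with one block nested inside another. The input string and variable names are assumptions made for this example.

```php
<?php
// Pattern copied from the answer; the nested input below is hypothetical.
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';
$text = '{% start %} outer {% start %} inner {% end %} tail {% end %}';

preg_match_all($pattern, $text, $matches);

// The lazy (.*?) stops at the FIRST end tag, so the outer start tag is
// paired with the inner end tag and " tail {% end %}" is left unmatched.
print_r($matches[2]);
```

Because a regex of this kind does not count how many blocks are open, the pairing goes wrong as soon as blocks nest; a small parser that tracks nesting depth is usually the safer choice there.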

Summary

[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}] 
Char class [{] matches one of the following chars: { 
% Literal `%` 
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
1st Capturing group ([^}]*start[^}]*) 
Negated char class [^}] infinite to 0 times matches any char except: } 
start Literal `start` 
Negated char class [^}] infinite to 0 times matches any char except: } 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
% Literal `%` 
Char class [}] matches one of the following chars: } 
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
2nd Capturing group (.*?) 
. 0 to infinite times [lazy] Any character (except newline) 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
Char class [{] matches one of the following chars: { 
% Literal `%` 
\s infinite to 0 times Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
3rd Capturing group ([^}]*end[^}]*) 
Negated char class [^}] infinite to 0 times matches any char except: } 
end Literal `end` 
Negated char class [^}] infinite to 0 times matches any char except: } 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W) 
\s infinite to 0 times Whitespace [\t \r\n\f] 
% Literal `%` 
Char class [}] matches one of the following chars: } 

PHP example

With the sample text {% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}

<?php 
$sourcestring="your source string"; 
preg_match_all('/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i',$sourcestring,$matches); 
echo "<pre>".print_r($matches,true); 
?> 

$matches Array: 
(
    [0] => Array 
     (
      [0] => {% start %} this is a sample text 1 {% end %} 
      [1] => {% start %} this is a sample text 2 {% end %} 
     ) 

    [1] => Array 
     (
      [0] => start 
      [1] => start 
     ) 

    [2] => Array 
     (
      [0] => this is a sample text 1 
      [1] => this is a sample text 2 
     ) 

    [3] => Array 
     (
      [0] => end 
      [1] => end 
     ) 

) 
+0

Thanks, but I need the regex to match the text between the tags, not inside them. – MarkL
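
One possible way to make the whole match equal the text between the tags, while still allowing any amount of whitespace, is PCRE's \K escape in place of the lookbehind. This is a sketch of an alternative technique, not the pattern from the answer; the sample string and variable names are assumptions.

```php
<?php
// \K drops everything matched so far from the reported match, so the
// variable-length opening tag does not have to sit inside a lookbehind.
// Lookaheads may be variable length in PCRE, so the closing tag stays as-is.
// Flags: "i" = case-insensitive, "s" = let "." match newlines.
$pattern = '/\{%\s*start\s*%\}\K.*?(?=\{%\s*end\s*%\})/is';

// Hypothetical input: uneven spacing in the tags, one block spans lines.
$text = "{%    start %} text 1 {% end %}{% start   %}\ntext 2\n{%  end %}";

preg_match_all($pattern, $text, $matches);

// $matches[0] holds only the text between the tags, whitespace included.
$trimmed = array_map('trim', $matches[0]);
print_r($trimmed);
```

Unlike the capturing-group pattern, this leaves the surrounding whitespace in the match, hence the explicit trim step.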

+0

Edited the answer to address this. –

+0

Thanks, but PHP does not seem to accept it. You can test it here: http://www.pagecolumn.com/tool/pregtest.htm (using PCRE and PREG_MATCH_ALL). – MarkL