2014-03-06 25 views

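PHP preg_replace_callback chokes on "a:" values

We have built a platform that lets users add special <#tags#> inside HTML input attributes. I use preg_replace_callback to find every matching input in the body string, process each one, and return the modified string for the whole form with all the updated input elements.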

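I have narrowed the problem down to the last attribute value starting with letters followed by a colon. That is what breaks the regex and makes it fail with a "PREG_BACKTRACK_LIMIT_ERROR".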

<input onclick="javascript:blah();"> 

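is the only case that breaks it. I have told the developers they should just use onclick="blah()", but the javascript: prefix has been around for a long time and browsers support it, so they still want it to work.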

<input onclick=":blah();"> 

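does not break it. This makes me think some kind of internal storage uses "key:value" pairs to hold references or something similar, and that the data being parsed is itself breaking that scheme.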

One really strange thing is that the code produces different results on Google App Engine PHP than on PHP 5.3.3 running on CentOS... the local PHP throws the error in more cases.
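When two environments disagree like this, one thing worth comparing is the PCRE configuration each one runs with. The sketch below assumes the difference comes from settings or library versions rather than from the pattern itself; it prints the PHP version, the bundled PCRE version, and the two PCRE limits. `pcre.backtrack_limit` and `pcre.recursion_limit` are standard php.ini settings (the default backtrack limit was raised from 100000 to 1000000 in PHP 5.3.7, so a 5.3.3 install may give up sooner). The helper name `pcre_environment()` is made up for this example.

```php
<?php
// Hypothetical helper (not part of PHP): collects the settings that
// influence when preg_* functions stop with a limit error.
function pcre_environment() {
    return array(
        'php_version'     => PHP_VERSION,
        'pcre_version'    => PCRE_VERSION,
        'backtrack_limit' => ini_get('pcre.backtrack_limit'),
        'recursion_limit' => ini_get('pcre.recursion_limit'),
    );
}

// Print one "name: value" line per setting so two servers can be compared.
foreach (pcre_environment() as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

Running this on both servers and comparing the output side by side should show whether the limits differ before any time is spent on the pattern.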

Here is the test code and the test results:

<?php 

process_string("<input type=\"button\" value=\"update google doc\" onclick=\"javascript:getgoogledoc();\">"); 
process_string("<input type=\"button\" value=\"update google doc\" onclick=\":getgoogledoc();\">"); 
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"getgoogledoc();\">"); 
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"getgoogledoc();\" newattribute=\"javascript:test();\">"); 
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"a:getgoogledoc();\">"); 
process_string("<input type=\"a:button\" value=\"javascript:update google doc\">"); 
process_string("<input type=\"button\" value=\"javascript:update google doc\" <# this makes it match #> onclick=\"javascript:getgoogledoc();\">"); 
process_string("<input type=\"button\" value=\"javascript:update google doc\" <# this makes it match #> onclick=\"getgoogledoc();\">"); 

function process_string($string) { 
    echo "<p><b>NEW TEST</b><br />initial string:<br />"; 
    echo htmlspecialchars($string); 
    $string = preg_replace_callback(
     '/<\s*input\s+((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)<#\s*(.*?)\s*#>((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)(\/\s*|)>/is', 
     function($matches) { 
      echo "<br />matched something..."; 
      return $matches[0]; 
     }, 
     $string 
    ); 
    echo "<br />ok... ran the regex replace callback... string is now:<br />"; 
    echo htmlspecialchars($string); 
    $last_error = preg_last_error(); 
    echo "<br />the last regex error was: $last_error"; 
    if($last_error==PREG_NO_ERROR) { 
     echo "<br />that is a PREG_NO_ERROR"; 
    } 
    if($last_error==PREG_INTERNAL_ERROR) { 
     echo "<br />that is a PREG_INTERNAL_ERROR"; 
    } 
    if($last_error==PREG_BACKTRACK_LIMIT_ERROR) { 
     echo "<br />that is a PREG_BACKTRACK_LIMIT_ERROR"; 
    } 
    if($last_error==PREG_RECURSION_LIMIT_ERROR) { 
     echo "<br />that is a PREG_RECURSION_LIMIT_ERROR"; 
    } 
    if($last_error==PREG_BAD_UTF8_ERROR) { 
     echo "<br />that is a PREG_BAD_UTF8_ERROR"; 
    } 
    if($last_error==PREG_BAD_UTF8_OFFSET_ERROR) { 
     echo "<br />that is a PREG_BAD_UTF8_OFFSET_ERROR"; 
    } 
} 

?> 
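The chain of `if` statements in the test code can be condensed into a lookup table. This is a sketch that uses only the error constants already listed above; `preg_error_name()` is a hypothetical helper, not a PHP built-in.

```php
<?php
// Hypothetical helper: maps a preg_last_error() code to its constant name.
function preg_error_name($code) {
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    );
    // Fall back to the raw number for codes added in later PHP versions.
    return isset($names[$code]) ? $names[$code] : "unknown error ($code)";
}

// Usage: report the status of the most recent preg_* call.
preg_match('/a/', 'a');
echo preg_error_name(preg_last_error()), "\n";
```

On PHP 8.0 and later, the built-in `preg_last_error_msg()` returns a readable message directly, so a helper like this is only needed on older versions.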

Results:

NEW TEST 
initial string: 
<input type="button" value="update google doc" onclick="javascript:getgoogledoc();"> 
ok... ran the regex replace callback... string is now: 

the last regex error was: 2 
that is a PREG_BACKTRACK_LIMIT_ERROR 

NEW TEST 
initial string: 
<input type="button" value="update google doc" onclick=":getgoogledoc();"> 
ok... ran the regex replace callback... string is now: 
<input type="button" value="update google doc" onclick=":getgoogledoc();"> 
the last regex error was: 0 
that is a PREG_NO_ERROR 

NEW TEST 
initial string: 
<input type="button" value="update google doc" onclick="getgoogledoc();"> 
ok... ran the regex replace callback... string is now: 
<input type="button" value="update google doc" onclick="getgoogledoc();"> 
the last regex error was: 0 
that is a PREG_NO_ERROR 

NEW TEST 
initial string: 
<input type="button" value="update google doc" onclick="getgoogledoc();" newattribute="javascript:test();"> 
ok... ran the regex replace callback... string is now: 

the last regex error was: 2 
that is a PREG_BACKTRACK_LIMIT_ERROR 

NEW TEST 
initial string: 
<input type="button" value="update google doc" onclick="a:getgoogledoc();"> 
ok... ran the regex replace callback... string is now: 

the last regex error was: 2 
that is a PREG_BACKTRACK_LIMIT_ERROR 

NEW TEST 
initial string: 
<input type="a:button" value="javascript:update google doc"> 
ok... ran the regex replace callback... string is now: 
<input type="a:button" value="javascript:update google doc"> 
the last regex error was: 0 
that is a PREG_NO_ERROR 

NEW TEST 
initial string: 
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="javascript:getgoogledoc();"> 
matched something... 
ok... ran the regex replace callback... string is now: 
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="javascript:getgoogledoc();"> 
the last regex error was: 0 
that is a PREG_NO_ERROR 

NEW TEST 
initial string: 
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="getgoogledoc();"> 
matched something... 
ok... ran the regex replace callback... string is now: 
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="getgoogledoc();"> 
the last regex error was: 0 
that is a PREG_NO_ERROR 

Answer


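PREG_BACKTRACK_LIMIT_ERROR is caused by excessive backtracking, which can be avoided with possessive quantifiers.
Give this modified regex a try (note that I added a + quantifier at the position marked by ^, right after the * that closes the first repeated attribute group, just before \s*)<#):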

'/<\s*input\s+((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*+\s*)<#\s*(.*?)\s*#>((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)(\/\s*|)>/is' 
                                                                                                       ^
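To see the effect in isolation, here is a minimal sketch on a deliberately simplified pattern, not the full attribute regex above. The nested quantifier `(\s*\w+)*` can split a run of word characters in exponentially many ways, which has the same shape as the `\s*(\w+)` alternative inside the repeated attribute group. The function names and the test subject are invented for illustration, and the backtrack limit is set explicitly so the run does not depend on php.ini.

```php
<?php
// Pin the limit to the pre-5.3.7 default so the demo is reproducible.
ini_set('pcre.backtrack_limit', '100000');

// Backtracking version: the group may be re-split on every failure.
function match_greedy($subject) {
    $result = preg_match('/^(\s*\w+)*$/', $subject);
    return array($result, preg_last_error());
}

// Possessive version: "*+" never gives back what the group has matched.
function match_possessive($subject) {
    $result = preg_match('/^(\s*\w+)*+$/', $subject);
    return array($result, preg_last_error());
}

// 25 word characters followed by one character that can never match.
$subject = str_repeat('a', 25) . '!';

list($greedy, $greedyError) = match_greedy($subject);
list($possessive, $possessiveError) = match_possessive($subject);

echo 'greedy:     result=', var_export($greedy, true), ', error=', $greedyError, "\n";
echo 'possessive: result=', var_export($possessive, true), ', error=', $possessiveError, "\n";
```

The possessive call fails fast: it returns 0 (no match) with error code 0. The greedy call is expected to give up with PREG_BACKTRACK_LIMIT_ERROR instead, though the exact behaviour can vary with the PCRE version. Since the amount of backtracking grows with the number and length of attributes, the "a:" prefix may simply be what pushed the original input over the limit rather than the colon itself. That is a guess, but it fits the sixth test case, which contains colons, has only two attributes, and passes. Because a possessive group never gives characters back, the rewritten pattern should be re-tested against inputs that previously matched.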

Thank you very much... it works, and I haven't found any regressions yet.