Why do I get different results from the preg_replace() function?

I have a problem with PHP's preg_replace() function. Why does it return different results depending on how I call it?

$string="[y-z]y-z_y[y_z]yav[v_v]"; // i want it to become : [y-z]yellow-zend_yellow[y_z]yav[v_v] 

$find = array('/y(?=(?:.(?!\]))*\[)/Um', '/a(?=(?:.(?!\]))*\[)/Um', '/z(?=(?:.(?!\]))*\[)/Um', '/v(?=(?:.(?!\]))*\[)/Um'); 

$replace = array('yellow', 'avocado', 'zend', 'vodka'); 

echo preg_replace($find, $replace, $string)."<br><br>"; // display [y-zend]yellow-zend_yellow[y_zend]yellowavodkaocadovodka[v_v] 

echo preg_replace('/y(?=(?:.(?!\]))*\[)/Um', 'yellow', $string)."<br><br>"; // display [y-z]yellow-z_yellow[y_z]yellowav[v_v] 

echo preg_replace('/z(?=(?:.(?!\]))*\[)/Um', 'zend', $string)."<br><br>"; // display [y-zend]y-zend_y[y_zend]yav[v_v] --Why displaying zend inside[]?

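Also, I would like to know whether there is a simple way to do this in PHP with one additional condition: if there is a string such as "yav" between "]" and "[", I want it to be left alone.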

**[y-z]y-z_y[y_z]yav[v_v] ==> [y-z]yellow-zend_yellow[y_z]yav[v_v]**

$var=[y-z]y-z[y_z]yav[v_v]; ==> $var=[y-z]yellow-zend[y_z]yav[v_v];

Source

2015-07-22 Mouad

How do you tell 'y-' from ']ya' here? I just changed your life, lol, click me > https://regex101.com/ – ArtisticPhoenix

Do str_replace('y-z_y', 'yellow-zend_yellow', '[y-z]y-z_y[y_z]yav[v_v]'); – ArtisticPhoenix

@ArtisticPhoenix No, that does not work if the var is: [y-z]y-z[y_z]yav[v_v] ==> [y-z]yellow-zend[y_z]yav[v_v]. I need something that handles all cases; it works on a large amount of data in memory and replaces each word with a specific string – Mouad

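The last z] matches because you are telling it to use a positive lookahead to match a negative lookahead, which is basically a contradictory statement.

You are telling it to match z if the lookahead matches, while not matching what you don't want, so it matches what you don't want and says it is OK to match. Anyway, that makes sense in my mind.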

https://regex101.com/r/nX5dQ6/1

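You could quantify your rule to match sequences of several characters. It is certainly easier to replace y-z_y with yellow-zend_yellow, but without context it is impossible to say whether that is an option.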
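On the literal-replacement idea, one more thing is worth checking: when preg_replace() is given arrays, each pattern is applied in turn to the result of the previous one, so text inserted by an earlier replacement can be matched again by a later pattern. That would explain the `yellowavodkaocadovodka` fragment in the question's first output. A minimal sketch, assuming simplified patterns without the lookahead (my simplification, not the asker's code) to isolate the ordering effect:

```php
<?php
// Simplified patterns (no lookahead) so only the ordering effect is visible.
$patterns     = ['/y/', '/a/', '/z/', '/v/'];
$replacements = ['yellow', 'avocado', 'zend', 'vodka'];

// Array form: pattern N runs on the output of pattern N-1,
// so the "v" inside the freshly inserted "avocado" is replaced again.
$sequential = preg_replace($patterns, $replacements, 'yav');

// strtr() with an array scans the input once and does not rescan inserted text.
$singlePass = strtr('yav', ['y' => 'yellow', 'a' => 'avocado', 'z' => 'zend', 'v' => 'vodka']);

echo $sequential, "\n"; // yellowavodkaocadovodka
echo $singlePass, "\n"; // yellowavocadovodka
```

strtr() knows nothing about brackets, so it only illustrates the single-pass behaviour; it is not a fix for the bracket condition.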

/z(?=(?:.(?!\]))*\[)/Um 
    z matches the character z literally (case sensitive) 
    (?=(?:.(?!\]))*\[) Positive Lookahead - Assert that the regex below can be matched 
     (?:.(?!\]))* Non-capturing group 
      Quantifier: * Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
      . matches any character (except newline) 
      (?!\]) Negative Lookahead - Assert that it is impossible to match the regex below 
       \] matches the character ] literally 
     \[ matches the character [ literally 
    U modifier: Ungreedy. The match becomes lazy by default. Now a ? following a quantifier makes it greedy 
    m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
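To make the breakdown above concrete: in `(?:.(?!\]))*` the dot consumes a character first and the negative lookahead then inspects the character after it. When z sits directly in front of `]`, the dot simply consumes the `]` itself and nothing stops the scan before the next `[`. A minimal sketch, assuming the intended rule is "no `]` between the letter and the next `[`"; the reordered pattern is my suggestion, not something from the question:

```php
<?php
$string = '[y-z]y-z_y[y_z]yav[v_v]';

// Original: consume a character, then check that the NEXT one is not "]".
$original = '/z(?=(?:.(?!\]))*\[)/Um';

// Assumed fix: check that the current character is not "]" BEFORE consuming it.
$fixed = '/z(?=(?:(?!\]).)*\[)/Um';

$withOriginal = preg_replace($original, 'zend', $string);
$withFixed    = preg_replace($fixed, 'zend', $string);

echo $withOriginal, "\n"; // [y-zend]y-zend_y[y_zend]yav[v_v]
echo $withFixed, "\n";    // [y-z]y-zend_y[y_z]yav[v_v]
```

Both patterns still require a later `[` somewhere in the string, so a z after the final `]` would not be replaced by either of them.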

Personally, I would probably build a tokenizer. The idea is to use preg_match_all instead, like this:

$matches = null; 
$returnValue = preg_match_all('/(?P<T_OPEN>\[)|(?P<T_CLOSE>\])|(?P<T_Y>y)|(?P<T_X>x)|(?P<T_Z>z)|(?P<T_SEPH>\-)|(?P<T_SEPU>\_)/', '[y-z]y-z_y[y_z]yav[v_v]', $matches, PREG_PATTERN_ORDER);

array (
    0 => 
     array (
       0 => '[', 
       1 => 'y', 
       2 => '-', 
       3 => 'z', 
       4 => ']', 
      ... 
     ), 
    'T_OPEN' => 
     array (
      0 => '[', 
      1 => '', 
      2 => '', 
      3 => '', 
      4 => '', 
    ..

With some post-processing, this can be reduced to a list of tokens:

array('T_OPEN', 'T_Y', 'T_SEPH', 'T_Z', 'T_CLOSE', ...);

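These are the named capture groups. From there it is fairly trivial to write some logic that determines whether you are inside [] or not, or whether a T_Y, T_X or T_Z token is preceded by another T_Y, T_X or T_Z token. This is the most reliable approach.

To boil it down to just the tokens, loop over the matches in $matches[0] with a for loop and check which of the named groups has a value at that index (untested, but this is the basis of it):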

$total = count($matches[0]); // number of matches ($matches[0][0] is just the first matched string) 
// keep only the string keys: these are the named groups, i.e. our tokens 
$tokens = array_filter(array_keys($matches), 'is_string'); 

$tokenstream = array(); 
for ($i = 0; $i < $total; $i++) { 
    $found = 'T_UNKNOWN'; // default token for validation, used when no named group matched 
    // loop through the tokens and check $matches[$token][$i] for length 
    foreach ($tokens as $token) { 
        if (strlen($matches[$token][$i]) > 0) { 
            $found = $token; // this named group produced match number $i 
            break; 
        } 
    } 
    $tokenstream[] = $found; 
}

Then you build your string from scratch using the tokens:

$out = ''; 
$literal = false; // true while we are inside [...] 

foreach ($tokenstream as $offset => $token) { 
    switch ($token) { 
        case 'T_OPEN': 
            $out .= '['; 
            $literal = true; // start brackets 
            break; 
        case 'T_CLOSE': 
            $out .= ']'; 
            $literal = false; // end brackets 
            break; 
        case 'T_SEPH': 
            $out .= '-'; 
            break; 
        case 'T_SEPU': 
            $out .= '_'; 
            break; 
        case 'T_Y': 
            // inside brackets keep the literal y, else use the word yellow 
            $out .= $literal ? 'y' : 'yellow'; 
            break; 
        case 'T_Z': 
            // same rule for z; T_X would follow the same pattern 
            $out .= $literal ? 'z' : 'zend'; 
            break; 
        case 'T_UNKNOWN': 
            // validate 
            throw new Exception("Error unknown token at offset: $offset"); 
    } 
}

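You would still need to figure out a T_Z followed by a T_A, and so on, but this would be a sure-fire way to do it and it avoids all the mess above. Also, this is only a very rough outline of how to think about a problem like this.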
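For comparison, the same idea can be squeezed into a single pass with preg_replace_callback(). This is only a sketch under assumptions that go beyond the question: words are runs of ASCII letters, brackets are never nested, and a word is replaced only when the whole run is a key in the map (which is what leaves "yav" alone). The function name `replaceOutsideBrackets` and the `$map` array are made up for illustration.

```php
<?php
// Hypothetical helper: replace whole words outside [...] using a lookup map.
function replaceOutsideBrackets(string $input, array $map): string
{
    // Alternative 1 matches a complete bracketed group,
    // alternative 2 matches a run of letters outside brackets.
    return preg_replace_callback(
        '/\[[^\]]*\]|[a-z]+/i',
        function (array $m) use ($map): string {
            $token = $m[0];
            if ($token[0] === '[') {
                return $token; // bracketed group: keep verbatim
            }
            // Replace only when the whole word is a key, so "yav" stays as it is.
            return $map[$token] ?? $token;
        },
        $input
    );
}

$map = ['y' => 'yellow', 'a' => 'avocado', 'z' => 'zend', 'v' => 'vodka'];

echo replaceOutsideBrackets('[y-z]y-z_y[y_z]yav[v_v]', $map), "\n";
// [y-z]yellow-zend_yellow[y_z]yav[v_v]
echo replaceOutsideBrackets('[y-z]y-z[y_z]yav[v_v]', $map), "\n";
// [y-z]yellow-zend[y_z]yav[v_v]
```

Because the input is scanned once, replacement text is never matched again, and no token array has to be kept in memory.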

Source

2015-07-22 02:26:21 ArtisticPhoenix

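No sir, what if I have: $var=[y-z]y-z[y_z]yav[v_v]; ==> $var=[y-z]yellow-zend[y_z]yav[v_v]; And what about the word yav in the string? – Mouad

How should I know? It is your problem. What are some sample inputs? One input is not enough to solve this. There are many ways to do it; you could use preg_split on the [y-z] groups, etc. – ArtisticPhoenix

Thanks, this is very close to my old solution with arrays, loops and a switch, but it needs a lot of memory for variables. That is why I thought a simple replace function would reduce memory usage somewhat, but it seems that is not possible. By the way, I need this for some HTML encoding: I load a lot of HTML source into memory to work on. Taking encoding as an example: <%70>&#%78%37%30; I need to replace words, and words between words, with different values. Thanks for your input, I really appreciate it ^^' – Mouad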
