PHP preg_match_all: capturing every group

I want to use preg_match_all in PHP to capture each of the following in its own group:

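The word Chapter, Section or Page
The number (and the letter, if there is one) of that chapter, section or page. A space between them should be allowed for
The words "and", "or"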

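Keep in mind that I want to ignore all titles, and that the number of items in the string can vary. The regex should work on all of the following examples: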

Ch1 and Sect2b
Ch 4 x unwantedtitle and Sect 5y unwanted title and Sect6 z and Ch7 or Ch8

This is what I have managed to come up with so far:

$str = 'Ch 1 a unwantedtitle and Sect 2b unwanted title and Pg3'; 
preg_match_all ('/([a-z]+)(?=\d|\d\s)\s*(\d*)\s*(?<=\d|\d\s)([a-z]?).*?(and|or)?/i', $str, $matches); 

    Array 
    (
     [0] => Array 
      (
       [0] => Pg3 
      ) 

     [1] => Array 
      (
       [0] => Pg 
      ) 

     [2] => Array 
      (
       [0] => 3 
      ) 

     [3] => Array 
      (
       [0] => 
      ) 

     [4] => Array 
      (
       [0] => 
      ) 

    )

The expected result should be:

Array 
    (
     [0] => Array 
      (
       [0] => Ch 1 a and 
       [1] => Sect 2b and 
       [2] => Pg3 
      ) 

     [1] => Array 
      (
       [0] => Ch 
       [1] => Sect 
       [2] => Pg 
      ) 

     [2] => Array 
      (
       [0] => 1 
       [1] => 2 
       [2] => 3 
      ) 

     [3] => Array 
      (
       [0] => a 
       [1] => b 
       [2] => 
      ) 

     [4] => Array 
      (
       [0] => and 
       [1] => and 
       [2] => 
      ) 

    )

Source

2013-01-13 user1307016

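Not sure you really want to do this with _one_ regex. Using several looks better. – fge

@fge How could I use several regexes while still keeping everything in the correct order? If you have an example, it would be much appreciated. Thanks. – user1307016

Not in PHP, I barely know it... – fge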
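For readers wondering how the several-regex idea from the comments could keep the order intact, here is a minimal sketch. It is not taken from the thread: the function name parseReferences and both patterns are assumptions made for illustration, and it assumes each piece between two conjunctions holds at most one reference. The first regex splits on standalone "and"/"or" while keeping them, the second parses each piece.

```php
<?php
// Hypothetical helper (not from the thread): split first, then parse each piece.
function parseReferences(string $str): array
{
    // PREG_SPLIT_DELIM_CAPTURE keeps the captured "and"/"or" in the result,
    // so pieces sit at even indexes and conjunctions at odd indexes, in order.
    $parts = preg_split('/\s+(and|or)\b\s*/i', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $refs = [];
    for ($i = 0; $i < count($parts); $i += 2) {
        // Word, number, and an optional single letter that stands alone
        // (the \b stops it from grabbing the first letter of a title).
        if (!preg_match('/\b(Ch|Sect|Pg)\s*(\d+)\s*([a-z]\b)?/i', $parts[$i], $m)) {
            continue; // this piece contains no reference
        }
        $refs[] = [
            'word'   => $m[1],
            'number' => $m[2],
            'letter' => $m[3] ?? '',          // group 3 is absent when no letter matched
            'cond'   => $parts[$i + 1] ?? '', // conjunction that follows this piece, if any
        ];
    }
    return $refs;
}

print_r(parseReferences('Ch 1 a unwantedtitle and Sect 2b unwanted title and Pg3'));
```

Because the split preserves position, each reference stays paired with the conjunction that follows it, and the titles are simply never captured.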

This is the closest I could get:

$str = 'Ch 1 a unwantedtitle and Sect 2b unwanted title and Pg3'; 
preg_match_all ('/((Ch|Sect|Pg)\s?(\d+)\s?(\w?))(.*?(and|or))?/i', $str, $matches); 


Array 
(
    [0] => Array 
     (
      [0] => Ch 1 a unwantedtitle and 
      [1] => Sect 2b unwanted title and 
      [2] => Pg3 
     ) 

    [1] => Array 
     (
      [0] => Ch 1 a 
      [1] => Sect 2b 
      [2] => Pg3 
     ) 

    [2] => Array 
     (
      [0] => Ch 
      [1] => Sect 
      [2] => Pg 
     ) 

    [3] => Array 
     (
      [0] => 1 
      [1] => 2 
      [2] => 3 
     ) 

    [4] => Array 
     (
      [0] => a 
      [1] => b 
      [2] => 
     ) 

    [5] => Array 
     (
      [0] => unwantedtitle and 
      [1] => unwanted title and 
      [2] => 
     ) 

    [6] => Array 
     (
      [0] => and 
      [1] => and 
      [2] => 
     ) 

)
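A possible variation on the pattern above, offered as a sketch rather than as part of the original answer: turning the two wrapper groups into non-capturing groups (?:...) leaves exactly four capture groups, numbered as in the question's expected result. Note that $matches[0] always holds a contiguous piece of the input, so it will still include the unwanted title.

```php
<?php
$str = 'Ch 1 a unwantedtitle and Sect 2b unwanted title and Pg3';

// Same building blocks as above; only the grouping differs.
// (?:...) groups the title part without creating a capture group.
$pattern = '/(Ch|Sect|Pg)\s?(\d+)\s?(\w?)(?:.*?(and|or))?/i';
preg_match_all($pattern, $str, $matches);

// $matches[1] = words, [2] = numbers, [3] = letters, [4] = and/or
print_r($matches);
```

Like the original, this relies on \w? for the letter, so a title that directly follows the number would lose its first character to group 3.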

Source

2013-01-13 19:23:57 Westy92

This is how I would do it.

$arr = array(
    'Ch1 and Sect2b', 
    'Ch 1 a unwantedtitle and Sect 2b unwanted title and Pg3', 
    'Ch 4 x unwantedtitle and Sect 5y unwanted title and' . 
     ' Sect6 z and Ch7 or Ch8a', 
    'Assume this is ch1a and ch 2 or ch seCt 5c.' . 
     ' Then SECT or chA pg22a and pg 13 andor' 
); 

foreach ($arr as $a) { 
    var_dump($a); 
    preg_match_all(
    '~ 
     \b(?P<word>ch|sect|(pg)) 
     \s*(?P<number>\d+) 
     (?(2)\b| 
      \s* 
      (?P<letter>(?!(?<=\s)(?:and|or)\b)[a-z]+)? 
      \s* 
      (?:(?<=\s)(?P<cond>and|or)\b)? 
     ) 
    ~xi' 
    ,$a,$m); 
    foreach ($m as $k => $v) { 
     if (is_numeric($k) && $k !== 0) unset($m[$k]); 
     // this is for 'beautifying' the result array 
     // note that $m[0] will still return whole matches 
    } 
    print_r($m); 
}

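I had to put pg into its own capturing group because I needed to write an explicit condition for it: it can be followed by a number (with or without a space in between), but no letter may be appended, since I assume page indicators never carry a letter as in "pg23a".

That is why I chose to name each group and "beautify" the result with the inner foreach loop in the code. Otherwise, if you use numeric indexes instead of named ones, you need to skip $m[2] every time.

As an example, here is the output for the last item in $arr.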
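To isolate the conditional part of the pattern, here is a reduced sketch. The pattern is a simplification written for illustration, not the answer's full regex: group 1 is set only when the word is "pg", and (?(1)yes|no) then picks the branch that forbids or allows a trailing letter.

```php
<?php
// Group 1 participates only when "pg" matched.
// (?(1)\b|[a-z]?\b): if group 1 is set, require a word boundary right after
// the digits; otherwise allow one optional letter before the boundary.
$pattern = '~\b(?:ch|sect|(pg))\s*\d+(?(1)\b|[a-z]?\b)~i';

$results = [];
foreach (['pg23', 'pg 23', 'pg23a', 'ch23a'] as $input) {
    $results[$input] = preg_match($pattern, $input, $m) ? $m[0] : null;
}
print_r($results);
```

Here "pg23a" is rejected entirely, while "ch23a" is accepted with its letter, which is the behaviour the full pattern applies to page indicators.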

Array 
(
    [0] => Array 
     (
      [0] => ch1a and 
      [1] => ch 2 or 
      [2] => seCt 5c 
      [3] => pg 13 
     ) 

    [word] => Array 
     (
      [0] => ch 
      [1] => ch 
      [2] => seCt 
      [3] => pg 
     ) 

    [number] => Array 
     (
      [0] => 1 
      [1] => 2 
      [2] => 5 
      [3] => 13 
     ) 

    [letter] => Array 
     (
      [0] => a 
      [1] => 
      [2] => c 
      [3] => 
     ) 

    [cond] => Array 
     (
      [0] => and 
      [1] => or 
      [2] => 
      [3] => 
     ) 

)
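If the whole matches in $m[0] are not needed either, the cleanup loop could be replaced by a single array_filter call. This is a sketch on a shortened, hypothetical pattern and input, not the answer's code: ARRAY_FILTER_USE_KEY passes each key to is_string, so only the named groups survive.

```php
<?php
$str = 'ch1a and ch 2 or seCt 5c';

// Shortened pattern for illustration: word, number, optional standalone letter.
preg_match_all(
    '~\b(?P<word>ch|sect)\s*(?P<number>\d+)(?P<letter>[a-z]\b)?~i',
    $str,
    $m
);

// Keep only string keys; every integer key, including 0, is dropped.
$named = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
print_r($named);
```

The trade-off is that index 0 disappears as well, so keep the foreach version when the full match text is still required.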

Source

2013-01-14 00:07:26 inhan
