2012-04-08 74 views
0

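preg_match_all is behaving strangely

I have some text wrapped in [quote][/quote], and I'm trying to match all the text before those tags, everything between them, and everything after them. The catch is that the tags can occur multiple times, but never nested inside each other.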

The reason I'm doing this is that I want to run a filter on all the text outside these tags, whether or not there are multiple occurrences.

This is what I started working with:

preg_match_all("/(^.*)\[quote\](.*?)\[\/quote\](.*)/si", $reply['msg'], $getthequotes); 

Here is the output:

Array 
(
[0] => Array 
    (
     [0] => putting some stuff before the quote 
[quote][b]Logan said[/b][br]testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

yep 

http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 

adding a quote 

[quote][b]Logan said[/b][br]This is the start of the second quote http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

[i]04/07/12 20:18:07: Edited by Logan(2)[/i] 
    ) 

[1] => Array 
    (
     [0] => putting some stuff before the quote 

[quote][b]Logan said[/b][br]testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

yep 

http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 

adding a quote 


    ) 

[2] => Array 
    (
     [0] => [b]Logan said[/b][br]This is the start of the second quote http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i] 
    ) 

[3] => Array 
    (
     [0] => 

[i]04/07/12 20:18:07: Edited by Logan(2)[/i] 
    ) 

) 

As you can see, it doesn't produce the desired output. Any help would be appreciated.

+0

Ah... a markup language that isn't HTML - surely regular expressions will finally be the right tool? – 2012-04-08 00:37:39

+0

I have custom BBCode-like tags that get parsed into HTML. All of the regex parsing is done in PHP. – 2012-04-08 00:42:53

+1

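Sorry, I was being a bit sarcastic, in the spirit of this [very popular fallacy](http://stackoverflow.com/a/1732454/596781). The answer is, *don't* use regular expressions, because they're not the right tool. – 2012-04-08 00:45:32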

Answers

1

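I haven't tried this, but if you only want the stuff before [quote] and after [/quote], you can do a strpos for the first occurrence of the opening quote tag. Now you know that everything before that position is unquoted.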

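Next, call strpos again, starting from the index of the opening quote tag you just matched, to find the closing quote tag. You can discard everything between the two.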

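Now do another strpos for the next quote block, starting from the position of the closing quote tag you just found. Repeat this until you reach the end of the string.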
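The scan described above might look roughly like the following. This is only a sketch under a few assumptions: the helper name `textOutsideQuotes` is made up for illustration, tags are matched case-insensitively with `stripos` (mirroring the `i` flag in the question's pattern), quotes are not nested, and an unclosed `[quote]` is treated as quoted up to the end of the string.

```php
<?php
// Hypothetical helper (not from the original post): returns the chunks of
// text that lie outside [quote]...[/quote] blocks, in order of appearance.
function textOutsideQuotes(string $text): array
{
    $open = '[quote]';
    $close = '[/quote]';
    $outside = [];
    $offset = 0;

    // Find each opening tag, starting from where the previous quote ended.
    while (($start = stripos($text, $open, $offset)) !== false) {
        // Everything between the current offset and the opening tag is unquoted.
        $outside[] = substr($text, $offset, $start - $offset);

        // Look for the matching closing tag after the opening tag.
        $end = stripos($text, $close, $start + strlen($open));
        if ($end === false) {
            // Assumption: an unclosed quote swallows the rest of the string.
            return $outside;
        }

        // Skip the quoted block and continue scanning after it.
        $offset = $end + strlen($close);
    }

    // Whatever follows the last closing tag is unquoted as well.
    $outside[] = substr($text, $offset);

    return $outside;
}

print_r(textOutsideQuotes('before [quote]one[/quote] middle [quote]two[/quote] after'));
```

For the sample string, the returned array holds `'before '`, `' middle '` and `' after'`.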

+0

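Also, if you want nesting, search for the first '[/quote]' first, then search *backwards* from there for the opening '[quote]' - this gives you the innermost quote. Format it as needed, then rinse and repeat. – mpen 2012-04-08 02:43:23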
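A rough sketch of that inside-out approach, assuming case-insensitive tags and assuming (purely for illustration) that each quote is rendered as a `<blockquote>` element. The function names `findInnermostQuote` and `renderQuotes` are hypothetical and not part of the thread.

```php
<?php
// Hypothetical helper: locate the innermost [quote]...[/quote] pair.
// Returns [start, length] of the whole block, or null if no complete pair exists.
function findInnermostQuote(string $text): ?array
{
    $open = '[quote]';
    $close = '[/quote]';

    // The first closing tag always belongs to an innermost quote.
    $end = stripos($text, $close);
    if ($end === false) {
        return null;
    }

    // Search backwards: the last opening tag that starts before that closing tag.
    $start = strripos(substr($text, 0, $end), $open);
    if ($start === false) {
        return null;
    }

    return [$start, $end + strlen($close) - $start];
}

// Hypothetical renderer: replaces quotes from the inside out.
// The <blockquote> markup is an assumption made for this example.
function renderQuotes(string $text): string
{
    $openLen = strlen('[quote]');
    $closeLen = strlen('[/quote]');

    while (($span = findInnermostQuote($text)) !== null) {
        [$start, $length] = $span;
        // Text between the two tags of the innermost pair.
        $inner = substr($text, $start + $openLen, $length - $openLen - $closeLen);
        $text = substr_replace($text, '<blockquote>' . $inner . '</blockquote>', $start, $length);
    }

    return $text;
}

echo renderQuotes('a [quote]b [quote]c[/quote] d[/quote] e');
```

Each pass removes one complete tag pair, so the loop ends once no pair is left; stray unmatched tags are simply left in place.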

+0

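I need all of it. I just need to do extra processing on the non-quoted text. I guess I can do that by saving each part in its own var and then concatenating all the parts back together... kind of ass-backwards, but I suppose it will work. – 2012-04-08 03:06:03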

+0

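Yes, it should work if you concatenate the parts back together. Sorry about that. Yes, it's a naive algorithm, but it shouldn't be too slow for your purposes. Actually, I think I got the idea from the Udacity 101 class, where they use a similar approach to pull links out of an HTML page. – Gohn67 2012-04-08 03:26:22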
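The "process each part, then concatenate" idea can also be sketched with `preg_split` and the `PREG_SPLIT_DELIM_CAPTURE` flag, which is a different technique from the strpos scan discussed in this thread. The helper name `filterOutsideQuotes` and the use of `strtoupper` as the filter are assumptions for illustration, and the sketch assumes quotes are not nested.

```php
<?php
// Hypothetical helper: applies $filter only to the text outside
// [quote]...[/quote] blocks and leaves the quoted blocks untouched.
function filterOutsideQuotes(string $text, callable $filter): string
{
    // Because the whole quote block is a capture group, PREG_SPLIT_DELIM_CAPTURE
    // returns it alongside the surrounding text: even indexes hold outside text,
    // odd indexes hold complete quote blocks.
    $parts = preg_split('/(\[quote\].*?\[\/quote\])/si', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // preg_split failed; return the input unchanged.
        return $text;
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = $filter($part);
        }
    }

    // Concatenate the pieces back together in their original order.
    return implode('', $parts);
}

echo filterOutsideQuotes('see [quote]keep this[/quote] and that', 'strtoupper');
```

Empty pieces are deliberately kept (no `PREG_SPLIT_NO_EMPTY`), so the even/odd alternation still holds when the text starts or ends with a quote.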

0

It can be done, but you need to make several passes over the string.

$string = 'putting some stuff before the quote 
[quote][b]Logan said[/b][br]testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

yep 

http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 

adding a quote 

[quote][b]Logan said[/b][br]This is the start of the second quote http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

[i]04/07/12 20:18:07: Edited by Logan(2)[/i]putting some stuff before the quote 

[quote][b]Logan said[/b][br]testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA[br][br]did it work?[br][br][i]04/04/12 23:48:46: Edited by Logan(2)[/i][br][br][i]04/04/12 23:55:44: Edited by Logan(2)[/i][/quote] 

yep 

http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 

adding a quote'; 

// Collapse every run of whitespace into a single space
$string = preg_replace('%\s+%', ' ', $string);
// Break the string at every opening bracket
$pieces = preg_split('%\[%', $string);
// Discard the pieces that are nothing but a tag, e.g. "quote]" or "/b]"
foreach ($pieces as $key => $value):
    if (substr(trim($value), -1) === ']'):
        unset($pieces[$key]);
    endif;
endforeach;
// Finally, strip the leading tag fragment (e.g. "b]") from the remaining pieces
foreach ($pieces as $key => $value):
    $pieces[$key] = preg_replace('%.*]%', '', $value);
endforeach;
// Print only after the tag fragments have been stripped
print_r($pieces);

The result is an array that looks like this:

Array 
(
    [0] => putting some stuff before the quote 
    [2] => Logan said 
    [4] => testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 
    [6] => did it work? 
    [9] => 04/04/12 23:48:46: Edited by Logan(2) 
    [13] => 04/04/12 23:55:44: Edited by Logan(2) 
    [15] => yep http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA adding a quote 
    [17] => Logan said 
    [19] => This is the start of the second quote http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 
    [21] => did it work? 
    [24] => 04/04/12 23:48:46: Edited by Logan(2) 
    [28] => 04/04/12 23:55:44: Edited by Logan(2) 
    [31] => 04/07/12 20:18:07: Edited by Logan(2) 
    [32] => putting some stuff before the quote 
    [34] => Logan said 
    [36] => testing this youtube link http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA 
    [38] => did it work? 
    [41] => 04/04/12 23:48:46: Edited by Logan(2) 
    [45] => 04/04/12 23:55:44: Edited by Logan(2) 
    [47] => yep http://www.youtube.com/watch?v=8UVNT4wvIGY&feature=g-music&context=G2db8219YMAAAAAAAAAA adding a quote 
)