2012-03-13

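Regex with an optional class attribute

I'm trying to capture the attributes of a <pre> tag along with an optional class attribute. I'd like to capture the value of the class attribute in a single regex, rather than capturing all the attributes and then searching them for a class value afterwards. Since the class attribute is optional, I tried adding a ?, but then the regex below matches using only the last capture group: the class isn't captured, and neither are the attributes before it.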

// Works, but class isn't optional 
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])(.*?)\3)([^\>]*)>' 

// Fails to match class, the whole set of attributes are matched by last group 
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])?(.*?)\3)([^\>]*)>' 

e.g. <pre style="..." class="some-class" title="stuff"> 
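A minimal sketch of the failure described above. It assumes the `?` is applied to the whole class group, since that is the variant that produces the described behaviour. The `#` delimiters and the variable names are additions for illustration. Because the first group is lazy and the class group is now allowed to match nothing, the engine succeeds on its first attempt with both groups empty and never has a reason to backtrack.

```php
<?php
$tag = '<pre style="..." class="some-class" title="stuff">';

// Pattern from the question: the class group is required.
$required = '#(?<!\$)<pre([^>]*?)(\bclass\s*=\s*(["\'])(.*?)\3)([^>]*)>#';
// Assumed variant: the whole class group made optional with "?".
$optional = '#(?<!\$)<pre([^>]*?)(\bclass\s*=\s*(["\'])(.*?)\3)?([^>]*)>#';

preg_match($required, $tag, $a);
preg_match($optional, $tag, $b);

// $a[4] holds the class value. $b[4] is empty: the lazy first group and the
// optional class group both match nothing, so group 5 takes every attribute.
var_dump($a[4], $b[4], $b[5]);
```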

Edit:

I ended up using this:

$wp_content = preg_replace_callback('#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi', 'CrayonWP::pre_tag', $wp_content); 

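It allows whitespace inside the tags, captures whatever comes before and after the class attribute separately, and also captures the full set of attributes.

The callback then puts everything in place: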

public static function pre_tag($matches) { 
    $pre_class = $matches[1]; 
    $quotes = $matches[2]; 
    $class = $matches[3]; 
    $post_class = $matches[4]; 
    $atts = $matches[5]; 
    $content = $matches[6]; 
    if (!empty($class)) { 
        // Allow hyphenated "setting-value" style settings in the class attribute 
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1='.$quotes.'$2'.$quotes, $class); 
        return "[crayon $pre_class $class $post_class] $content [/crayon]"; 
    } else { 
        return "[crayon $atts] $content [/crayon]"; 
    } 
} 
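To try the pattern and callback outside WordPress, here is a rough standalone sketch. `pre_tag_demo` is a hypothetical free function standing in for `CrayonWP::pre_tag`, and the two input strings are made up for illustration. The regex and the rewriting logic are copied from the code above.

```php
<?php
// Hypothetical stand-in for CrayonWP::pre_tag, same logic as above.
function pre_tag_demo(array $m): string
{
    [, $pre_class, $quotes, $class, $post_class, $atts, $content] = $m;
    if (!empty($class)) {
        // Turn hyphenated "setting-value" tokens into setting="value".
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1=' . $quotes . '$2' . $quotes, $class);
        return "[crayon $pre_class $class $post_class] $content [/crayon]";
    }
    return "[crayon $atts] $content [/crayon]";
}

$pattern = '#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi';

// One tag with a class attribute, one without.
$with    = preg_replace_callback($pattern, 'pre_tag_demo', '<pre class="lang-php" title="demo">echo 1;</pre>');
$without = preg_replace_callback($pattern, 'pre_tag_demo', '<pre title="demo">echo 1;</pre>');

echo $with, "\n", $without, "\n";
```

When the class attribute is absent, groups 1 to 4 come back as empty strings, so the callback falls through to the `$atts` branch.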

Answer

You can put the capture group for the class attribute inside a lookahead assertion and make that optional:

'(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>' 

Now $2 will contain the value of the class attribute, if it is present.

(?<!\$)    # Assert no preceding $ (why?) 
<pre     # Match <pre 
(?=     # Assert that the following can be matched: 
(?:     # Try to match this: 
    [^>]*    # any text except > 
    \bclass\s*=\s*  # class = 
    (["\'])   # opening quote 
    (.*?)    # any text, lazy --> capture this in group no. 2 
    \1     # corresponding closing quote 
)?     # but make the whole thing optional. 
)      # End of lookahead 
([^\>]*)>    # Match the entire contents of the tag and the closing > 
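A short sketch of how this could be used from PHP. `extract_pre_class` is a hypothetical helper name, and the `#` delimiters are added because the `preg_*` functions require them. Note that PHP reports the unmatched groups 1 and 2 as empty strings here, so an empty `class=""` is indistinguishable from a missing one in this sketch.

```php
<?php
// Hypothetical helper: returns the class value (or null) and the raw
// attribute string of the first <pre> tag found, or null if none matches.
function extract_pre_class(string $html): ?array
{
    // The pattern from the answer, wrapped in '#' delimiters.
    $pattern = '#(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>#';

    if (preg_match($pattern, $html, $m) !== 1) {
        return null; // no <pre> tag, or it is preceded by "$"
    }

    return [
        // Groups 1 and 2 are empty strings when the optional part of the
        // lookahead did not take part in the match.
        'class'      => $m[2] !== '' ? $m[2] : null,
        'attributes' => trim($m[3]),
    ];
}

var_dump(extract_pre_class('<pre style="..." class="some-class" title="stuff">'));
var_dump(extract_pre_class('<pre title="stuff">'));
var_dump(extract_pre_class('$<pre class="ignored">'));
```

Because the lookahead consumes nothing, group 3 still receives the complete attribute string, class attribute included.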

Brilliant, thanks! – 2012-03-13 11:49:50