2014-07-08
1

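preg_replace expression to remove the style attribute from any tag

I want to use preg_replace to remove the style attribute, along with everything it contains, from any tag. For example: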

<img src="image.jpg" style="float:left;" /> 

would become:

<img src="image.jpg" /> 

Likewise:

<a href="link.html" style="color:#FF0000;" class="someclass">Link</a> 

would become:

<a href="link.html" class="someclass">Link</a> 

How would I write this regular expression?

preg_replace('EXPRESSION', '', $string); 

Answers

1

This should work:

preg_replace("@(<[^<>]+)\sstyle\=[\"\'][^\"\']+[\"\']([^<>]*>)@i", '$1$2', $string); 
+0

Thanks. Much appreciated! – JROB

+0

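You should add \s? for the optional leading space just before style, otherwise you leave a double space behind :-p. Also, your second version drops the "Link" text, because there is no capture group for it and you match from the tag's opening <. – ArtisticPhoenix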

+0

@ArtisiticPhoenix Agreed. Updated. –

1

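Find a style="..." attribute enclosed inside <...> and replace the match with the captured groups $1$2:

(<[^>]*?)\s+style="[^"]*"([^>]*>) 

Online Demo


Here is the working sample code.

Sample code:

<?php 
    // Group 1: the tag up to (but not including) the whitespace before style.
    // Group 2: the rest of the tag. [^>] keeps each match inside a single tag.
    $re = '/(<[^>]*?)\s+style="[^"]*"([^>]*>)/i'; 
    $str = "<img src=\"image.jpg\" style=\"float:left;\" />\n\n<a href=\"link.html\" style=\"color:#FF0000;\" class=\"someclass\">Link</a>"; 
    $subst = '$1$2'; 

    $result = preg_replace($re, $subst, $str); 
    print $result; 
?> 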

Output:

<img src="image.jpg" /> 

<a href="link.html" class="someclass">Link</a> 
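The choice of prefix group matters once several tags share a line. The following is only a sketch comparing a greedy `(<.*)` prefix with a tag-scoped `(<[^>]*?)\s+` prefix; the variable names and the one-line input are made up for illustration.

```php
<?php
// Two tags on the same line, each carrying a style attribute (illustrative input).
$html = '<p style="a">x</p><p style="b">y</p>';

// Greedy prefix: .* runs to the end of the line and backtracks to the LAST style.
$greedy = '/(<.*)style="[^"]*"([^>]*>)/';

// Tag-scoped prefix: [^>] cannot cross a tag boundary, so every tag is handled.
$scoped = '/(<[^>]*?)\s+style="[^"]*"([^>]*>)/i';

$greedyResult = preg_replace($greedy, '$1$2', $html);
$scopedResult = preg_replace($scoped, '$1$2', $html);

echo $greedyResult, "\n"; // first style survives, and a stray space is left in the second tag
echo $scopedResult, "\n"; // both style attributes are gone
```

With the greedy prefix only one attribute per line is removed, which is easy to miss when the test input puts each tag on its own line.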
0

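This is the best I could come up with:

// Single-quoted so that \1 reaches PCRE as a backreference
// (inside a double-quoted PHP string, "\1" is an octal character escape).
$re = '/\sstyle\=(\'|").*?(?<!\\\\)\1/i'; 
$str = "<a href=\"link.html\" style=\"color:#FF0000;\" class=\"someclass\">Link</a>"; 
$subst = ''; 

$result = preg_replace($re, $subst, $str, 1); 

Output: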

<a href="link.html" class="someclass">Link</a> 

Demo:

http://regex101.com/r/uW2kB8/8

Explanation:

\s match any white space character [\r\n\t\f ] 
style matches the characters style literally (case insensitive) 
\= matches the character = literally 
1st Capturing group ('|") 
    1st Alternative: ' 
     ' matches the character ' literally 
    2nd Alternative: " 
     " matches the character " literally 
.*? matches any character (except newline) 
    Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
(?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below 
    \\ matches the character \ literally 
\1 matches the same text as most recently matched by the 1st capturing group 
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z]) 

It even handles cases like these:

<a href="link.html" style="background-image:url(\"..\somimage.png\");" class="someclass">Link</a> 

<a href="link.html" style="background-image:url('..\somimage.png');" class="someclass">Link</a> 

and this one (which it will not strip):

<a href="link.html" data-style="background-image:url('..\somimage.png');" class="someclass">Link</a> 

and even:

<a href='link.html' style='color:#FF0000;' class='someclass'>Link</a> 

http://regex101.com/r/uW2kB8/11

Unlike the other suggestions :)
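As a quick check of the cases listed above, here is a minimal sketch that wraps the pattern in a helper and runs it over a few inputs. The function name `removeStyleAttribute` and the shortened `data-style` value are assumptions for illustration, not part of the answer.

```php
<?php
// Hypothetical helper; the pattern itself is the one from the answer above.
function removeStyleAttribute(string $html): string
{
    // Single-quoted PHP string, so \1 stays a regex backreference.
    $pattern = '/\sstyle\=(\'|").*?(?<!\\\\)\1/i';
    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

$cases = [
    '<a href="link.html" style="color:#FF0000;" class="someclass">Link</a>',
    "<a href='link.html' style='color:#FF0000;' class='someclass'>Link</a>",
    // data-style has no whitespace directly before "style", so it is left alone.
    '<a href="link.html" data-style="x" class="someclass">Link</a>',
];

foreach ($cases as $case) {
    echo removeStyleAttribute($case), "\n";
}
```

Requiring `\s` before `style` is what protects `data-style`; the backreference `\1` is what lets one pattern cover both quote styles.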

3

I recommend using the right tool for the job and avoiding regular expressions here.

$dom = new DOMDocument; 
$dom->loadHTML($html); 

$xpath = new DOMXPath($dom); 

foreach ($xpath->query('//*[@style]') as $node) { 
    $node->removeAttribute('style'); 
} 

echo $dom->saveHTML(); 

Working Demo
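When the input is a fragment rather than a full page, `saveHTML()` on the whole document also emits the doctype and the `<html>`/`<body>` wrappers that the parser adds. One possible way around that, sketched below under the assumption that the fragment is ASCII-only, is to wrap the fragment in a container element and serialize only its children. The function name `stripStyleAttributes` and the wrapper `<div>` are illustrative choices, not part of the answer.

```php
<?php
// Hypothetical helper built on the DOM approach shown above.
function stripStyleAttributes(string $fragment): string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<div>' . $fragment . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@style]') as $node) {
        $node->removeAttribute('style');
    }

    // The wrapper is the first <div> in document order; emit only its children.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo stripStyleAttributes('<a href="link.html" style="color:#FF0000;" class="someclass">Link</a>'), "\n";
```

Note that the output is re-serialized HTML, so details such as self-closing slashes on void elements may differ from the input.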

If you must use a regular expression for this task, the following will suffice.

$html = preg_replace('/<[^>]*\Kstyle="[^"]*"\s*/i', '', $html); 

Explanation

<   # '<' 
[^>]*  # any character except: '>' (0 or more times) 
\K   # resets the starting point of the reported match 
style=" # 'style="' 
    [^"]*  # any character except: '"' (0 or more times) 
    "   # '"' 
\s*   # whitespace (\n, \r, \t, \f, and " ") (0 or more times) 

Working Demo
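To see what `\K` contributes, a small sketch can compare the reported match with and without it. The variable names here are made up for the illustration.

```php
<?php
$html = '<img src="image.jpg" style="float:left;" />';

// With \K: everything matched before it is dropped from the reported match.
$withK = '/<[^>]*\Kstyle="[^"]*"\s*/i';
// Without \K: the reported match starts at '<'.
$withoutK = '/<[^>]*style="[^"]*"\s*/i';

preg_match($withK, $html, $m1);
preg_match($withoutK, $html, $m2);

var_dump($m1[0]); // only the attribute and the whitespace after it
var_dump($m2[0]); // the tag from '<' through the attribute

// Replacing with '' therefore removes just the attribute in the \K version...
$cleaned = preg_replace($withK, '', $html);
// ...but eats the start of the tag in the version without \K.
$broken = preg_replace($withoutK, '', $html);

echo $cleaned, "\n";
echo $broken, "\n";
```

This is why the `\K` pattern can use an empty replacement string, whereas the capture-group answers need `$1$2` to put the rest of the tag back.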

+0

+1 for providing both a DOM solution and an efficient regex! :) – zx81