2013-05-06
2

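I have an array of countries where the key is the country code and the value is the country name. I also have a string posted by a user, and I want to check whether it contains a country name. If it does, I want to find the country name and replace it with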

<span class="country">$1</span> 

To make this clearer, let's say I have this text:

Canada is a cold place 

I want it to become:

<span class="country">canada</span> is a cold place 

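where I use my countries array for the find and replace.

The reason behind this is that I want to use microformats, so I need to extract specific text from the string.

I have similar code that uses preg_replace: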

$style = array(
        '/\[b\](.*)?\[\/b\]/isU'   => '<b>$1</b>', 
        '/\[i\](.*)?\[\/i\]/isU'   => '<i>$1</i>', 
        '/\[u\](.*)?\[\/u\]/isU'   => '<u>$1</u>', 
        '/\[em\](.*)?\[\/em\]/isU'  => '<em>$1</em>', 
        '/\[li\](.*)?\[\/li\]/isU'  => '<li>$1</li>', 
        '/\[code\](.*)?\[\/code\]/isU' => '<div class="tx_code">$1</div>', 
        '/\[q\](.*)?\[\/q\]/isU' => '<q>$1</q>', 
        '/[\r\n]{3}+/'    => "\n" 
        ); 

$text = preg_replace(array_keys($style),array_values($style),$text); 

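That works, and I need something similar for the countries.

Keep in mind that it should not be case sensitive; some users may post canada and others Canada.

Thanks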

So you have already written similar code. How have you tried to adapt it so far? It's not clear to me where the problem is. – erisco 2013-05-06 16:33:24

Thanks for the hint. After struggling with one of the answers, I wrote my own and it works perfectly. – Emad 2013-05-06 19:30:00

Answers

1

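Try this:

function findword($text,array $List){ 
     $pattern = array(); 
     foreach($List as $Val) { 
      // preg_quote() escapes regex metacharacters in the name; $1 and $2 put the surrounding separators back 
      $pattern['%([^\da-zA-Z]+)'.preg_quote($Val, '%').'([^\da-zA-Z]+)%si'] = '$1<span class="country">'.$Val.'</span>$2'; 
     } 
     // pad with spaces so a name at the very start or end still has a separator on each side 
     $text = preg_replace(array_keys($pattern), array_values($pattern), ' '.$text.' '); 
     return trim($text); 
    } 
    echo findword('Canada is a cold place',array('Canada')); 

Output:

<span class="country">Canada</span> is a cold place 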
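The loop above builds one pattern per country. As a rough sketch of an alternative, assuming the list holds plain country names, the names can be escaped with preg_quote(), joined into a single alternation wrapped in \b, and replaced in one pass. The function name tagCountries is made up for this illustration and is not part of the answer. Trying longer names first is a design choice so that a name such as "Guinea" does not win over "Papua New Guinea".

```php
<?php
// Hypothetical helper: wrap every listed country name in a span, in a single pass.
function tagCountries(string $text, array $countries): string
{
    if ($countries === []) {
        return $text; // nothing to look for
    }
    // Longer names first, so "Papua New Guinea" is tried before "Guinea".
    usort($countries, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    // Escape regex metacharacters and the "~" delimiter in each name.
    $quoted = array_map(function ($name) {
        return preg_quote($name, '~');
    }, $countries);
    $pattern = '~\b(?:' . implode('|', $quoted) . ')\b~i';
    // $0 is the matched text, so the user's own capitalisation is kept.
    return preg_replace($pattern, '<span class="country">$0</span>', $text);
}

echo tagCountries(
    'canada is colder than Papua New Guinea',
    ['Guinea', 'Canada', 'Papua New Guinea']
);
```

This prints `<span class="country">canada</span> is colder than <span class="country">Papua New Guinea</span>`. Because of the \b anchors, a name buried inside a longer word is left alone.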

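EDIT: If you want to replace every occurrence in the text, including ones inside longer words, you can use this:

function findword($text,array $List){ 
     $pattern = array(); 
     foreach($List as $Val) 
      $pattern['~'.preg_quote($Val, '~').'~si'] = '<span class="country">'.$Val.'</span>'; 
     // no space padding is needed here, because the pattern does not look at neighbouring characters 
     $text = preg_replace(array_keys($pattern), array_values($pattern), $text); 
     return $text; 
    } 
    echo findword('Canadaisacold place',array('Canada')); 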

Output:

<span class="country">Canada</span>isacold place 

EDIT2: I wrote a version based on DOMDocument that works well with HTML:

class XmlRead{  
    static function Clean($html){ 
    $html=preg_replace_callback("~<script(.*?)>(.*?)</script>~si",function($m){ 
     //print_r($m); 
    // $m[2]=preg_replace("/\/\*(.*?)\*\/|[\t\r\n]/s"," ", " ".$m[2]." "); 
     $m[2]=preg_replace("~//(.*?)\n~si"," ", " ".$m[2]." "); 
     //echo $m[2]; 
     return "<script ".$m[1].">".$m[2]."</script>"; 
     }, $html); 
    $search = array(
     "/\/\*(.*?)\*\/|[\t\r\n]/s" => "", 
     "/ +\{ +|\{ +| +\{/" => "{", 
     "/ +\} +|\} +| +\}/" => "}", 
     "/ +: +|: +| +:/" => ":", 
     "/ +; +|; +| +;/" => ";", 
     "/ +, +|, +| +,/" => "," 
    ); 
     $html = preg_replace(array_keys($search), array_values($search), $html); 
    preg_match_all('!(<(?:code|pre|script).*>[^<]+</(?:code|pre|script)>)!',$html,$pre); 
    $html = preg_replace('!<(?:code|pre).*>[^<]+</(?:code|pre)>!', '#pre#', $html); 
    $html = preg_replace('#<!--[^\[].+?-->#', '', $html); 
    $html = preg_replace('/[\r\n\t]+/', ' ', $html); 
    $html = preg_replace('/>[\s]+</', '><', $html); 
    $html = preg_replace('/\s+/', ' ', $html); 
    if (!empty($pre[0])) { 
     foreach ($pre[0] as $tag) { 
      $html = preg_replace('!#pre#!', $tag, $html,1); 
     } 
    } 
    return($html); 
} 
function loadNprepare($content,$encod='') { 
    $content=self::Clean($content); 
    //$content=html_entity_decode(html_entity_decode($content)); 
    // $content=htmlspecialchars_decode($content,ENT_HTML5); 
    $DataPage=''; 
    if(preg_match('~<body(.*?)>(.*?)</body>~si',$content,$M)){ 
     $DataPage=$M[2]; 
    }else{ 
     $DataPage =$content; 
    } 
    $HTML=$DataPage; 
    $HTML="<!doctype html><html><head><meta charset=\"utf-8\"><title>Untitled Document</title></head><body>".$HTML."</body></html>"; 
    $dom= new DOMDocument; 
    $HTML = str_replace("&", "&amp;", $HTML); // disguise &s going IN to loadXML() 
    // $dom->substituteEntities = true; // collapse &s going OUT to transformToXML() 
    $dom->recover = TRUE; 
    @$dom->loadHTML('<?xml encoding="UTF-8">' .$HTML); 
    // dirty fix 
    foreach ($dom->childNodes as $item) 
    if ($item->nodeType == XML_PI_NODE) 
     $dom->removeChild($item); // remove hack 
    $dom->encoding = 'UTF-8'; // insert proper 
    return $dom; 
} 
function GetBYClass($Doc,$ClassName){ 
    $finder = new DomXPath($Doc); 
    return($finder->query("//*[contains(@class, '$ClassName')]")); 
} 
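function findword($text,array $List){ 
    $pattern = array(); // avoids an undefined variable when $List is empty 
    foreach($List as $Val) { 
     // preg_quote() keeps regex metacharacters in a name from being read as pattern syntax 
     $pattern['%(\#)?([^\da-zA-Z]+)'.preg_quote($Val, '%').'([^\da-zA-Z]+)%si'] = '<span class="country">'.$Val.'</span>'; 
    } 
    $text = preg_replace(array_keys($pattern), array_values($pattern), ' '.$text.' '); 
    return $text; 
} 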
function FindAndReplace($node,array $List) { 
    if($node==NULL)return false;  
    if (XML_TEXT_NODE === $node->nodeType || XML_CDATA_SECTION_NODE === $node->nodeType) { 
     $node->nodeValue=$this->findword($node->nodeValue,$List); 
     return; 
    }else{ 
     if(is_object($node->childNodes) or is_array($node->childNodes)) { 
      foreach($node->childNodes as $childNode) { 
       $this->FindAndReplace($childNode,$List); 
      } 
     } 
    } 

} 
function DOMinnerHTML($element) 
{ 
    $innerHTML = ""; 
    $children = $element->childNodes; 
    foreach ($children as $child) 
    { 
     $tmp_dom = new DOMDocument(); 
     $tmp_dom->appendChild($tmp_dom->importNode($child, true)); 
     $innerHTML.=trim($tmp_dom->saveHTML()); 
    } 
    $innerHTML=html_entity_decode(html_entity_decode($innerHTML)); 
    return $innerHTML; 
} 
function DOMRemove(DOMNode $from) { 

    $from->parentNode->removeChild($from);  
} 

} 
$XmlRead=new XmlRead(); 
$Doc=$XmlRead->loadNprepare('<a href="?Canada">Canada</a> is a cold place'); 
$XmlRead->FindAndReplace($Doc,array('Canada')); 
$Body=$Doc->getElementsByTagName('body')->item(0); 
echo $XmlRead->DOMinnerHTML($Body); 

Output:

<a href="?Canada"><span class="country">Canada</span></a>is a cold place 
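The class above writes markup into nodeValue and then decodes entities when serialising. A minimal sketch of another way to do the DOM step, assuming the goal is only to wrap names that appear in text nodes: split each text node on the country pattern and insert real span elements, so attribute values such as href are never touched. The function name tagCountriesInDom is invented for this example.

```php
<?php
// Hypothetical helper: wrap country names found in text nodes in real <span> elements.
function tagCountriesInDom(DOMDocument $dom, array $countries): void
{
    if ($countries === []) {
        return; // nothing to look for
    }
    $quoted = array_map(function ($name) {
        return preg_quote($name, '~');
    }, $countries);
    // The capture group makes preg_split() return the matched names as well.
    $pattern = '~\b(' . implode('|', $quoted) . ')\b~i';

    $xpath = new DOMXPath($dom);
    // Copy the node list first, because the loop below modifies the tree.
    $textNodes = iterator_to_array($xpath->query('//body//text()'));
    foreach ($textNodes as $node) {
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no country name in this text node
        }
        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                // Odd indexes hold the captured country names.
                $span = $dom->createElement('span');
                $span->setAttribute('class', 'country');
                $span->appendChild($dom->createTextNode($part));
                $fragment->appendChild($span);
            } elseif ($part !== '') {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }
}

$dom = new DOMDocument();
$dom->loadHTML('<html><body><p><a href="?Canada">Canada</a> is a cold place</p></body></html>');
tagCountriesInDom($dom, ['Canada']);
echo $dom->saveHTML($dom->getElementsByTagName('p')->item(0));
```

With this input the link text is wrapped while `href="?Canada"` stays as it was. Note that the XPath query also visits text inside script or style elements in the body, so those would need to be excluded for real pages.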

It works, thanks, but is it fast? – Emad 2013-05-06 16:48:33

How much data is there to find and replace? – 2013-05-06 16:50:30

All the country names, and the string is at most 500 characters long. By the way, if I write this: incanadaisacoldplace, with no spaces, it doesn't extract correctly. – Emad 2013-05-06 16:51:41

0

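I wrote my own, and it is the best so far:

if($microformat){ 
     foreach ($this->countries as $co){ 
     // preg_quote() escapes any regex metacharacters in the name; $0 keeps the text exactly as the user typed it 
     $text = preg_replace('/(\#)?\b'.preg_quote($co, '/').'\b/i','<span class="country">$0</span>',$text); 
     } 
    } 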
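To see how that pattern behaves on the cases raised in this thread, here is a small standalone sketch. The function markCountries and the sample array are assumptions made for the demonstration; in the snippet above the names come from $this->countries.

```php
<?php
// Standalone version of the loop above; $countries stands in for $this->countries.
function markCountries(string $text, array $countries): string
{
    foreach ($countries as $co) {
        // Optional leading "#", then the whole word, matched case-insensitively.
        $pattern = '/(\#)?\b' . preg_quote($co, '/') . '\b/i';
        $text = preg_replace($pattern, '<span class="country">$0</span>', $text);
    }
    return $text;
}

$countries = ['CA' => 'Canada']; // code => name, as described in the question

echo markCountries('canada is a cold place', $countries), "\n";
echo markCountries('I love #Canada', $countries), "\n";
echo markCountries('incanadaisacoldplace', $countries), "\n";
```

The first line comes out as `<span class="country">canada</span> is a cold place`, the second wraps `#Canada` including the hash, and the third is printed unchanged because \b only matches at word boundaries.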

Thanks, everyone.