2012-02-17 24 views
4

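DOM parser for highlighting keywords doesn't work

This question is related to one I asked before, but since that topic is now closed and I need to take the problem further, I hope it is all right to open a new question.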

In my previous question I simplified the problem, which led to solutions that were simple but did not fully work. I only realized this when I ran my code.

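The problem with the solutions in the previous post is that the HTML gets broken by the replace function. From the many posts I have read on this site, I understand that I need to use a DOM parser. I am very unfamiliar with this. I tried the code suggested by user "ircmaxell" in the post linked below, but it does not work for me.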

Here is what I did:

echo '<style type="text/css"> 
     .ht{ 
     background-color: yellow; 
     } 
    </style>'; 


/* taken from user ircmaxell at https://stackoverflow.com/questions/4081372/highlight-keywords-in-a-paragraph 

I just modified line $highlight->setAttribute('class', 'highlight') to $highlight->setAttribute('class', 'ht') and commented the first 2 lines */ 

function highlight_paragraph($string, $keyword) { 
    //$string = '<p>foo<b>bar</b></p>'; 
    //$keyword = 'foo'; 
    $dom = new DomDocument(); 
    $dom->loadHtml($string); 
    $xpath = new DomXpath($dom); 
    $elements = $xpath->query('//*[contains(.,"'.$keyword.'")]'); 
    foreach ($elements as $element) { 
    foreach ($element->childNodes as $child) { 
    if (!$child instanceof DomText) continue; 
    $fragment = $dom->createDocumentFragment(); 
    $text = $child->textContent; 
    $stubs = array(); 
    while (($pos = stripos($text, $keyword)) !== false) { 
     $fragment->appendChild(new DomText(substr($text, 0, $pos))); 
     $word = substr($text, $pos, strlen($keyword)); 
     $highlight = $dom->createElement('span'); 
     $highlight->appendChild(new DomText($word)); 
     $highlight->setAttribute('class', 'ht'); 
     $fragment->appendChild($highlight); 
     $text = substr($text, $pos + strlen($keyword)); 
    } 
    if (!empty($text)) $fragment->appendChild(new DomText($text)); 
    $element->replaceChild($fragment, $child); 
    } 
} 
$string = $dom->saveXml($dom->getElementsByTagName('body')->item(0)->firstChild); 
return $string; 
} 


$string = '<p>This book has been written against a background of both reckless optimism and reckless despair.</p> 
<p>It holds that Progress and Doom are two sides of the same medal; that both are articles of superstition, not of faith. It was written out of the conviction that it should be possible to discover the hidden mechanics by which all traditional elements of our political and spiritual world were dissolved into a conglomeration where everything seems to have lost specific value, and has become unrecognizable for human comprehension, unusable for human purpose.</p> 
<p> Hannah Arendt, The Origins of Totalitarianism (New York: Harcourt Brace Jovanovich, Inc., 1973 ed.), p.vii, Preface to the First Edition.</p>'; 

$keywords = array('This', 'book', 'has', 'been', 'written', 'background', 'reckless', 'optimism', 'despair.', 'holds', 'Progress', 'Doom ', 'two', 'sides', 'medal;', 'articles', 'superstition,', 'faith.', 'lost', 'Arendt,', 'Totalitarianism'); 

foreach ($keywords as $kw) { 
    $string = highlight_paragraph($string, $kw); 
} 

echo $string; 

echo $string only returns:

This book has been written against a background of both reckless optimism and reckless despair. 

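And only the first two words, 'This' and 'book', are highlighted.

Normally, it should output the entire initial string with the keywords highlighted.

I have searched a lot on Stack Overflow and Google and have not found easy-to-use code for this purpose, even though many people have asked the same thing.

I really need help here. Thanks in advance!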
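One contributor to the truncated output can be read directly off the code: the function returns `saveXml()` of `body`'s `firstChild`, which is only the first `<p>`, so every later paragraph is dropped after the first call. A minimal sketch that isolates this behaviour, using a made-up two-paragraph input rather than the text from the question:

```php
<?php
// Illustrative input only; not the string from the question.
$dom = new DOMDocument();
$dom->loadHTML('<p>first</p><p>second</p>');

$body = $dom->getElementsByTagName('body')->item(0);

// Serializing only the first child of <body> drops every later sibling.
$firstOnly = $dom->saveXML($body->firstChild);

// <body> itself still holds both paragraphs.
$paragraphCount = $body->childNodes->length;

echo $firstOnly, "\n";
echo $paragraphCount, "\n";
```

This only accounts for the missing paragraphs; it does not by itself explain why highlighting stops after the second keyword.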

Answers

7

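You're lucky that I was bored when I saw this question. ;)

The code you received as an answer does not seem to have been tested; I don't see how it could possibly work. Anyway, I fixed all the problems and present you with a working version, tested on my local Apache server with PHP 5.3: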

function highlight_paragraph($string, $keyword) {
    $dom = new DOMDocument();
    $dom->loadHTML($string);

    // Select the text nodes of every element whose content contains the keyword
    $xpath = new DOMXPath($dom);
    $textNodes = $xpath->query('//*[contains(.,"'.$keyword.'")]/text()');

    foreach ($textNodes as $textNode) {
        $fragment = $dom->createDocumentFragment();
        $text = $textNode->nodeValue;

        // Split the text at each case-insensitive match and wrap the match in a span
        while (($pos = stripos($text, $keyword)) !== false) {
            $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
            $word = substr($text, $pos, strlen($keyword));

            $highlight = $dom->createElement('span');
            $highlight->appendChild(new DOMText($word));
            $highlight->setAttribute('class', 'ht');
            $fragment->appendChild($highlight);

            $text = substr($text, $pos + strlen($keyword));
        }

        // Append whatever remains after the last match (a plain "0" must be kept too)
        if ($text !== '')
            $fragment->appendChild(new DOMText($text));

        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Returns the whole document, including the doctype and <html>/<body> wrappers
    return $dom->saveHTML();
}
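
Calling the function once per keyword re-parses and re-serializes the markup each time, and `saveHTML()` without an argument returns a full document including the doctype and the `<html>`/`<body>` wrappers. A minimal sketch of a single-pass variant follows. `highlight_keywords` and its `$class` parameter are hypothetical names, not part of the answer above, and the sketch assumes non-empty ASCII input and PHP 5.3.6 or later (needed for `saveHTML($node)`).

```php
<?php
// Hypothetical helper: highlights several keywords in one pass and
// returns only the markup found inside <body>.
function highlight_keywords($html, array $keywords, $class = 'ht') {
    // Drop empty keywords; an empty alternative would match everywhere.
    $keywords = array_filter($keywords, 'strlen');
    if (!$keywords || trim($html) === '') {
        return $html;
    }

    // Longest first, so a longer keyword wins over a shorter overlapping one.
    usort($keywords, function ($a, $b) { return strlen($b) - strlen($a); });
    $quoted = array();
    foreach ($keywords as $kw) {
        $quoted[] = preg_quote($kw, '/');
    }
    $pattern = '/(' . implode('|', $quoted) . ')/i';

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Snapshot the text nodes before modifying the tree.
    foreach (iterator_to_array($xpath->query('//body//text()')) as $textNode) {
        // With DELIM_CAPTURE, odd indexes hold matches and even indexes plain text.
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no keyword in this text node
        }
        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) {
                $span = $dom->createElement('span');
                $span->setAttribute('class', $class);
                $span->appendChild($dom->createTextNode($part));
                $fragment->appendChild($span);
            } else {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize the children of <body> only, without the document wrappers.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$out = highlight_keywords(
    '<p>This book</p><p>Second written text</p>',
    array('book', 'written')
);
echo $out;
```

Building one pattern with `preg_quote` also avoids concatenating the keyword into the XPath expression, which breaks when a keyword contains a double quote.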
+0

This answer helped with [my question](http://stackoverflow.com/questions/15526781). Thanks! – TerranRich 2013-03-20 17:29:48

+1

Thank you very much for your boredom! :-) – 2014-07-30 13:30:43

+0

OMG, finally. @denisw you are a legend. I do see this error when I run it on my results, though: "Severity: Warning Message: DOMDocument::loadHTML(): htmlParseEntityRef: no name in Entity". Any ideas? – Solvision 2016-11-05 23:26:00
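
The warning quoted in the last comment is typically emitted by libxml when the input contains a bare `&` that is not part of an entity. One common way to keep such parser messages out of the output is to collect them with `libxml_use_internal_errors()` instead of letting PHP raise warnings. A minimal sketch, assuming that is the cause here; `load_html_quietly` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: parses HTML while collecting libxml messages
// instead of emitting PHP warnings.
function load_html_quietly($html) {
    $dom = new DOMDocument();

    // Switch to internal error handling and remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array($dom, $errors);
}

// Sample input with a bare ampersand, chosen for illustration.
list($dom, $errors) = load_html_quietly('<p>Fish & chips</p>');

// Whether and how the parser reports the ampersand depends on the libxml version.
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}

$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

Escaping the ampersand as `&amp;` in the source markup would address the cause rather than silence the message.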
