2012-09-01 85 views
1

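I need to be able to count how many times a specific word appears inside a specific HTML tag. Currently I can only count the total number of words that appear in the tag. I can also count how many times a word appears in the whole document, but I can't figure out how to count how many times a word appears only within h3 tags. What I need is for PHP to count how many times a specific word appears inside h3 tags.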

Example:

Sample text here, blah blah blah, lorem ipsum 
<h3>Lorem is in this h3 tag, lorem.</h3> 
lorem ipsum dolor.... 
<h3>This is another h2 with lorem in it</h3> 

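So as you can see, the word "lorem" appears in that code 4 times, but I only want to count how many times the word "lorem" shows up inside the h3 tags.

I would prefer to keep using PHP for this project.

Thank you very much for your help.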

+3

Use the DOM to access the elements of an HTML document. – knittl

+0

..and use [substr_count()](http://php.net/manual/en/function.substr-count.php) – Peter

Answers

0

You can also do this with a regular expression:

<?php 
    $string = 'Sample text here, blah blah blah, lorem ipsum 
    <h3>Lorem is in this h3 tag, lorem.</h3> 
    lorem ipsum dolor.... 
    <h3>This is another h2 with lorem in it</h3>'; 

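    // Match "lorem" (case-insensitive) only when a closing </h3> follows 
    // before another <h3> opens; "." does not cross newlines here. 
    preg_match_all("/lorem(?=(?:.(?!<h3>))*<\/h3>)/i", $string, $matches); 

    // $matches[0] holds every full match, so its length is the count. 
    $count = isset($matches[0]) ? count($matches[0]) : 0; 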

?> 
+0

This always seems to return "2", no matter how many times lorem appears in the h3 – Casey

+0

The regular expression has been updated to match multiple occurrences of a word inside each tag. – virtubill
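
The look-ahead pattern above assumes plain `<h3>` tags without attributes and headings that fit on a single line. As a rough alternative sketch (not the answerer's method), the h3 contents can be captured first and the word counted inside each capture. The variable names `$word`, `$headings`, and `$inner` are only illustrative, and regular expressions remain fragile on real-world HTML.

```php
<?php
$string = 'Sample text here, blah blah blah, lorem ipsum
<h3>Lorem is in this h3 tag, lorem.</h3>
lorem ipsum dolor....
<h3>This is another h2 with lorem in it</h3>';

$word = 'lorem'; // the word to look for (taken from the question's sample)

// Step 1: capture the inner content of every h3 element.
// "s" lets "." span newlines, "i" also matches <H3>, and [^>]* allows attributes.
preg_match_all('/<h3\b[^>]*>(.*?)<\/h3>/is', $string, $headings);

// Step 2: count whole-word, case-insensitive occurrences inside each heading.
$count = 0;
foreach ($headings[1] as $inner) {
    $count += preg_match_all('/\b' . preg_quote($word, '/') . '\b/i', strip_tags($inner));
}

echo $count; // 3 for the sample above
```

The `\b` word boundaries keep "lorem" from matching inside a longer word, and `preg_quote()` keeps the search word from being read as pattern syntax.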

2

I would use DOMDocument like this:

$string = 'Sample text here, blah blah blah, lorem ipsum 
<h3>Lorem is in this h3 tag, lorem.</h3> 
lorem ipsum dolor.... 
<h3>This is another h2 with lorem in it</h3>'; 

$html = new DOMDocument(); // create new DOMDocument 
$html->loadHTML($string); // load HTML string 
$cnt = array();   // create empty array for words count 
foreach($html->getElementsByTagName('h3') as $one){ // loop in each h3 
    $words = str_word_count(strip_tags($one->nodeValue), 1, '0..9'); // split the heading text into words, treating digits as word characters 
    foreach($words as $wo){ // tally every word 
     if(!isset($cnt[$wo])){ $cnt[$wo] = 0; } // create the key with a count of 0 if it doesn't exist yet 
     $cnt[$wo]++; // increment its count on every occurrence, so a word seen once ends at 1 
    } 
} 


var_export($cnt); // dump each word and how many times it appeared (keys are case-sensitive) 
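
The tally above counts every word and keeps "Lorem" and "lorem" as separate keys. If only one word matters, a minimal sketch that combines the DOM and `substr_count()` suggestions from the comments might look like the following. `countWordInTag` is a hypothetical helper name, not part of any library, and the `@` is there only on the assumption that libxml warnings about imperfect markup should be silenced.

```php
<?php
// Hypothetical helper (not from the answer above): counts one word inside one tag.
function countWordInTag(string $html, string $tag, string $word): int
{
    $doc = new DOMDocument();
    // Suppress warnings that libxml may emit for HTML fragments or sloppy markup.
    @$doc->loadHTML($html);

    $total = 0;
    foreach ($doc->getElementsByTagName($tag) as $node) {
        // Lowercase both sides so "Lorem" and "lorem" are counted together.
        $total += substr_count(strtolower($node->textContent), strtolower($word));
    }
    return $total;
}

$string = 'Sample text here, blah blah blah, lorem ipsum
<h3>Lorem is in this h3 tag, lorem.</h3>
lorem ipsum dolor....
<h3>This is another h2 with lorem in it</h3>';

echo countWordInTag($string, 'h3', 'lorem'); // 3
```

Note that `substr_count()` matches substrings, so it would also count "lorem" inside a longer word; a word-boundary regular expression is an option if that matters.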