2014-03-12 21 views
0

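PHP - Find a set of keywords in an article

The idea is that I have a set of keywords and an article. I want to know the best way, in terms of performance and speed, to find out whether these keywords exist in the article.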

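Basically, each keyword consists of 3 or more words, but no more than 10 words. The code should check whether each keyword exists in the article and return only the keywords that were found.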

Suppose we have an article:

$articles = "Maybe it’s less true than it used to be that people are made of 
     place--that the same elements that form coal and clay and bogs and ice form 
     faces, voices and characters. I wrote my first collection of short stories, 
     The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
     Dedalus, to encounter for the millionth time the reality of experience and to 
     forge in the smithy of my soul the uncreated conscience of some island-dwellers 
     I knew.";

Keywords:

$keywords = "less true than, people are made, smithy of my soul, uncreated conscience, this is a test string";

The output would be:

"less true than, people are made, smithy of my soul, uncreated conscience" 

I have tried to program it like this:

$articles = mb_split(' +', $articles);
foreach ($articles as $key => $word) {
    $articles[$key] = trim($word);
}

// Search for keywords
$keywords = str_replace(' ', '', $keywords);
$keywords = mb_split('[ ,]+', mb_strtolower($keywords, 'utf-8'));

$result = implode(',', array_intersect($keywords, $articles));

But it only works for single-word keywords. I don't know how to handle keywords made up of multiple words.

+0

So your keywords can actually consist of multiple words? For example, one of the "keywords" in your example is "less true than", right? – Nico

+0

Have you heard of regular expressions and preg_match? –

Answers

0

strpos() is what you need. This works:

$res = [];
foreach (explode(", ", $keywords) as $keyword) {
    // strpos() returns 0 for a match at the very start, so compare against false.
    if (strpos($articles, $keyword) !== false) {
        $res[] = $keyword;
    }
}
$matched = implode(", ", $res);
var_dump($matched);
/** OUTPUT **/ 
string 'less true than, people are made, smithy of my soul, uncreated conscience' (length=72) 
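
If the article or the keyword string contains line breaks or extra indentation (as in the question), a plain strpos() lookup can miss a phrase that is split across lines. Here is a minimal sketch of one way around that, assuming it is acceptable to collapse whitespace and ignore case. The helper name findKeywords and the shortened sample text are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: returns the keywords that occur in $article,
// ignoring case and differences in whitespace (spaces, line breaks).
function findKeywords(string $article, string $keywords): array
{
    // Collapse every run of whitespace into a single space.
    $normalize = function (string $s): string {
        return trim(preg_replace('/\s+/', ' ', $s));
    };

    $haystack = $normalize($article);
    $found = [];

    foreach (explode(',', $keywords) as $keyword) {
        $keyword = $normalize($keyword);
        // stripos() returns 0 for a match at the start, so compare with false.
        if ($keyword !== '' && stripos($haystack, $keyword) !== false) {
            $found[] = $keyword;
        }
    }

    return $found;
}

// Shortened sample text with deliberate line breaks inside the phrases.
$article = "Maybe it's less true than it used to be that people are made of
    place. I tried to forge in the smithy of my soul the uncreated
    conscience of some island-dwellers I knew.";
$keywords = "less true than, people are made, smithy of my soul, uncreated
    conscience, this is a test string";

echo implode(', ', findKeywords($article, $keywords)), "\n";
```

Note that stripos() only folds ASCII case, so this sketch would need mb_stripos() or similar for non-English text.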
0

Regular expressions can help you. This works. Could your problem be the line break in the keyword string?

$articles = "Maybe it’s less true than it used to be that people are made of 
    place--that the same elements that form coal and clay and bogs and ice form 
    faces, voices and characters. I wrote my first collection of short stories, 
    The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
    Dedalus, to encounter for the millionth time the reality of experience and to 
    forge in the smithy of my soul the uncreated conscience of some island-dwellers 
    I knew."; 

$keywords = "less true than, people are made, smithy of my soul, uncreated conscience, this is a test string"; 

$keywordsArray = explode(', ',$keywords); 

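// Escape each keyword so any regex metacharacters in it are matched literally.
$pattern = '/' . implode('|', array_map(function ($k) { return preg_quote($k, '/'); }, $keywordsArray)) . '/';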
preg_match_all($pattern,$articles,$matches); 

var_dump($matches); 
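
A possible refinement of the regex approach, sketched under a few assumptions: keywords should match whole words only, case should be ignored, and any whitespace (including a line break) may separate the words of a phrase. The function name buildKeywordPattern and the sample strings are hypothetical, and the sketch assumes a non-empty keyword list whose entries start and end with word characters, since \b would not behave as intended otherwise.

```php
<?php
// Hypothetical helper: builds one alternation pattern from a keyword list.
function buildKeywordPattern(array $keywords): string
{
    $alternatives = [];
    foreach ($keywords as $keyword) {
        // Split the phrase into words, escape each word, and allow any
        // whitespace (spaces or line breaks) between consecutive words.
        $words = preg_split('/\s+/', trim($keyword), -1, PREG_SPLIT_NO_EMPTY);
        if ($words === []) {
            continue; // skip empty keywords
        }
        $escaped = array_map(function ($w) {
            return preg_quote($w, '/');
        }, $words);
        $alternatives[] = implode('\s+', $escaped);
    }
    // \b keeps "made" from matching inside "remade"; the i flag ignores case.
    return '/\b(?:' . implode('|', $alternatives) . ')\b/i';
}

$article = "People are made of place. I forge in the smithy of my\nsoul the uncreated conscience.";
$keywords = ['people are made', 'smithy of my soul', 'this is a test string'];

preg_match_all(buildKeywordPattern($keywords), $article, $matches);
print_r($matches[0]);
```

One thing to be aware of: $matches[0] holds the text as it appears in the article (original case and line breaks), not the keyword as it was written in the list.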
0
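// preg_match_all() returns the match count; the matched strings go into $found.
preg_match_all(
    '/' . implode('|', array_map(function ($k) { return preg_quote($k, '/'); }, explode(', ', $keywords))) . '/',
    $articles,
    $found
);
// Drop duplicates in case a keyword occurs more than once in the article.
$matches = array_unique($found[0]);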
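
A small illustration of why the deduplication step matters, and of where preg_match_all() puts its results. It is only a sketch with made-up sample strings: the function's return value is the number of matches, while the matched text is written to the third argument.

```php
<?php
// Made-up sample in which one keyword occurs twice.
$articles = "less true than before, and less true than ever";
$keywords = "less true than, this is a test string";

// Escape each keyword so it is matched literally.
$quoted = array_map(
    function ($k) { return preg_quote($k, '/'); },
    explode(', ', $keywords)
);

// $count receives the number of matches; the matched text lands in $found[0].
$count = preg_match_all('/' . implode('|', $quoted) . '/', $articles, $found);

var_dump($count);                  // number of matches, duplicates included
var_dump(array_unique($found[0])); // each matched keyword only once
```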
0

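$articles = "Maybe it’s less true than it used to be that people are made of 
    place--that the same elements that form coal and clay and bogs and ice form 
    faces, voices and characters. I wrote my first collection of short stories, 
    The Bostons, in homage to this book, hoping, as did Joyce’s young Stephen 
    Dedalus, to encounter for the millionth time the reality of experience and to 
    forge in the smithy of my soul the uncreated conscience of some island-dwellers 
    I knew.";

$keywords = "less true than, people are made, smithy of my soul, uncreated conscience, this is a test string";

$keyword = explode(',', $keywords);
$finalstring = '';

foreach ($keyword as $key => $value) {
    // Remove the space left over after each comma.
    $value = trim($value);
    // strpos() returns 0 for a match at the very start, so compare against false.
    if (strpos($articles, $value) !== false) {
        $finalstring .= $value . ',';
    }
}

// Strip the trailing comma before printing.
echo rtrim($finalstring, ',');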