2014-06-30 37 views

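Split a sentence into paragraphs based on word count

I want to split a passage into paragraphs, where each paragraph contains fewer than a given number of words. For example: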

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. 

Paragraph 1: 
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. 

Paragraph 2: 
Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. 

In the example above, the text up to the 20-word limit goes into paragraph 1 and the rest goes into paragraph 2.

Is there any way to achieve this using PHP?

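I tried $abc = explode(' ', $str, 20); which stores 20 words in the array, with the rest in a final element, $abc['21']. How do I take the first 20 array elements as the first paragraph and the remainder as the second paragraph?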
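A minimal sketch of what the explode() attempt actually produces, and one way to get the two-part split the question asks for. With a positive limit, explode() returns at most that many elements, and the last element holds the unsplit remainder, so a limit of 20 gives 19 single words plus the rest at index 19. The sample string and the limit $n below are assumptions chosen to keep the example short.

```php
<?php
// Hypothetical sample text; any space-separated string works.
$str = 'one two three four five six seven eight nine ten';

// With a positive limit, explode() returns at most $limit elements,
// and the last element holds the unsplit remainder of the string.
$parts = explode(' ', $str, 3);
// $parts is ['one', 'two', 'three four five six seven eight nine ten']

// To take the first N words as one paragraph and the rest as another,
// split without a limit and slice the resulting array.
$n = 4; // assumed word limit for the first paragraph
$words = explode(' ', $str);
$first = implode(' ', array_slice($words, 0, $n));
$rest  = implode(' ', array_slice($words, $n));

echo $first, "\n"; // one two three four
echo $rest, "\n";  // five six seven eight nine ten
```

This splits purely on spaces, so a paragraph can end in the middle of a sentence.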


Your last paragraph, 'I have tried ...', is completely wrong; please rewrite it. – Athafoud


You can try converting the string to an array, then storing the first 20 words in one string and the rest in another. – Aradhna


After exploding the sentence, just use 'implode'. http://stackoverflow.com/questions/5956610/how-to-select-first-10-words-of-a-sentence – TribalChief
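The explode-then-implode idea from the comments can be extended to any number of paragraphs. The sketch below uses a hypothetical helper, chunkWords, which is not from the thread. It relies on array_chunk() to group the words and ignores sentence boundaries entirely.

```php
<?php
// Hypothetical helper: split $text into chunks of at most $maxWords words,
// ignoring sentence boundaries.
function chunkWords(string $text, int $maxWords): array
{
    // PREG_SPLIT_NO_EMPTY drops empty strings caused by leading or trailing whitespace.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $paragraphs = array();
    foreach (array_chunk($words, $maxWords) as $chunk) {
        $paragraphs[] = implode(' ', $chunk);
    }
    return $paragraphs;
}

print_r(chunkWords('a b c d e f g', 3));
// Elements: 'a b c', 'd e f', 'g'
```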

Answers


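First split the string into sentences. Then loop over the array of sentences: append each sentence to the current element of the paragraphs array, count the words in that element, and if the count is greater than 19, increment the paragraph counter.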

$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.'; 

// Split on whitespace that follows sentence-ending punctuation and precedes an uppercase letter.
$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);

$ii = 0; // index of the paragraph currently being filled
$paragraphs = array();
foreach ($sentences as $value) {
    // Append with a space so adjacent sentences do not run together.
    if (isset($paragraphs[$ii])) { $paragraphs[$ii] .= ' ' . $value; }
    else { $paragraphs[$ii] = $value; }
    // Move to the next paragraph once this one holds 20 or more words.
    // str_word_count() counts alphabetic words only, so "45" and "2000" are not counted.
    if (19 < str_word_count($paragraphs[$ii])) {
        $ii++;
    }
}
print_r($paragraphs);

Output:

Array
(
    [0] => Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.
    [1] => Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
)

The sentence splitter is taken from: Splitting paragraphs into sentences with regexp and PHP
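The loop in this answer closes a paragraph only after it has reached 20 words, so a paragraph can end up longer than the limit. If the limit should be treated as a maximum instead, one possible variant checks the word count before appending. This is a sketch rather than part of the original answer: the function name packSentences and its parameters are hypothetical, and a single sentence longer than the limit still becomes its own oversized paragraph.

```php
<?php
// Hypothetical variant: pack whole sentences into paragraphs without
// exceeding $maxWords, unless a single sentence is itself longer.
function packSentences(string $text, int $maxWords): array
{
    // Same sentence-splitting pattern as in the answer.
    $sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $text);
    $paragraphs = array();
    $current = '';
    foreach ($sentences as $sentence) {
        $candidate = ($current === '') ? $sentence : $current . ' ' . $sentence;
        if ($current !== '' && str_word_count($candidate) > $maxWords) {
            // Adding this sentence would exceed the limit: close the paragraph.
            $paragraphs[] = $current;
            $current = $sentence;
        } else {
            $current = $candidate;
        }
    }
    if ($current !== '') {
        $paragraphs[] = $current;
    }
    return $paragraphs;
}

print_r(packSentences('One two three. Four five six. Seven eight.', 5));
// Elements: 'One two three.', 'Four five six. Seven eight.'
```

With the question's sample text and a limit of 20, this yields three paragraphs, one per sentence, and the last one exceeds the limit because it is a single long sentence.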
