Word count of a Unicode (UTF-8) string in PHP

I need to get the word count of the following Unicode string, using str_word_count:

$input = 'Hello, chào buổi sáng'; 
$count = str_word_count($input); 
echo $count;

The result is apparently wrong.

How do I get the desired result (4)?

Source

2013-08-19 Hai Truong IT

You need to define what a "word" means. For example, are two hyphenated words one "word" or two? Can a "word" contain digits? And so on.

$tags = 'Hello, chào buổi sáng'; 
$word = explode(' ', $tags); 
echo count($word);

Here is a demo: http://codepad.org/667Cr1pQ

Source

2013-08-19 03:15:06

Here is a quick and dirty regex-based (Unicode-aware) word-counting function:

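function mb_count_words($string) { 
    // \p{L} = letters, \p{N} = numbers, \p{Pd} = dash punctuation. 
    // Two-letter properties need braces: \pPd would parse as \pP plus a literal "d". 
    preg_match_all('/[\p{L}\p{N}\p{Pd}]+/u', $string, $matches); 
    return count($matches[0]); 
}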

A "word" here is anything made up of one or more of the following:

any letter of any alphabet
any digit
any hyphen/dash

This means the following contains 5 "words" (4 normal, 1 hyphenated):

echo mb_count_words('Hello, chào buổi sáng, chào-sáng');

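Now, this function is not well suited to very large texts, although it should handle most of what passes for a block of text on the internet. The reason is that preg_match_all has to build and populate a large array, only for it to be thrown away once it has been counted, which is quite inefficient. A more efficient approach would be to walk the text character by character, identify Unicode whitespace sequences, and increment a counter variable. That would not be difficult, but it is tedious and takes time.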
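As a rough sketch of a lower-memory alternative: instead of the hand-written character scan described above, this version calls preg_match repeatedly with a byte offset, so only one match is held at a time rather than the whole array. The function name count_words_streaming is hypothetical, and the definition of a "word" (letters, numbers, dashes) is assumed to be the same as in mb_count_words.

```php
<?php
// Hypothetical helper: counts words one match at a time, so no large
// array of matches is ever built.
function count_words_streaming(string $text): int
{
    $count  = 0;
    $offset = 0;                 // byte offset where the next search starts
    $length = strlen($text);     // length in bytes, not characters

    // preg_match returns 1 on a match, 0 on no match, false on error
    // (for example invalid UTF-8); anything but 1 ends the loop.
    while ($offset < $length
        && preg_match('/[\p{L}\p{N}\p{Pd}]+/u', $text, $m, PREG_OFFSET_CAPTURE, $offset) === 1
    ) {
        $count++;
        // $m[0] is [matched text, byte offset]; continue right after it.
        $offset = $m[0][1] + strlen($m[0][0]);
    }

    return $count;
}

echo count_words_streaming('Hello, chào buổi sáng, chào-sáng'); // 5
```

This trades one big allocation for many small regex calls, so it is not necessarily faster; it only keeps peak memory low. Text stored with combining marks rather than precomposed letters would need \p{M} added to the character class.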

Source

2013-08-19 04:51:18

I am using this code to count words. You can try it:

$s = 'Hello, chào buổi sáng'; 
$s1 = array_map('trim', explode(' ', $s)); 
$s2 = array_filter($s1, function($value) { return $value !== ''; }); 
echo count($s2);

Source

2018-03-05 07:24:14

You can use this function to count the Unicode words in a given string:

function count_unicode_words($unicode_string){ 

    // First remove all the punctuation marks & digits 
    $unicode_string = preg_replace('/[[:punct:][:digit:]]/', '', $unicode_string); 

    // Now collapse each run of whitespace (tabs, new lines, multiple spaces) into a single space 
    $unicode_string = preg_replace('/[[:space:]]+/', ' ', $unicode_string); 

    // The words are now separated by single spaces and can be split into an array 
    // I have included \n\r\t here as well, but a space alone would also suffice 
    $words_array = preg_split("/[\n\r\t ]+/", $unicode_string, 0, PREG_SPLIT_NO_EMPTY); 

    // Now we can get the word count by counting the array elements 
    return count($words_array); 
}

All credit goes to the author.
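One caveat: without the /u modifier, the POSIX classes above only see ASCII bytes, so Unicode punctuation such as an em dash or whitespace such as a no-break space is not stripped. A possible Unicode-aware variant, offered only as a sketch, is shown below. The name count_unicode_words_u is hypothetical, and using \p{P}, \p{S} and \p{Nd} as stand-ins for [:punct:] and [:digit:] is an assumption about the intent, not the original author's code.

```php
<?php
// Hypothetical Unicode-aware variant of count_unicode_words().
function count_unicode_words_u(string $text): int
{
    // Remove punctuation (\p{P}), symbols (\p{S}) and decimal digits (\p{Nd}).
    $text = preg_replace('/[\p{P}\p{S}\p{Nd}]+/u', '', $text);
    if ($text === null) {
        return 0; // preg_replace failed, e.g. on invalid UTF-8
    }

    // Split on runs of whitespace, including Unicode separators (\p{Z}).
    $words = preg_split('/[\s\p{Z}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    return $words === false ? 0 : count($words);
}

echo count_unicode_words_u('Hello, chào buổi sáng'); // 4
```

As in the original function, punctuation is deleted rather than treated as a separator, so a hyphenated pair such as chào-sáng counts as one word.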

Source

2018-03-09 17:36:40 Trix
