2012-04-29 131 views
3

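How to split a string by characters, minding special characters

I'm trying to split a string character by character, but I'm running into a problem with special characters. I currently use the following function: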

<?php 
$input = "Comment ça va?"; 
$array_input = str_split($input, 1); 
print_r($array_input); 
?> 

Here is the output:

Array (
[0] => C [1] => o [2] => m [3] => m [4] => e 
[5] => n [6] => t [7] => [8] => � [9] => � 
[10] => a [11] => [12] => v [13] => a [14] => ?) 

I have the same problem with line breaks:

Input:
"Hé!
Oui?"

Output:

Array ([0] => H [1] => � [2] => � [3] => ! [4] => 
[5] => [6] => O [7] => u [8] => i [9] => ?) 

Does anyone have a solution to this problem? Thanks a lot.

Answers

3

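str_split does not handle Unicode strings correctly: it splits on bytes, so a multibyte UTF-8 character such as ç is cut into separate bytes.

You can use preg_split with the u modifier instead.

For example: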

$input = "Comment ça va?"; 
$letters1 = str_split($input); 
$letters2 = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY); 

print_r($letters1); 
print_r($letters2); 

This will output:

Array ([0] => C [1] => o [2] => m [3] => m [4] => e 
     [5] => n [6] => t [7] => [8] => � [9] => � 
     [10] => a [11] => [12] => v [13] => a [14] => ?) 

Array ([0] => C [1] => o [2] => m [3] => m [4] => e 
     [5] => n [6] => t [7] => [8] => ç [9] => a 
     [10] => [11] => v [12] => a [13] => ?) 
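
To make the difference between the two outputs concrete, here is a minimal sketch that counts bytes and characters for the same string. It assumes the input is UTF-8; the string is written with `\xC3\xA7` byte escapes (the UTF-8 encoding of ç) so that the result does not depend on how the source file is saved. The variable names are illustrative only.

```php
<?php
// "Comment ça va?" with ç spelled out as its two UTF-8 bytes.
$input = "Comment \xC3\xA7a va?";

// str_split() and strlen() work on bytes, so ç counts twice.
$byteCount  = strlen($input);
$bytePieces = count(str_split($input));

// With the u modifier, the empty pattern matches between code points,
// so each element of the result is one whole character.
$chars     = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
$charCount = count($chars);

// Hex dump of the single character ç.
$bytesOfCedilla = bin2hex("\xC3\xA7");

printf("%d bytes, %d characters, ç = 0x%s\n", $byteCount, $charCount, $bytesOfCedilla);
```

The byte count is one higher than the character count because ç is the only two-byte character in the string, which is why str_split produces one extra, unprintable element.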
+0

Thank you for your answer. It works for special characters, but not for line breaks. Input: hé! oui? Array ([0] => h [1] => é [2] => ! [3] => [4] => [5] => o [6] => u [7] => i [8] => ?) – Zorkzyd 2012-04-29 14:51:33

+1

@Zorkzyd: It is actually working: positions 3 and 4 are \r and \n, respectively. Try `ord($letters[3])` and `ord($letters[4])`: you will get 13 and 10, which are the ASCII codes for `\r` and `\n`. – nico 2012-04-29 14:58:30
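
The check suggested in this comment can be sketched as follows. The input is assumed to be the "hé!" / "oui?" example with a Windows-style `\r\n` line break, and é is written as the byte escape `\xC3\xA9` so the snippet does not depend on the file encoding.

```php
<?php
// "hé!" + CRLF + "oui?", with é written as its two UTF-8 bytes.
$input   = "h\xC3\xA9!\r\noui?";
$letters = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);

// The two "empty-looking" elements are the carriage return and the line feed.
$cr = ord($letters[3]);
$lf = ord($letters[4]);

echo $cr, ' ', $lf, "\n";
```

The elements at positions 3 and 4 are not empty; they hold control characters that print_r cannot display visibly.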

+0

Thanks for the explanation. Is it possible to "merge" \r\n in the output array? – Zorkzyd 2012-04-29 15:03:43
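
The thread does not answer this follow-up. One possible approach, offered only as a sketch, is to match characters with preg_match_all instead of splitting, and to try `\r\n` before any single character so the pair stays together. The helper name `split_chars_keep_crlf` is hypothetical, and the scalar type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; not part of PHP or of the answers in this thread.
function split_chars_keep_crlf(string $input): array
{
    // Alternation is tried left to right: first "\r\n" as one unit,
    // otherwise any single code point.
    // u = treat the subject as UTF-8, s = let "." match newlines too.
    preg_match_all('/\r\n|./su', $input, $matches);

    // If the input is not valid UTF-8 the match fails; return an empty array.
    return $matches[0] ?? [];
}

$merged = split_chars_keep_crlf("h\xC3\xA9!\r\noui?");
print_r(array_map('bin2hex', $merged)); // hex dump makes the CRLF element visible
```

A lone `\r` or `\n` still becomes its own element; only the two-byte CRLF sequence is kept as a single entry.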

2

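This is because PHP's str_split function is not multibyte-safe, i.e. it cannot handle Unicode correctly. You can use the following function instead; it is a multibyte-safe version of str_split: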

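if (!function_exists('mb_str_split')) { // PHP 7.4+ already ships mb_str_split()
    function mb_str_split($string) {
        # Split at every position that is not the start of the string (^)
        # and not the very end (\z); unlike $, \z does not also match
        # before a trailing newline
        return preg_split('/(?<!^)(?!\z)/u', $string);
    }
}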

This multibyte-safe implementation comes from a user comment in the PHP documentation.
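
Since PHP 7.4 the mbstring extension provides a built-in `mb_str_split()`, which postdates this thread. A minimal sketch of a wrapper that prefers the built-in and falls back to the `preg_split` approach could look like this; the name `utf8_chars` is hypothetical and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical wrapper name, used only for this sketch.
function utf8_chars(string $s): array
{
    // Prefer the built-in when it is available (PHP 7.4+ with mbstring).
    if (function_exists('mb_str_split')) {
        return mb_str_split($s, 1, 'UTF-8');
    }

    // Fallback: split between code points; preg_split returns false
    // when the input is not valid UTF-8.
    $parts = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts;
}

$chars = utf8_chars("h\xC3\xA9!\r\n");
print_r(array_map('bin2hex', $chars));
```

Both branches treat `\r` and `\n` as separate characters, matching the behaviour described in the comments on the first answer.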

+0

Thanks Daan, but nico's answer seems easier :) – Zorkzyd 2012-04-29 15:05:55

+0

You're welcome! Good luck :) – Daan 2012-04-29 15:32:59