PHP: Unicode to UTF-8 code

I'm trying to get the UTF-8 bytes (in decimal) of a Unicode string. For example:

function unicode_to_utf8_bytes($string) { 

} 

$text = 'Hello 😀'; 
$result = unicode_to_utf8_bytes($text); 

var_dump($result); 

array(10) { 
    [0]=> 
    int(72) 
    [1]=> 
    int(101) 
    [2]=> 
    int(108) 
    [3]=> 
    int(108) 
    [4]=> 
    int(111) 
    [5]=> 
    int(32) 
    [6]=> 
    int(240) 
    [7]=> 
    int(159) 
    [8]=> 
    int(152) 
    [9]=> 
    int(128) 
}

An example of the result can be seen here:

http://apps.timwhitlock.info/unicode/inspect?s=Hello+%F0%9F%98%80

I think I'm close. This is what I've managed so far:

function utf8_char_code_at($str, $index) { 

    $char = mb_substr($str, $index, 1, 'UTF-8'); 

    if (mb_check_encoding($char, 'UTF-8')) { 
     $ret = mb_convert_encoding($char, 'UTF-32BE', 'UTF-8'); 
     return hexdec(bin2hex($ret)); 
    } 
    else 
     return null; 

} 

function unicode_to_utf8_bytes($str) { 

    $result = array(); 

    for ($i=0; $i<mb_strlen($str, '8bit'); $i++) 
     $result[] = utf8_char_code_at($str, $i); 

    return $result; 

} 

$string = 'Hello 😀'; 

var_dump(unicode_to_utf8_bytes($string)); 

array(10) { 
    [0]=> 
    int(72) 
    [1]=> 
    int(101) 
    [2]=> 
    int(108) 
    [3]=> 
    int(108) 
    [4]=> 
    int(111) 
    [5]=> 
    int(32) 
    [6]=> 
    int(128512) 
    [7]=> 
    int(0) 
    [8]=> 
    int(0) 
    [9]=> 
    int(0) 
}

Any help would be greatly appreciated!

Source

2016-01-03 Ana Aiza

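Sorry, but it's unclear what you're actually trying to do... UTF-8 is one possible representation of Unicode characters; others exist. So "converting from Unicode to UTF-8" doesn't really make sense. What do you mean when you say "unicode"? And what do you mean by "UTF-8 bytes"? – arkascha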
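To make the point about representations concrete, here is a minimal sketch showing the same character, U+1F600, encoded three different ways. It assumes PHP 7 or later (for the `\u{...}` escape and the type declarations) and the mbstring extension, which the question already relies on. `bytes_of` is a hypothetical helper name, not something from the thread.

```php
<?php
// Hypothetical helper: decimal values of every byte in a binary string.
function bytes_of(string $bin): array
{
    // unpack() returns a 1-indexed array; array_values() re-indexes from 0.
    return array_values(unpack('C*', $bin));
}

$char = "\u{1F600}"; // PHP 7+ escape; yields the UTF-8 bytes of U+1F600

foreach (['UTF-8', 'UTF-16BE', 'UTF-32BE'] as $encoding) {
    $encoded = mb_convert_encoding($char, $encoding, 'UTF-8');
    echo $encoding, ': ', implode(' ', bytes_of($encoded)), "\n";
}
// UTF-8:    240 159 152 128
// UTF-16BE: 216 61 222 0
// UTF-32BE: 0 1 246 0
```

The UTF-32BE bytes read as one big-endian integer give 128512, the code point, which is the number the attempt in the question produced instead of the four UTF-8 bytes.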

[This might help](http://stackoverflow.com/questions/1836152/using-php-to-convert-ascii-character-to-decimal-equivalent). Just call the function from the answer on every character in your string and it *should* work. – segFault

This gives you the result you're looking for:

$string = 'Hello 😀'; 
var_export(ascii_to_dec($string)); 

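function ascii_to_dec($str) 
{ 
    $dec_array = array(); // initialise so an empty string yields an empty array
    // strlen() and ord() both operate on bytes, so this walks the raw bytes
    for ($i = 0, $j = strlen($str); $i < $j; $i++) { 
        $dec_array[] = ord($str[$i]); // $str{$i} was removed in PHP 8
    } 
    return $dec_array; 
}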

Result:

array (
    0 => 72, 
    1 => 101, 
    2 => 108, 
    3 => 108, 
    4 => 111, 
    5 => 32, 
    6 => 240, 
    7 => 159, 
    8 => 152, 
    9 => 128, 
)
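As an alternative sketch, the same byte list can be obtained without an explicit loop by using `unpack()` with the `C*` format (unsigned char, repeated). This assumes PHP 7 or later for the `\u{...}` escape and the type declarations. The name `bytes_to_dec` is only illustrative.

```php
<?php
// Illustrative name; the function deals with raw bytes, not characters.
function bytes_to_dec(string $str): array
{
    // 'C*' unpacks every byte as an unsigned char. unpack() returns a
    // 1-indexed array, so array_values() re-indexes it from 0.
    return array_values(unpack('C*', $str));
}

var_export(bytes_to_dec("Hello \u{1F600}"));
```

For this input it should produce the same ten values as the loop version: 72, 101, 108, 108, 111, 32, 240, 159, 152, 128.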

Source

2016-01-03 19:08:42 segFault

Thank you very much! It works as expected! –

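You should add some explanation to this. I think that, assuming your source file is encoded as UTF-8, the string already contains a UTF-8 encoded string. In that case the function would be more accurately named 'bytes_to_dec'. – roeland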
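A short sketch of why that is, and of why the attempt in the question ended in zeros. It assumes PHP 7 or later and the mbstring extension. The string is built with the `\u{1F600}` escape so the example does not depend on how this file itself is saved.

```php
<?php
$string = "Hello \u{1F600}";

// The string is already a sequence of UTF-8 bytes: 7 characters, 10 bytes.
var_dump(mb_strlen($string, 'UTF-8')); // int(7)
var_dump(strlen($string));             // int(10)

// The question's loop counted to the byte length (10) but indexed by
// character. Past the last character, mb_substr() returns an empty string ...
var_dump(mb_substr($string, 7, 1, 'UTF-8')); // string(0) ""

// ... and an empty string converts to 0, hence the trailing int(0) entries.
var_dump(hexdec(bin2hex('')));               // int(0)
```

So the answer's function does no conversion at all. It reads bytes that are already UTF-8, which is why a byte-oriented name fits it better.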
