2010-07-09 137 views
2

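Char encoding: changing file from MacRoman to UTF-8 breaks string

I am saving a CakePHP site in MacRoman char encoding. I want to change all files to UTF-8 for internationalisation. For all other files in the site this works fine. However, in the core.php file there is a security salt, which is a string with special characters ("!:*"). When I save this file as UTF-8, the salt gets corrupted. I can get the file back from git, but it is an annoyance.

Does anyone know how I can convert the string from MacRoman to UTF-8?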

Answers

5

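You didn't provide enough information to confirm this, but my guess is that the salt is used in its binary form. In that case, changing the encoding of the file breaks the salt even when the characters are converted correctly, because the underlying byte stream changes.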
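To make the point concrete, here is a minimal sketch of how the same visible text becomes different bytes in the two encodings. The variable names are illustrative only, and the strings are written as explicit byte escapes so the result does not depend on how the example file itself is saved.

```php
<?php
// The same five visible characters (a, !, c, double dagger, OE ligature),
// written out as raw bytes in each encoding.
$macRomanBytes = "a!c\xE0\xCE";              // MacRoman: one byte per character
$utf8Bytes     = "a!c\xE2\x80\xA1\xC5\x92";  // UTF-8: 3 bytes + 2 bytes for the last two

echo bin2hex($macRomanBytes), "\n"; // 612163e0ce
echo bin2hex($utf8Bytes), "\n";     // 612163e280a1c592

// Anything derived from the salt bytes (a hash, for instance) no longer matches.
var_dump(md5($macRomanBytes) === md5($utf8Bytes)); // bool(false)
```

Only the last two characters differ; the ASCII part of the salt is byte-for-byte the same.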

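Since the first 128 characters (the ASCII range) are encoded identically in UTF-8 and Mac OS Roman, you don't have to worry about the salt if it is written using only those characters.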
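A quick way to find out whether a given salt falls into that safe case is to check for any byte above 0x7F. This is only a sketch; `is_ascii_only` is a hypothetical helper name, not part of PHP or CakePHP.

```php
<?php
// Hypothetical helper: true when every byte of $s is in the 7-bit ASCII range.
// The pattern is matched byte-wise (no /u modifier).
function is_ascii_only(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

var_dump(is_ascii_only("a!c:*"));        // bool(true)  -> re-saving as UTF-8 is harmless
var_dump(is_ascii_only("a!c\xE0\xCE"));  // bool(false) -> the bytes would change
```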

Let's say the salt is defined somewhere as:

$salt = "a!c‡Œ"; 

You could write this instead:

$salt = "a!c\xE0\xCE"; 

You can also map every character to its hexadecimal representation, which may be easier to automate:

$salt = "\x61\x21\x63\xE0\xCE"; 

See the table here.

The following snippet can automate this conversion:

$res = ""; 
foreach (str_split($salt) as $c) { 
    // Always emit two hex digits, so bytes below 0x10 cannot merge with a following hex digit
    $res .= sprintf("\\x%02X", ord($c)); 
} 
echo $res; 
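Before pasting a generated literal into core.php, it may be worth checking that it decodes back to the original bytes. The sketch below wraps the same idea in a hypothetical helper, `to_hex_escapes`, and uses PHP's `stripcslashes()` to decode the `\xHH` escapes for the comparison. It has to be run against the original MacRoman bytes, for example a script still saved in MacRoman.

```php
<?php
// Hypothetical helper: turn every byte of $bytes into a two-digit \xHH escape.
function to_hex_escapes(string $bytes): string
{
    if ($bytes === '') {
        return ''; // str_split('') yields [''] on PHP versions before 8.2
    }
    $out = '';
    foreach (str_split($bytes) as $byte) {
        $out .= sprintf('\\x%02X', ord($byte));
    }
    return $out;
}

$salt    = "a!c\xE0\xCE";          // example salt as MacRoman bytes
$literal = to_hex_escapes($salt);

echo $literal, "\n";                          // \x61\x21\x63\xE0\xCE
var_dump(stripcslashes($literal) === $salt);  // bool(true): the literal round-trips
```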
+0

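This helped me get to the end of a very long rabbit hole. Thank you so much. With your snippet I was able to actually see what was happening to my characters and get to the bottom of my encoding problem. – 2014-06-15 03:49:55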

0

Just curious: have you tried copying the salt, saving the file as UTF-8, then pasting the salt back in place and saving again?

+1

Yes, I should have mentioned that I tried that, but it didn't help. – igniteflow 2010-07-09 09:50:39

4

Thanks for the input, which pointed me in the right direction. The solution is:

$salt = iconv('UTF-8', 'macintosh', $string); 
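As an illustration of what that call does, the sketch below feeds it an example UTF-8 string and prints the resulting bytes. The sample value is made up, and the `'macintosh'` charset name is taken from the answer above; whether it is available, and the exact mapping it uses, depends on the iconv implementation PHP was built against, so treat the comparison as something to check on your own system.

```php
<?php
// Example salt as it would sit in a file saved as UTF-8:
// a, !, c, double dagger (E2 80 A1), OE ligature (C5 92).
$string = "a!c\xE2\x80\xA1\xC5\x92";

// Convert back to the MacRoman byte sequence the application originally used.
$salt = iconv('UTF-8', 'macintosh', $string);

if ($salt === false) {
    echo "iconv could not convert the string on this system\n";
} else {
    echo bin2hex($salt), "\n";
    // Compare against the bytes expected from the MacRoman table.
    var_dump($salt === "a!c\xE0\xCE");
}
```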
2

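For those who don't have access to iconv, here is a PHP function: http://sebastienguillon.com/test/jeux-de-caracteres/MacRoman_to_utf8.txt.php It correctly converts MacRoman text to UTF-8, and you can even decide how ligatures should be broken up.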

<?php 
function MacRoman_to_utf8($str, $break_ligatures='none') 
{ 
    // $break_ligatures : 'none' | 'fifl' | 'all' 
    // 'none' : don't break any MacRoman ligatures, transform them into their utf-8 counterparts 
    // 'fifl' : break only fi ("\xDE" => "fi") and fl ("\xDF"=>"fl") 
    // 'all' : break fi, fl and also AE ("\xAE"=>"AE"), ae ("\xBE"=>"ae"), OE ("\xCE"=>"OE") and oe ("\xCF"=>"oe") 

    if($break_ligatures == 'fifl') 
    { 
     $str = strtr($str, array("\xDE"=>"fi", "\xDF"=>"fl")); 
    } 

    if($break_ligatures == 'all') 
    { 
     $str = strtr($str, array("\xDE"=>"fi", "\xDF"=>"fl", "\xAE"=>"AE", "\xBE"=>"ae", "\xCE"=>"OE", "\xCF"=>"oe")); 
    } 

    $str = strtr($str, array("\x7F"=>"\x20", "\x80"=>"\xC3\x84", "\x81"=>"\xC3\x85", 
    "\x82"=>"\xC3\x87", "\x83"=>"\xC3\x89", "\x84"=>"\xC3\x91", "\x85"=>"\xC3\x96", 
    "\x86"=>"\xC3\x9C", "\x87"=>"\xC3\xA1", "\x88"=>"\xC3\xA0", "\x89"=>"\xC3\xA2", 
    "\x8A"=>"\xC3\xA4", "\x8B"=>"\xC3\xA3", "\x8C"=>"\xC3\xA5", "\x8D"=>"\xC3\xA7", 
    "\x8E"=>"\xC3\xA9", "\x8F"=>"\xC3\xA8", "\x90"=>"\xC3\xAA", "\x91"=>"\xC3\xAB", 
    "\x92"=>"\xC3\xAD", "\x93"=>"\xC3\xAC", "\x94"=>"\xC3\xAE", "\x95"=>"\xC3\xAF", 
    "\x96"=>"\xC3\xB1", "\x97"=>"\xC3\xB3", "\x98"=>"\xC3\xB2", "\x99"=>"\xC3\xB4", 
    "\x9A"=>"\xC3\xB6", "\x9B"=>"\xC3\xB5", "\x9C"=>"\xC3\xBA", "\x9D"=>"\xC3\xB9", 
    "\x9E"=>"\xC3\xBB", "\x9F"=>"\xC3\xBC", "\xA0"=>"\xE2\x80\xA0", "\xA1"=>"\xC2\xB0", 
    "\xA2"=>"\xC2\xA2", "\xA3"=>"\xC2\xA3", "\xA4"=>"\xC2\xA7", "\xA5"=>"\xE2\x80\xA2", 
    "\xA6"=>"\xC2\xB6", "\xA7"=>"\xC3\x9F", "\xA8"=>"\xC2\xAE", "\xA9"=>"\xC2\xA9", 
    "\xAA"=>"\xE2\x84\xA2", "\xAB"=>"\xC2\xB4", "\xAC"=>"\xC2\xA8", "\xAD"=>"\xE2\x89\xA0", 
    "\xAE"=>"\xC3\x86", "\xAF"=>"\xC3\x98", "\xB0"=>"\xE2\x88\x9E", "\xB1"=>"\xC2\xB1", 
    "\xB2"=>"\xE2\x89\xA4", "\xB3"=>"\xE2\x89\xA5", "\xB4"=>"\xC2\xA5", "\xB5"=>"\xC2\xB5", 
    "\xB6"=>"\xE2\x88\x82", "\xB7"=>"\xE2\x88\x91", "\xB8"=>"\xE2\x88\x8F", "\xB9"=>"\xCF\x80", 
    "\xBA"=>"\xE2\x88\xAB", "\xBB"=>"\xC2\xAA", "\xBC"=>"\xC2\xBA", "\xBD"=>"\xCE\xA9", 
    "\xBE"=>"\xC3\xA6", "\xBF"=>"\xC3\xB8", "\xC0"=>"\xC2\xBF", "\xC1"=>"\xC2\xA1", 
    "\xC2"=>"\xC2\xAC", "\xC3"=>"\xE2\x88\x9A", "\xC4"=>"\xC6\x92", "\xC5"=>"\xE2\x89\x88", 
    "\xC6"=>"\xE2\x88\x86", "\xC7"=>"\xC2\xAB", "\xC8"=>"\xC2\xBB", "\xC9"=>"\xE2\x80\xA6", 
    "\xCA"=>"\xC2\xA0", "\xCB"=>"\xC3\x80", "\xCC"=>"\xC3\x83", "\xCD"=>"\xC3\x95", 
    "\xCE"=>"\xC5\x92", "\xCF"=>"\xC5\x93", "\xD0"=>"\xE2\x80\x93", "\xD1"=>"\xE2\x80\x94", 
    "\xD2"=>"\xE2\x80\x9C", "\xD3"=>"\xE2\x80\x9D", "\xD4"=>"\xE2\x80\x98", "\xD5"=>"\xE2\x80\x99", 
    "\xD6"=>"\xC3\xB7", "\xD7"=>"\xE2\x97\x8A", "\xD8"=>"\xC3\xBF", "\xD9"=>"\xC5\xB8", 
    "\xDA"=>"\xE2\x81\x84", "\xDB"=>"\xE2\x82\xAC", "\xDC"=>"\xE2\x80\xB9", "\xDD"=>"\xE2\x80\xBA", 
    "\xDE"=>"\xEF\xAC\x81", "\xDF"=>"\xEF\xAC\x82", "\xE0"=>"\xE2\x80\xA1", "\xE1"=>"\xC2\xB7", 
    "\xE2"=>"\xE2\x80\x9A", "\xE3"=>"\xE2\x80\x9E", "\xE4"=>"\xE2\x80\xB0", "\xE5"=>"\xC3\x82", 
    "\xE6"=>"\xC3\x8A", "\xE7"=>"\xC3\x81", "\xE8"=>"\xC3\x8B", "\xE9"=>"\xC3\x88", 
    "\xEA"=>"\xC3\x8D", "\xEB"=>"\xC3\x8E", "\xEC"=>"\xC3\x8F", "\xED"=>"\xC3\x8C", 
    "\xEE"=>"\xC3\x93", "\xEF"=>"\xC3\x94", "\xF0"=>"\xEF\xA3\xBF", "\xF1"=>"\xC3\x92", 
    "\xF2"=>"\xC3\x9A", "\xF3"=>"\xC3\x9B", "\xF4"=>"\xC3\x99", "\xF5"=>"\xC4\xB1", 
    "\xF6"=>"\xCB\x86", "\xF7"=>"\xCB\x9C", "\xF8"=>"\xC2\xAF", "\xF9"=>"\xCB\x98", 
    "\xFA"=>"\xCB\x99", "\xFB"=>"\xCB\x9A", "\xFC"=>"\xC2\xB8", "\xFD"=>"\xCB\x9D", 
    "\xFE"=>"\xCB\x9B", "\xFF"=>"\xCB\x87", "\x00"=>"\x20", "\x01"=>"\x20", 
    "\x02"=>"\x20", "\x03"=>"\x20", "\x04"=>"\x20", "\x05"=>"\x20", 
    "\x06"=>"\x20", "\x07"=>"\x20", "\x08"=>"\x20", "\x0B"=>"\x20", 
    "\x0C"=>"\x20", "\x0E"=>"\x20", "\x0F"=>"\x20", "\x10"=>"\x20", 
    "\x11"=>"\x20", "\x12"=>"\x20", "\x13"=>"\x20", "\x14"=>"\x20", 
    "\x15"=>"\x20", "\x16"=>"\x20", "\x17"=>"\x20", "\x18"=>"\x20", 
    "\x19"=>"\x20", "\x1A"=>"\x20", "\x1B"=>"\x20", "\x1C"=>"\x20", 
    // Note: the trailing "\xF0"=>"" repeats an earlier key; PHP keeps the last value, so byte F0 (Apple logo) is dropped.
    "\x1D"=>"\x20", "\x1E"=>"\x20", "\x1F"=>"\x20", "\xF0"=>"")); 

    return $str; 
} 
?> 