A simple solution:
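The first string contains only octal-escaped ISO-8859-1, while the second is double-backslashed ISO-8859-1 mixed with UTF-16 escapes (why is this still an issue nowadays?). The code below takes the octal codes, converts them to hexadecimal, packs them into binary and encodes the result as UTF-8. The UTF-16 codes are already hexadecimal, so they only need to be packed and encoded as UTF-8.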
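To make those steps concrete, here is a small sketch that runs one escape of each kind through the pipeline and prints the resulting UTF-8 bytes as hex. It assumes the mbstring extension is loaded. The variable names are just for illustration, and `UTF-16BE` is spelled out on the assumption that the `\u` digits are written most significant byte first.

```php
<?php
// Octal escape \341: octal digits -> hex digits -> one raw ISO-8859-1 byte -> UTF-8.
$octal = '341';
$hex   = base_convert($octal, 8, 16);   // "e1"
$byte  = pack('H*', $hex);              // the single byte 0xE1
$utf8FromOctal = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// \u0161 escape: the four digits are already hex, so they are packed directly.
$unit = pack('H*', '0161');             // two bytes: 0x01 0x61
$utf8FromUnit = mb_convert_encoding($unit, 'UTF-8', 'UTF-16BE');

echo bin2hex($utf8FromOctal), "\n";     // c3a1 (á)
echo bin2hex($utf8FromUnit), "\n";      // c5a1 (š)
```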
For future reference on character sets, see: http://www.fileformat.info/info/charset/index.htm
<?php
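// In a double-quoted literal PHP itself turns \341 into the raw byte 0xE1,
// so $string already holds ISO-8859-1 bytes.
$string = "Tak hur\341 v posteli po pr\341ci a jde se sp\355nkat";
// Here the backslashes are escaped, so $string2 holds the literal text \355, \341 and \u0161.
$string2 = "Som nen\\355 ja len chodiaca kapuc\\341 pra\\u0161iva ignorujuca";
print decode_str($string2)."<br>";
print decode_str($string);

// Decode the octal escapes first, then the \uXXXX escapes.
function decode_str($string){
    return utf16_to_utf8(iso_to_utf8($string));
}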
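function iso_to_utf8($string){
    // Find every literal backslash followed by three octal digits (\000 to \377).
    preg_match_all('#\\\\[0-3][0-7]{2}#',$string,$matches);
    foreach($matches[0] as $match){
        $char = substr($match,1); // the octal digits without the backslash
        // Octal -> hex, padded to two digits so pack() always yields exactly one byte.
        $a = pack("H*" , str_pad(base_convert($char,8,16),2,"0",STR_PAD_LEFT));
        // str_replace avoids regex and replacement-string special characters.
        $string = str_replace($match,$a,$string);
    }
    // The raw bytes are ISO-8859-1; re-encode the whole string as UTF-8.
    return mb_convert_encoding($string,"UTF-8","ISO-8859-1");
}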
function utf16_to_utf8($string){
    // Find every literal \u followed by four hexadecimal digits, e.g. \u0161.
    preg_match_all('#\\\\u[a-fA-F0-9]{4}#',$string,$matches);
    foreach($matches[0] as $match){
        $char = substr($match,2); // the four hex digits without \u
        // Pack the hex digits into two bytes (one big-endian UTF-16 code unit).
        $a = pack("H*" , $char);
        $a = mb_convert_encoding($a,"UTF-8","UTF-16BE");
        $string = str_replace($match,$a,$string);
    }
    return $string;
}
?>
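One small change (as the author of the question noted): the preg_match_all pattern in the utf16_to_utf8 function should match u[a-f0-9], because the UTF escapes use hexadecimal digits. – vinascz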
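Following up on that comment, here is a possible single-function variant using preg_replace_callback with a hex-only pattern. Treat it as a sketch rather than a drop-in replacement: `decode_escapes` is a name invented here, it assumes mbstring is available, and it assumes the input is plain ASCII or ISO-8859-1 text (input that is already UTF-8 would get double-encoded). Replacing in one pass means a byte produced by one escape cannot be re-matched as part of another, and converting consecutive `\u` escapes together lets surrogate pairs survive.

```php
<?php
// Hypothetical helper, not part of the original answer.
function decode_escapes(string $s): string
{
    // \NNN (octal 000-377) -> one raw ISO-8859-1 byte.
    $s = preg_replace_callback(
        '/\\\\([0-3][0-7]{2})/',
        function (array $m): string { return chr(octdec($m[1])); },
        $s
    );
    // The string now holds ISO-8859-1 bytes; \uXXXX escapes are ASCII and pass through unchanged.
    $s = mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
    // A run of \uXXXX escapes -> big-endian UTF-16 bytes -> UTF-8.
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            $hex = str_replace('\\u', '', $m[0]); // keep only the hex digits
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $s
    );
}

echo decode_escapes('Som nen\\355 ja len chodiaca kapuc\\341 pra\\u0161iva ignorujuca'), "\n";
```

The string passed in is single-quoted, so it contains the literal backslash sequences, the same as `$string2` in the answer.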