2012-03-14
2

The PHP strtr function does not work at all, even when I type

echo strtr("-äåö-", "äåö", "xxx"); 

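That does not work properly: it outputs >xxx¥x¶<. When I use it as in the example below, nothing is translated at all and the original text is left as it was. If I type ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑŐŰÜÒÓÔÕÖØÝߟàáâãäåçèéêëìíîïñòóôõőöøšűùúûüýÿž into the form and click translate, I get the same string back, and æ and œ are not translated either.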

<form method="POST"> 
    <input style="width:500px;" type="text" name="first_name" /> 
    <input style="width:500px;" type="text" name="last_name" /> 
    <input type="submit" name="submit" value="translate" /> 
</form> 


<?php 

    $dict = array(
        "Æ" => "AE", 
        "æ" => "ae", 
        "Œ" => "OE", 
        "œ" => "oe" 
       ); 

    $first = strtr($_POST['first_name'], $dict);   
    $last = strtr($_POST['last_name'], $dict);  


    $first = strtr($first, 
          "ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑŐŰÜÒÓÔÕÖØÝߟàáâãäåçèéêëìíîïñòóôõőöøšűùúûüýÿž", 
          "AAAAAACEEEEIIIINOUUOOOOOOYSYaaaaaaceeeeiiiinooooooosuuuuuyyz"); 

    $last = strtr($last, 
          "ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑŐŰÜÒÓÔÕÖØÝߟàáâãäåçèéêëìíîïñòóôõőöøšűùúûüýÿž", 
          "AAAAAACEEEEIIIINOUUOOOOOOYSYaaaaaaceeeeiiiinooooooosuuuuuyyz"); 

    echo $first." --- "; 
    echo $last; 
?> 

Even if I add this at the top of the code

foreach ($_POST as $key => $value) { 
    $POST[$key] = iconv(mb_detect_encoding($_POST["first_name"]), "ASCII//TRANSLIT", $POST[$value]); 
} 

and paste AAAAAACEEEEIIIINOUUOOOOOOYSYaaaaaaceeeeiiiinooooooosuuuuuyyz, it comes out like this: yAyAyAyEyEyIyIyNyUyOyOyOyYyYyayauaueyeyiyiynyoyoyoysyuuuyyyzy�y�y�y�y�y�y�y�y�y�y�y�y�y�y�y�y�y�uay�yuuzu�y�y�y�y�y�y�u�

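Never mind. Since nobody knows why it doesn't work, I just used str_replace and str_ireplace instead, very successfully, without having to worry about encoding at all.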

Edit: my bad, the encoding affects str_replace too. On the HTML page I use

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+3

Do you mind if I ask why you feel it is necessary to replace valid UTF-8 characters with ASCII? – Borealid 2012-03-14 23:14:54

+0

I don't know much about character encodings. So if what I've got isn't UTF-8, what encoding would it be? – JohnA 2012-03-15 00:27:58

Answers

1

Have you tried mb_strstr: http://php.net/manual/en/function.mb-strstr.php

That function supports multibyte character encodings.

+0

He's talking about 'strtr()', not 'strstr()'. – rid 2012-03-14 23:21:21

+0

My goal is to use "strtr". I don't know what single-byte or multibyte character encodings are. How do I change it? – JohnA 2012-03-14 23:25:17

1

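It sounds like you may have competing encodings. If your browser is submitting UTF8 but your file is saved in (say) 8859-1, then your characters won't match and the translation will fail. Also, take a look at the doc page: several comments there suggest using utf8_decode() on the input string first. It's possible that utf8_decode() by itself will do what you want.

UTF8 is a multi-byte encoding (actually, a variable-byte encoding). Characters such as ÷ï have Unicode code points above 127, so each one has to be encoded as two or more bytes that together identify the character, and every one of those bytes has a value of 128 or higher. I suspect you're going to have to learn a bit more about Unicode. There is another explanation at utf8_encode.

Edit: it's been a while since I wrestled with encodings. You should look at iconv() for more general re-encoding.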
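
To make the byte-level difference concrete, here is a minimal sketch (the variable names are my own, and it assumes the iconv extension is available) that stores the same three characters in ISO-8859-1 and in UTF-8, compares their byte lengths, and re-encodes one into the other:

```php
<?php
// "äåö" written as explicit bytes, so the result does not depend on
// the encoding this source file happens to be saved in.
$latin1 = "\xe4\xe5\xf6";             // ISO-8859-1: one byte per character
$utf8   = "\xc3\xa4\xc3\xa5\xc3\xb6"; // UTF-8: two bytes per character

echo strlen($latin1), "\n"; // strlen() counts bytes: 3
echo strlen($utf8), "\n";   // strlen() counts bytes: 6

// Re-encode the ISO-8859-1 bytes as UTF-8 and compare byte for byte.
$converted = iconv("ISO-8859-1", "UTF-8", $latin1);
var_dump($converted === $utf8); // bool(true)
```

If the form data arrives as UTF-8 but the script's string literals are saved as ISO-8859-1 (or the other way round), the byte sequences being compared differ like `$latin1` and `$utf8` do here, so nothing matches.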

3

strtr with the function prototype

string strtr (string $str , string $from , string $to) 

only works reliably with single-byte encodings (e.g. ISO-8859-1).

header("Content-Type: text/plain; charset=ISO-8859-1"); 
$str = "\x2d\xe4\xe5\xf6\x2d"; // ISO-8859-1: -äåö- 
$from = "\xe4\xe5\xf6";  // ISO-8859-1: äåö 
$to = "\x78\x78\x78";   // ISO-8859-1: xxx 
dump($str, "ISO-8859-1"); // length in octets: 5 
dump($from, "ISO-8859-1"); // length in octets: 3 
dump($to, "ISO-8859-1"); // length in octets: 3 

print strtr($str, $from, $to); // -xxx- 

Output:

-: 2d 
ä: e4 
å: e5 
ö: f6 
-: 2d 
length (encoding: ISO-8859-1): 5 
length in octets (8-bit-byte): 5 

ä: e4 
å: e5 
ö: f6 
length (encoding: ISO-8859-1): 3 
length in octets (8-bit-byte): 3 

x: 78 
x: 78 
x: 78 
length (encoding: ISO-8859-1): 3 
length in octets (8-bit-byte): 3 

-xxx- 

If you use multibyte characters, for example from UTF-8, you may end up with a mangled string:

header("Content-Type: text/plain; charset=UTF-8"); 
$str = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8: -äåö- 
$from = "\xc3\xa4\xc3\xa5\xc3\xb6";  // UTF-8: äåö 
$to = "\x78\x78\x78";      // UTF-8: xxx 
dump($str, "UTF-8"); // length in octets: 8 
dump($from, "UTF-8"); // length in octets: 6 
dump($to, "UTF-8"); // length in octets: 3 

// > If from and to have different lengths, the extra characters in the longer 
// > of the two are ignored. The length of str will be the same as the return 
// > value's. 
// http://de.php.net/manual/en/function.strtr.php 

// This means that the $from-string gets cropped to "\xc3\xa4\xc3" (16 bit of 
// the first char [ä] and the first 8 bit of the second char [å]): 
strtr($str, $from, $to) === strtr($str, "\xc3\xa4\xc3", $to); // true 
print strtr($str, $from, $to); // -xxx�x�- 

Output:

-: 2d 
ä: c3a4 
å: c3a5 
ö: c3b6 
-: 2d 
length (encoding: UTF-8): 5 
length in octets (8-bit-byte): 8 

ä: c3a4 
å: c3a5 
ö: c3b6 
length (encoding: UTF-8): 3 
length in octets (8-bit-byte): 6 

x: 78 
x: 78 
x: 78 
length (encoding: UTF-8): 3 
length in octets (8-bit-byte): 3 

-xxx�x�- 
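
The two � marks are single leftover bytes that no longer form valid UTF-8. A small sketch (variable names are mine) makes them visible with bin2hex() and, assuming the iconv extension is available, shows why a page displayed as ISO-8859-1 renders the same bytes as the >xxx¥x¶< reported in the question:

```php
<?php
$str    = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8: -äåö-
$result = strtr($str, "\xc3\xa4\xc3\xa5\xc3\xb6", "xxx");

// strtr() maps byte to byte: c3 and a4 become "x", while a5 and b6 are left alone.
echo bin2hex($result), "\n"; // 2d787878a578b62d

// Read as ISO-8859-1, the stray bytes a5 and b6 are the characters ¥ and ¶.
$strays = iconv("ISO-8859-1", "UTF-8", "\xa5\xb6");
echo bin2hex($strays), "\n"; // c2a5c2b6, i.e. "¥¶" in UTF-8
```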

For multibyte encodings such as UTF-8, you have to use the second function prototype:

string strtr (string $str , array $replace_pairs) 

header("Content-Type: text/plain"); 
$str = "-äåö-"; // UTF-8 \x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d 
$replace_pairs = array(
    "ä" /* UTF-8 \xc3\xa4 */ => "x", 
    "å" /* UTF-8 \xc3\xa5 */ => "x", 
    "ö" /* UTF-8 \xc3\xb6 */ => "x" 
); 
print strtr($str, $replace_pairs); // -xxx- 
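
For long character lists like the ones in the question, typing every pair by hand is tedious. As a rough sketch, the array can be built from the two strings instead. `utf8_pairs` is a hypothetical helper name of my own, not a PHP built-in, and the code assumes PHP 7 or later and that both strings are valid UTF-8:

```php
<?php
// Hypothetical helper: split two UTF-8 strings into characters and pair them up.
function utf8_pairs(string $from, string $to): array {
    // The "u" modifier makes preg_split() treat the subject as UTF-8,
    // so the empty pattern splits between characters rather than bytes.
    $fromChars = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
    $toChars   = preg_split('//u', $to, -1, PREG_SPLIT_NO_EMPTY);
    if ($fromChars === false || $toChars === false) {
        throw new InvalidArgumentException("input is not valid UTF-8");
    }
    if (count($fromChars) !== count($toChars)) {
        throw new InvalidArgumentException("from and to must have the same number of characters");
    }
    return array_combine($fromChars, $toChars);
}

// "äåö" => "aao", plus one multi-letter replacement for "æ".
$pairs = utf8_pairs("\xc3\xa4\xc3\xa5\xc3\xb6", "aao") + array("\xc3\xa6" => "ae");
echo strtr("-\xc3\xa4\xc3\xa5\xc3\xb6\xc3\xa6-", $pairs), "\n"; // -aaoae-
```

Throwing on a length mismatch is a deliberate choice here: the three-argument form silently ignores the extra characters, which is exactly what hid the problem in the first place.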

If the encodings do not match, you have to convert them with iconv:

header("Content-Type: text/plain"); 
$str = "\x2d\xe4\xe5\xf6\x2d"; // ISO-8859-1 -äåö- 
$str = iconv("ISO-8859-1", "UTF-8", $str); 
$replace_pairs = array(
    "ä" /* UTF-8 \xc3\xa4 */ => "x", 
    "å" /* UTF-8 \xc3\xa5 */ => "x", 
    "ö" /* UTF-8 \xc3\xb6 */ => "x" 
); 
print strtr($str, $replace_pairs); // -xxx- 

The dump function:

// outputs the hexvalue for each char for the given encoding 
function dump($data, $encoding) { 
    for($i = 0, $len = iconv_strlen($data, $encoding); $i < $len; ++$i) { 
     $char = iconv_substr($data, $i, 1, $encoding); 
     printf("%s: %s\n", $char, bin2hex($char)); 
    } 
    printf("length (encoding: %s): %d\n", $encoding, $len); 
    printf("length in octets (8-bit-byte): %d\n\n", strlen($data)); 
} 
+0

Is the second function prototype of 'strtr' an alias of 'str_replace()'? – 2014-10-17 11:11:25
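
On the question in the comment above: as far as I know it is not an alias. With an array, strtr() makes a single pass and never searches text it has already replaced, whereas str_replace() applies the pairs one after another, so a later pair can rewrite the output of an earlier one. A minimal sketch with a made-up pair array that swaps two letters shows the difference:

```php
<?php
// Made-up example pairs: swap "a" and "b".
$pairs = array("a" => "b", "b" => "a");

// strtr(): each position is replaced at most once, so the swap works.
$viaStrtr = strtr("ab", $pairs);
echo $viaStrtr, "\n"; // ba

// str_replace(): "a" => "b" turns "ab" into "bb", then "b" => "a" turns that into "aa".
$viaStrReplace = str_replace(array_keys($pairs), array_values($pairs), "ab");
echo $viaStrReplace, "\n"; // aa
```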