2011-06-27 108 views
Possible duplicate:
PHP: Replace umlauts with closest 7-bit ASCII equivalent in an UTF-8 string

Convert exotic characters to A-Z, a-z or 0-9

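I will be processing data from external sources, which may return values such as:

Tōkyō à á â ã

Is there a way to convert these fancy characters to standard a-z and A-Z?

Tokyo a a a a

If there are other characters that do not match any letter, they can be ignored.

Is one big regular expression containing all the from/to values the only way, or is there a simpler way to do this?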

@Pekka thanks, I couldn't find that question. Voted. – DMin

Answers

Something like this (taken from the Symphony CMS project) should get you started:

$transliterations = array(

    // Alphabetical 

    '/À/' => 'A',  '/Á/' => 'A',  '/Â/' => 'A',  '/Ã/' => 'A',  '/Ä/' => 'Ae', 
    '/Å/' => 'A',  '/Ā/' => 'A',  '/Ą/' => 'A',  '/Ă/' => 'A',  '/Æ/' => 'Ae', 
    '/Ç/' => 'C',  '/Ć/' => 'C',  '/Č/' => 'C',  '/Ĉ/' => 'C',  '/Ċ/' => 'C', 
    '/Ď/' => 'D',  '/Đ/' => 'D',  '/Ð/' => 'D',  '/È/' => 'E',  '/É/' => 'E', 
    '/Ê/' => 'E',  '/Ë/' => 'E',  '/Ē/' => 'E',  '/Ę/' => 'E',  '/Ě/' => 'E', 
    '/Ĕ/' => 'E',  '/Ė/' => 'E',  '/Ĝ/' => 'G',  '/Ğ/' => 'G',  '/Ġ/' => 'G', 
    '/Ģ/' => 'G',  '/Ĥ/' => 'H',  '/Ħ/' => 'H',  '/Ì/' => 'I',  '/Í/' => 'I', 
    '/Î/' => 'I',  '/Ï/' => 'I',  '/Ī/' => 'I',  '/Ĩ/' => 'I',  '/Ĭ/' => 'I', 
    '/Į/' => 'I',  '/İ/' => 'I',  '/Ĳ/' => 'Ij',  '/Ĵ/' => 'J',  '/Ķ/' => 'K', 
    '/Ł/' => 'L',  '/Ľ/' => 'L',  '/Ĺ/' => 'L',  '/Ļ/' => 'L',  '/Ŀ/' => 'L', 
    '/Ñ/' => 'N',  '/Ń/' => 'N',  '/Ň/' => 'N',  '/Ņ/' => 'N',  '/Ŋ/' => 'N', 
    '/Ò/' => 'O',  '/Ó/' => 'O',  '/Ô/' => 'O',  '/Õ/' => 'O',  '/Ö/' => 'Oe', 
    '/Ø/' => 'O',  '/Ō/' => 'O',  '/Ő/' => 'O',  '/Ŏ/' => 'O',  '/Œ/' => 'Oe', 
    '/Ŕ/' => 'R',  '/Ř/' => 'R',  '/Ŗ/' => 'R',  '/Ś/' => 'S',  '/Š/' => 'S', 
    '/Ş/' => 'S',  '/Ŝ/' => 'S',  '/Ș/' => 'S',  '/Ť/' => 'T',  '/Ţ/' => 'T', 
    '/Ŧ/' => 'T',  '/Ț/' => 'T',  '/Ù/' => 'U',  '/Ú/' => 'U',  '/Û/' => 'U', 
    '/Ü/' => 'Ue',  '/Ū/' => 'U',  '/Ů/' => 'U',  '/Ű/' => 'U',  '/Ŭ/' => 'U', 
    '/Ũ/' => 'U',  '/Ų/' => 'U',  '/Ŵ/' => 'W',  '/Ý/' => 'Y',  '/Ŷ/' => 'Y', 
    '/Ÿ/' => 'Y',  '/Y/' => 'Y',  '/Ź/' => 'Z',  '/Ž/' => 'Z',  '/Ż/' => 'Z', 
    '/Þ/' => 'T', 
    '/à/' => 'a',  '/á/' => 'a',  '/â/' => 'a',  '/ã/' => 'a',  '/ä/' => 'ae', 
    '/å/' => 'a',  '/ā/' => 'a',  '/ą/' => 'a',  '/ă/' => 'a',  '/æ/' => 'ae', 
    '/ç/' => 'c',  '/ć/' => 'c',  '/č/' => 'c',  '/ĉ/' => 'c',  '/ċ/' => 'c', 
    '/ď/' => 'd',  '/đ/' => 'd',  '/ð/' => 'd',  '/è/' => 'e',  '/é/' => 'e', 
    '/ê/' => 'e',  '/ë/' => 'e',  '/ē/' => 'e',  '/ę/' => 'e',  '/ě/' => 'e', 
    '/ĕ/' => 'e',  '/ė/' => 'e',  '/ĝ/' => 'g',  '/ğ/' => 'g',  '/ġ/' => 'g', 
    '/ģ/' => 'g',  '/ĥ/' => 'h',  '/ħ/' => 'h',  '/ì/' => 'i',  '/í/' => 'i', 
    '/î/' => 'i',  '/ï/' => 'i',  '/ī/' => 'i',  '/ĩ/' => 'i',  '/ĭ/' => 'i', 
    '/į/' => 'i',  '/ı/' => 'i',  '/ĳ/' => 'ij',  '/ĵ/' => 'j',  '/ķ/' => 'k', 
    '/ł/' => 'l',  '/ľ/' => 'l',  '/ĺ/' => 'l',  '/ļ/' => 'l',  '/ŀ/' => 'l', 
    '/ñ/' => 'n',  '/ń/' => 'n',  '/ň/' => 'n',  '/ņ/' => 'n',  '/ŋ/' => 'n', 
    '/ò/' => 'o',  '/ó/' => 'o',  '/ô/' => 'o',  '/õ/' => 'o',  '/ö/' => 'oe', 
    '/ø/' => 'o',  '/ō/' => 'o',  '/ő/' => 'o',  '/ŏ/' => 'o',  '/œ/' => 'oe', 
    '/ŕ/' => 'r',  '/ř/' => 'r',  '/ŗ/' => 'r',  '/ś/' => 's',  '/š/' => 's', 
    '/ş/' => 's',  '/ŝ/' => 's',  '/ș/' => 's',  '/ť/' => 't',  '/ţ/' => 't', 
    '/ŧ/' => 't',  '/ț/' => 't',  '/ù/' => 'u',  '/ú/' => 'u',  '/û/' => 'u', 
    '/ü/' => 'ue',  '/ū/' => 'u',  '/ů/' => 'u',  '/ű/' => 'u',  '/ŭ/' => 'u', 
    '/ũ/' => 'u',  '/ų/' => 'u',  '/ŵ/' => 'w',  '/ý/' => 'y',  '/ŷ/' => 'y', 
    '/ÿ/' => 'y',  '/y/' => 'y',  '/ź/' => 'z',  '/ž/' => 'z',  '/ż/' => 'z', 
    '/þ/' => 't',  '/ß/' => 'ss',  '/ſ/' => 'ss',  '/ƒ/' => 'f',  '/ĸ/' => 'k', 
    '/ŉ/' => 'n', 

    // Symbolic 

    '/\(/' => null,  '/\)/' => null,  '/,/' => null, 
    '/–/' => '-',  '/-/' => '-',  '/„/' => '"', 
    '/「/' => '"',  '/」/' => '"',  '/—/' => '-', 
    '/¿/' => null,  '/‽/' => null,  '/¡/' => null, 

    // Ampersands 

    '/©/' => 'c', 
    '/^&(?!&)$/' => 'and', 
    '/^&(?!&)/' => 'and-', 
    '/&(?!&)&/' => '-and', 
    '/&(?!&)/' => '-and-', 

); 
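
To show how a table like this might be applied, here is a minimal sketch. The function name `transliterate` is hypothetical (it is not part of Symphony), and the array is only a short excerpt of the table above so that the example runs on its own. It assumes both the source file and the input string are UTF-8 encoded.

```php
<?php
// Short excerpt of the table above, only to keep this sketch self-contained.
$transliterations = array(
    '/Ō/' => 'O',  '/ō/' => 'o',
    '/à/' => 'a',  '/á/' => 'a',  '/â/' => 'a',  '/ã/' => 'a',
    '/ü/' => 'ue',
);

// Hypothetical helper: applies every pattern in the table to the input.
function transliterate($text, array $table)
{
    // preg_replace() accepts parallel arrays of patterns and replacements
    // and applies them one after another.
    return preg_replace(array_keys($table), array_values($table), $text);
}

echo transliterate('Tōkyō à á â ã', $transliterations), "\n"; // Tokyo a a a a
```

Because every key is a single literal character, `strtr()` with plain (undelimited) keys would probably do the same job without the regular-expression engine. The regex form is only really needed for the lookahead-based ampersand rules.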

You could also use iconv, but it isn't flawless: Ü, for example, comes back as "U, whereas it should come back as Ue.
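
For comparison, an iconv-based attempt might look like the sketch below. The function name `to_ascii_iconv` is hypothetical, and the transliteration that `//TRANSLIT` produces depends on the platform's iconv implementation and the current locale, so treat the result as a best-effort approximation. Anything that is still not a plain letter, digit, or space afterwards is dropped, since the question allows unmatched characters to be ignored.

```php
<?php
// Hypothetical helper: best-effort ASCII conversion using iconv.
function to_ascii_iconv($text)
{
    // //TRANSLIT asks iconv to approximate characters it cannot represent;
    // //IGNORE asks it to skip those it cannot approximate at all.
    // The @ suppresses the notice some implementations raise on such input.
    $converted = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);

    if ($converted === false) {
        // iconv gave up entirely; fall back to the raw input.
        $converted = $text;
    }

    // Remove leftovers such as the quote in "U, question marks, or stray bytes.
    return (string) preg_replace('/[^A-Za-z0-9 ]/', '', $converted);
}

echo to_ascii_iconv('Tōkyō à á â ã'), "\n"; // exact result varies by platform
```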

Although, the 'U' to 'Ue' conversion is the *only* reason to use iconv. –

Indeed, @DMin should decide whether it is worth the overhead. – Bjorn

If anyone wants to see the original class, it is here (it doesn't have the escaping this comment has): https://github.com/symphonycms/symphony-2/blob/master/symphony/lib/lang/transliterations.php – robjbrain
