2013-04-29 33 views
6

I need to do something fairly simple, but without hard-coding a hash map: transliterate Cyrillic text to Latin in Java with ICU4J.

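I have a String s that is in Cyrillic, and I need some sort of example of how to turn it into Latin characters using a custom filter of some kind. To give a purely Latin example, so as not to confuse anyone: if String s = "sniff", I want it to look up the individual characters and change them into something else (combinations of characters may need to be handled as well).

I can see that ICU4J can do this sort of thing, but I have no idea how to implement it, because I can't find any working example (or I'm just stupid).

Any help is appreciated.

Thanks.

Best regards,

P.S. I need batch transliteration. I don't care about styling or on-the-fly transliteration; I just want a basic example of what an ICU4J batch transliterator looks like.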

OK, I actually figured it out.

import com.ibm.icu.text.Transliterator;

public class BulgarianToLatin {

    // ID of a built-in ICU transliterator: Bulgarian Cyrillic to Latin, BGN variant.
    public static final String BULGARIAN_TO_LATIN = "Bulgarian-Latin/BGN";

    public static void main(String[] args) {
        String bgString = "Джокович";

        // Look up the built-in transliterator by ID and apply it to the whole string.
        Transliterator bulgarianToLatin = Transliterator.getInstance(BULGARIAN_TO_LATIN);
        String result1 = bulgarianToLatin.transliterate(bgString);
        System.out.println("Bulgarian to Latin:" + result1);
    }
}

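One last edit, for rule-based transliteration (in case you don't want to use the existing transliterator above, or just want some customization):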

import com.ibm.icu.text.Transliterator;

public class BulgarianToLatin {

    public static void main(String[] args) {
        String bgString = "а б в г д е ж з и й к л м н о п р с т у ф х ц ч ш щ ю я \n Юлиян Джокович";

        // Only a subset of rules is shown; characters that match no rule are left unchanged.
        // The first line is a global filter: only characters in this set are processed.
        String rules = "::[А-ЪЬЮ-ъьюяѢѣѪѫ];" +
                "Б > B;" +
                "б > b;" +
                "В > V;" +
                "ТС > TS;" +
                "Тс > Ts;" +
                "ч > ch;" +
                "ШТ > SHT;" +
                "Шт > Sht;" +
                "шт > sht;" +
                // Context rule: Ш followed by a lowercase letter becomes title-case "Sh".
                "{Ш}[[б-джзй-нп-тф-щь][аеиоуъюяѣѫ]] > Sh;" +
                "Я > YA;" +
                "я > ya;";

        // Build a transliterator from the custom rules; "temp" is just its ID.
        Transliterator bulgarianToLatin =
                Transliterator.createFromRules("temp", rules, Transliterator.FORWARD);

        String result1 = bulgarianToLatin.transliterate(bgString);
        System.out.println("Bulgarian to Latin:" + result1);
    }
}

Answers

4

I've written a method that transliterates Cyrillic to Latin; maybe it will be useful to somebody.

public static String transliterate(String message){ 
    char[] abcCyr = {' ','а','б','в','г','д','е','ё', 'ж','з','и','й','к','л','м','н','о','п','р','с','т','у','ф','х', 'ц','ч', 'ш','щ','ъ','ы','ь','э', 'ю','я','А','Б','В','Г','Д','Е','Ё', 'Ж','З','И','Й','К','Л','М','Н','О','П','Р','С','Т','У','Ф','Х', 'Ц', 'Ч','Ш', 'Щ','Ъ','Ы','Ь','Э','Ю','Я','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'}; 
    String[] abcLat = {" ","a","b","v","g","d","e","e","zh","z","i","y","k","l","m","n","o","p","r","s","t","u","f","h","ts","ch","sh","sch", "","i", "","e","ju","ja","A","B","V","G","D","E","E","Zh","Z","I","Y","K","L","M","N","O","P","R","S","T","U","F","H","Ts","Ch","Sh","Sch", "","I", "","E","Ju","Ja","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"}; 
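    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < message.length(); i++) {
        // Linear search of the table for the current character.
        // Characters not listed in abcCyr (digits, punctuation) are silently dropped.
        for (int x = 0; x < abcCyr.length; x++) {
            if (message.charAt(i) == abcCyr[x]) {
                builder.append(abcLat[x]);
            }
        }
    }
    return builder.toString();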
} 
+0

Useful for simple applications. Thanks! – 2017-02-24 21:18:15

+1

You have a typo in the 'abcCyr' array: you wrote 'Ü' instead of 'Б'. – 2017-07-26 02:54:27

+0

Thanks, edited! – lxknvlk 2017-08-02 09:48:30
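
For comparison, the table-based method in the answer above can be tightened up a little. The sketch below is neither ICU4J nor the answer author's code. It is a minimal plain-Java variant that assumes the same letter mapping as the answer, swaps the nested loop for a map lookup, and passes unmapped characters (digits, punctuation, Latin letters) through unchanged instead of dropping them. The class name SimpleCyrillicToLatin and the helper capitalize are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleCyrillicToLatin {

    // Lowercase Cyrillic letters and their Latin equivalents, in the same order.
    // The mapping is assumed to match the table in the answer above.
    private static final String CYRILLIC = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя";
    private static final String[] LATIN = {
        "a", "b", "v", "g", "d", "e", "e", "zh", "z", "i", "y", "k", "l", "m", "n", "o", "p",
        "r", "s", "t", "u", "f", "h", "ts", "ch", "sh", "sch", "", "i", "", "e", "ju", "ja"
    };

    private static final Map<Character, String> TABLE = new HashMap<>();

    static {
        for (int i = 0; i < CYRILLIC.length(); i++) {
            char lower = CYRILLIC.charAt(i);
            TABLE.put(lower, LATIN[i]);
            // Derive the uppercase entry, e.g. 'Ж' -> "Zh".
            TABLE.put(Character.toUpperCase(lower), capitalize(LATIN[i]));
        }
    }

    // Uppercases the first character of s; empty strings are returned as-is.
    private static String capitalize(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static String transliterate(String message) {
        StringBuilder out = new StringBuilder(message.length());
        for (int i = 0; i < message.length(); i++) {
            char c = message.charAt(i);
            String mapped = TABLE.get(c);
            if (mapped != null) {
                out.append(mapped);
            } else {
                out.append(c); // not in the table: keep the character as it is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(transliterate("Джокович")); // prints "Dzhokovich"
    }
}
```

This is still a hard-coded table, so it only suits simple cases. Context-dependent rules such as the Ш rule in the question are easier to express with ICU4J's rule-based transliterator.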
