I have been using Normalizer to convert a Unicode string to ASCII in Java, and this works fine under UNIX/Linux:
String s = "口水雞 hello Ä";
String s1 = Normalizer.normalize(s, Normalizer.Form.NFKD);
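// Combining marks, modifier letters (Lm) and modifier symbols (Sk); passed unquoted so that replaceAll treats it as a regex
String regex = "[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+";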
String s2 = new String(s1.replaceAll(regex, "").getBytes("ascii"), "ascii");
System.out.println(s2);
System.out.println(s.length() == s2.length());
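For reference, here is a minimal self-contained sketch of the same normalize-then-strip approach. The class name `AsciiFold` and the method `toAscii` are hypothetical names chosen for illustration. It assumes the character class is meant to be applied as a regex, so it is compiled directly (wrapping it in `Pattern.quote` would make it a literal string). The string literal is written with `\u` escapes so that the sketch does not depend on the encoding of the source file, and `StandardCharsets.US_ASCII` is used instead of the `"ascii"` charset name to avoid the checked `UnsupportedEncodingException`.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.regex.Pattern;

public class AsciiFold {
    // Combining diacritical marks, modifier letters (Lm) and modifier symbols (Sk)
    // that remain after NFKD decomposition.
    private static final Pattern MARKS =
            Pattern.compile("[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

    // Hypothetical helper: decompose, strip marks, then force the result into ASCII.
    static String toAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFKD);
        String stripped = MARKS.matcher(decomposed).replaceAll("");
        // Characters with no ASCII mapping are replaced by '?' when encoding.
        byte[] ascii = stripped.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Same text as in the question, written with escapes: 口水雞 hello Ä
        String s = "\u53E3\u6C34\u96DE hello \u00C4";
        String s2 = toAscii(s);
        System.out.println(s2);                        // ??? hello A
        System.out.println(s.length() == s2.length()); // true
    }
}
```

With this sketch the accented letter is reduced to its base letter, while the Chinese characters, which have no ASCII equivalent, each become `?`.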
I have already tried that; I think it works on Unix/Linux,
Do you mean that the regex is in UTF-8? – anshulkatta
I got this from http://stackoverflow.com/questions/15356716/how-can-i-convert-unicode-string-to-ascii-in-java – anshulkatta