2012-06-18 32 views
1

How do I support 3-byte UTF-8 characters in Ant?

I am trying to support UTF-8 characters in my Ant script.

As long as the strings are made of 2-byte UTF-8 characters, such as:

  • Lògìñ
  • Ùsèr ÌÐ

everything works nicely.

The problem appears when I use the Unicode Han character 我 (U+6211), which, according to this site: http://www.fileformat.info/info/unicode/char/6211/index.htm has the UTF-8 encoding 0xE6 0x88 0x91.

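In UltraEdit I can see that the value in my input properties file is stored as the bytes "E6 88 91", so I am fairly confident that my input is correct. When I open the same file in Notepad++, all the characters display correctly.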
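The byte counts mentioned above can be checked with a minimal Java sketch. The class name `Utf8Bytes` and the helper `hex` are made up for this illustration; the characters are written as Unicode escapes so the result does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    // Returns the UTF-8 bytes of s as space-separated uppercase hex pairs.
    public static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("\u00E8")); // U+00E8 (e with grave): prints C3 A8
        System.out.println(hex("\u6211")); // U+6211 (Han character): prints E6 88 91
    }
}
```

The accented Latin letters need two bytes each, while the Han character needs three, which matches the "E6 88 91" seen in the hex view.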

Here is my build script:

<?xml version="1.0" encoding="UTF-8" ?> 

<project 
    name="utf8test" 
    default="all" 
    basedir="."> 

    <target name="all"> 
     <loadproperties encoding="UTF-8" srcfile="./apps.properties.all.txt" /> 

     <echo>No encoding ${common.app.name}</echo> 
     <echo encoding="UTF-8">UTF-8 ${common.app.name}</echo> 
     <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.name}</echo> 
     <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.name}</echo> 
     <echo>${common.app.ServerName}</echo> 
     <echo>${bb.vendor}</echo> 

     <echo>No encoding ${common.app.UserIdText}</echo> 
     <echo encoding="UTF-8">UTF-8 ${common.app.UserIdText}</echo> 
     <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.UserIdText}</echo> 
     <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.UserIdText}</echo> 

     <echoproperties /> 
     </target> 
    </project> 

And here is my properties file:

common.app=VrvPsLTst 
common.app.name=我們 
common.app.description=Pseudo Loc Test App for Build Script testing 
common.app.ServerName=http://Vèrìvò.com 
bb.vendor=Vèrìvò 
common.app.PasswordText=Pàsswòrð 
bb.override.list=MP_COPYRIGHTTEXT, "Çòpÿrìght 2012 Vèrívó Bùîlð TéàM" 
common.app.LoginButtonText=Lògìñ 
common.app.UserIdText=Ùsèr ÌÐ 
bb.SMSSuccess=Mèssàgéß Sùççêssfúllÿ Sëñt 
common.app.LoginScreenMessage=WèlçòMé Mêssàgë 
common.app.LoginProgressMessage=Àùthèñtìçàtíòñ îñ prógréss... 
ios.RegistrationText=Règìstràtíòñ Téxt 
ios.RegistrationURL=http://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/ 

Here is what the output looks like:

Buildfile: C:\Temp\utf8\build.xml 

all: 
    [echo] No encoding ?? 
    [echo] UTF-8 ?? 
    [echo] ÿþU n i c o d e L i t t l e ? ? 
    [echo] U n i c o d e L i t t l e U n m a r k e d ? ? 
    [echo] http://Vèrìvò.com 
    [echo] Vèrìvò 
    [echo] No encoding Ùsèr ÌÐ 
    [echo] UTF-8 Ùsèr Ì� 
    [echo] ÿþU n i c o d e L i t t l e Ù s è r Ì Ð 
    [echo] U n i c o d e L i t t l e U n m a r k e d Ù s è r Ì Ð 
[echoproperties] #Ant properties 
[echoproperties] #Mon Jun 18 15:25:13 EDT 2012 
[echoproperties] ant.core.lib=C\:\\ant\\lib\\ant.jar 
[echoproperties] ant.file=C\:\\Temp\\utf8\\build.xml 
[echoproperties] ant.file.type=file 
[echoproperties] ant.file.type.utf8test=file 
[echoproperties] ant.file.utf8test=C\:\\Temp\\utf8\\build.xml 
[echoproperties] ant.home=c\:\\ant\\bin\\.. 
[echoproperties] ant.java.version=1.6 
[echoproperties] ant.library.dir=C\:\\ant\\lib 
[echoproperties] ant.project.default-target=all 
[echoproperties] ant.project.invoked-targets=all 
[echoproperties] ant.project.name=utf8test 
[echoproperties] ant.version=Apache Ant version 1.8.1 compiled on April 30 2010 
[echoproperties] awt.toolkit=sun.awt.windows.WToolkit 
[echoproperties] basedir=C\:\\Temp\\utf8 
[echoproperties] bb.SMSSuccess=M\u00E8ss\u00E0g\u00E9\u00DF S\u00F9\u00E7\u00E7\u00EAssf\u00FAll\u00FF S\u00EB\u00F1t 
[echoproperties] bb.override.list=MP_COPYRIGHTTEXT, "\u00C7\u00F2p\u00FFr\u00ECght 2012 V\u00E8r\u00EDv\u00F3 B\u00F9\u00EEl\u00F0 T\u00E9\u00E0?" 
[echoproperties] bb.vendor=V\u00E8r\u00ECv\u00F2 
[echoproperties] common.app=VrvPsLTst 
[echoproperties] common.app.LoginButtonText=L\u00F2g\u00EC\u00F1 
[echoproperties] common.app.LoginProgressMessage=\u00C0\u00F9th\u00E8\u00F1t\u00EC\u00E7\u00E0t\u00ED\u00F2\u00F1 \u00EE\u00F1 pr\u00F3gr\u00E9ss... 
[echoproperties] common.app.LoginScreenMessage=W\u00E8l\u00E7\u00F2?\u00E9 M\u00EAss\u00E0g\u00EB 
[echoproperties] common.app.PasswordText=P\u00E0ssw\u00F2r\u00F0 
[echoproperties] common.app.ServerName=http\://V\u00E8r\u00ECv\u00F2.com 
[echoproperties] common.app.UserIdText=\u00D9s\u00E8r \u00CC\u00D0 
[echoproperties] common.app.description=Pseudo Loc Test App for Build Script testing 
[echoproperties] common.app.name=?? 
[echoproperties] file.encoding=Cp1252 
[echoproperties] file.encoding.pkg=sun.io 
[echoproperties] file.separator=\\ 
[echoproperties] ios.RegistrationText=R\u00E8g\u00ECstr\u00E0t\u00ED\u00F2\u00F1 T\u00E9xt 
[echoproperties] ios.RegistrationURL=http\://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/ 
[echoproperties] java.awt.graphicsenv=sun.awt.Win32GraphicsEnvironment 
[echoproperties] java.awt.printerjob=sun.awt.windows.WPrinterJob 
[echoproperties] java.class.path=c\:\\ant\\bin\\..\\lib\\ant-launcher.jar;C\:\\Temp\\utf8\\.\\;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip;C\:\\ant\\lib\\ant-antlr.jar;C\:\\ant\\lib\\ant-apache-bcel.jar;C\:\\ant\\lib\\ant-apache-bsf.jar;C\:\\ant\\lib\\ant-apache-log4j.jar;C\:\\ant\\lib\\ant-apache-oro.jar;C\:\\ant\\lib\\ant-apache-regexp.jar;C\:\\ant\\lib\\ant-apache-resolver.jar;C\:\\ant\\lib\\ant-apache-xalan2.jar;C\:\\ant\\lib\\ant-commons-logging.jar;C\:\\ant\\lib\\ant-commons-net.jar;C\:\\ant\\lib\\ant-contrib-1.0b3.jar;C\:\\ant\\lib\\ant-jai.jar;C\:\\ant\\lib\\ant-javamail.jar;C\:\\ant\\lib\\ant-jdepend.jar;C\:\\ant\\lib\\ant-jmf.jar;C\:\\ant\\lib\\ant-jsch.jar;C\:\\ant\\lib\\ant-junit.jar;C\:\\ant\\lib\\ant-launcher.jar;C\:\\ant\\lib\\ant-netrexx.jar;C\:\\ant\\lib\\ant-nodeps.jar;C\:\\ant\\lib\\ant-starteam.jar;C\:\\ant\\lib\\ant-stylebook.jar;C\:\\ant\\lib\\ant-swing.jar;C\:\\ant\\lib\\ant-testutil.jar;C\:\\ant\\lib\\ant-trax.jar;C\:\\ant\\lib\\ant-weblogic.jar;C\:\\ant\\lib\\ant.jar;C\:\\ant\\lib\\bb-ant-tools.jar;C\:\\ant\\lib\\xercesImpl.jar;C\:\\ant\\lib\\xml-apis.jar;C\:\\Program Files\\Java\\jre7\\lib\\tools.jar 
[echoproperties] java.class.version=51.0 
[echoproperties] java.endorsed.dirs=C\:\\Program Files\\Java\\jre7\\lib\\endorsed 
[echoproperties] java.ext.dirs=C\:\\Program Files\\Java\\jre7\\lib\\ext;C\:\\Windows\\Sun\\Java\\lib\\ext 
[echoproperties] java.home=C\:\\Program Files\\Java\\jre7 
[echoproperties] java.io.tmpdir=C\:\\Users\\efelton\\AppData\\Local\\Temp\\ 
[echoproperties] java.library.path=C\:\\Windows\\SYSTEM32;C\:\\Windows\\Sun\\Java\\bin;C\:\\Windows\\system32;C\:\\Windows;C\:\\Windows\\SYSTEM32;C\:\\Windows;C\:\\Windows\\SYSTEM32\\WBEM;C\:\\Windows\\SYSTEM32\\WINDOWSPOWERSHELL\\V1.0\\;C\:\\PROGRAM FILES\\INTEL\\WIFI\\BIN\\;C\:\\PROGRAM FILES\\COMMON FILES\\INTEL\\WIRELESSCOMMON\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\VSSHELL\\COMMON7\\IDE\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\syswow64;C\:\\Program Files (x86)\\QuickTime\\QTSystem\\;C\:\\Program Files (x86)\\AccuRev\\bin;C\:\\Program Files\\Java\\jdk1.7.0_04\\bin;C\:\\Program Files (x86)\\IDM Computer Solutions\\UltraEdit\\;. 
[echoproperties] java.runtime.name=Java(TM) SE Runtime Environment 
[echoproperties] java.runtime.version=1.7.0_04-b22 
[echoproperties] java.specification.name=Java Platform API Specification 
[echoproperties] java.specification.vendor=Oracle Corporation 
[echoproperties] java.specification.version=1.7 
[echoproperties] java.vendor=Oracle Corporation 
[echoproperties] java.vendor.url=http\://java.oracle.com/ 
[echoproperties] java.vendor.url.bug=http\://bugreport.sun.com/bugreport/ 
[echoproperties] java.version=1.7.0_04 
[echoproperties] java.vm.info=mixed mode 
[echoproperties] java.vm.name=Java HotSpot(TM) 64-Bit Server VM 
[echoproperties] java.vm.specification.name=Java Virtual Machine Specification 
[echoproperties] java.vm.specification.vendor=Oracle Corporation 
[echoproperties] java.vm.specification.version=1.7 
[echoproperties] java.vm.vendor=Oracle Corporation 
[echoproperties] java.vm.version=23.0-b21 
[echoproperties] line.separator=\r\n 
[echoproperties] os.arch=amd64 
[echoproperties] os.name=Windows 7 
[echoproperties] os.version=6.1 
[echoproperties] path.separator=; 
[echoproperties] sun.arch.data.model=64 
[echoproperties] sun.boot.class.path=C\:\\Program Files\\Java\\jre7\\lib\\resources.jar;C\:\\Program Files\\Java\\jre7\\lib\\rt.jar;C\:\\Program Files\\Java\\jre7\\lib\\sunrsasign.jar;C\:\\Program Files\\Java\\jre7\\lib\\jsse.jar;C\:\\Program Files\\Java\\jre7\\lib\\jce.jar;C\:\\Program Files\\Java\\jre7\\lib\\charsets.jar;C\:\\Program Files\\Java\\jre7\\lib\\jfr.jar;C\:\\Program Files\\Java\\jre7\\classes 
[echoproperties] sun.boot.library.path=C\:\\Program Files\\Java\\jre7\\bin 
[echoproperties] sun.cpu.endian=little 
[echoproperties] sun.cpu.isalist=amd64 
[echoproperties] sun.desktop=windows 
[echoproperties] sun.io.unicode.encoding=UnicodeLittle 
[echoproperties] sun.java.command=org.apache.tools.ant.launch.Launcher -cp .;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip 
[echoproperties] sun.java.launcher=SUN_STANDARD 
[echoproperties] sun.jnu.encoding=Cp1252 
[echoproperties] sun.management.compiler=HotSpot 64-Bit Tiered Compilers 
[echoproperties] sun.os.patch.level=Service Pack 1 
[echoproperties] user.country=US 
[echoproperties] user.dir=C\:\\Temp\\utf8 
[echoproperties] user.home=C\:\\Users\\efelton 
[echoproperties] user.language=en 
[echoproperties] user.name=efelton 
[echoproperties] user.script= 
[echoproperties] user.timezone= 
[echoproperties] user.variant= 

BUILD SUCCESSFUL 
Total time: 1 second 

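Thanks for your help.

Edit / Update, June 19, 2012

I am developing in a Windows environment.

I have installed the TTF fonts from: http://freedesktop.org/wiki/Software/CJKUnifonts/Download

I have updated UltraEdit to use that font, and I can now see the Chinese characters.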

<?xml version="1.0" encoding="UTF-8" ?> 

<project name="utf8test" default="all" basedir="."> 

    <target name="all">   

     <echo>我們</echo> 
     <echo encoding="ISO-8859-1">ISO-8859-1 我們</echo> 
     <echo encoding="UTF-8">UTF-8 我們</echo> 


     <echo file="echo_output.txt" append="true" >我們 ${line.separator}</echo> 
     <echo file="echo_output.txt" append="true" encoding="ISO-8859-1">ISO-8859-1 我們 ${line.separator}</echo> 
     <echo file="echo_output.txt" append="true" encoding="UTF-8">UTF-8 我們 ${line.separator}</echo> 
     <echo file="echo_output.txt" append="true" encoding="UnicodeLittle">UnicodeLittle 我們 ${line.separator}</echo> 
     <echo file="echo_output.txt" append="true" encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked 我們 ${line.separator}</echo> 

    </target> 
</project> 

The output captured when running the build from inside UltraEdit is:

Buildfile: E:\TEMP\UTF8\build.xml

all: 
     [echo] ?? 
     [echo] ISO-8859-1 ?? 
     [echo] UTF-8 ?? 

    BUILD SUCCESSFUL 
    Total time: 1 second 

And the echo_output.txt file looks like this:

?? 
    ISO-8859-1 ?? 
    UTF-8 ?? 
    ÿþU n i c o d e L i t t l e ? ? 

    U n i c o d e L i t t l e U n m a r k e d ? ? 

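So there seems to be something fundamentally wrong with how my Ant environment is set up, because I cannot simply echo the characters to the screen or to a file.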
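One general fact that may be relevant here: whenever a Java string is converted to a single-byte charset such as Cp1252 (the `file.encoding` reported in the output above) or ISO-8859-1, every character the charset cannot represent is replaced with `?`. The sketch below only demonstrates that replacement. Which step of the build performs such a conversion is not established by it, and the class name `SingleByteEcho` is hypothetical.

```java
import java.nio.charset.Charset;

public class SingleByteEcho {
    // Encodes s with the named charset and decodes it again, which is what
    // effectively happens when text passes through a writer using that charset.
    public static String roundTrip(String s, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String han = "\u6211\u5011";                   // the two Han characters
        String latin = "\u00D9s\u00E8r \u00CC\u00D0";  // the accented user-id text

        // The Han characters have no Cp1252 mapping, so each becomes '?'.
        System.out.println(roundTrip(han, "windows-1252").equals("??"));    // prints true
        // The accented Latin letters all exist in Cp1252 and survive.
        System.out.println(roundTrip(latin, "windows-1252").equals(latin)); // prints true
    }
}
```

This is consistent with the accented strings surviving while the Han string turns into `??`, although it does not by itself explain the `encoding="UTF-8"` file output.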

Answers

0

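I solved my problem on both Windows and macOS by first passing all input (the properties files) through this encoder. Ant can then correctly read, and later write, the values escaped in this way.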

// Fragment, presumably from a custom Ant task: `input` and `dest` are supplied
// by the enclosing code, which also needs to import java.io.* and
// org.apache.tools.ant.BuildException.
StringBuilder buffer = new StringBuilder();
try
{
    Reader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(input), "UTF-8"));
    try
    {
        int ch;
        while ((ch = in.read()) > -1)
        {
            if (ch > 127)
            {
                // Non-ASCII: emit a zero-padded, four-digit hex escape.
                buffer.append(String.format("\\u%04x", ch));
            }
            else if (ch != 0)
            {
                // ASCII passes through unchanged; NUL characters are dropped.
                buffer.append((char) ch);
            }
        }
    }
    finally
    {
        in.close();
    }
}
catch (IOException e)
{
    throw new BuildException(e.getMessage());
}

try
{
    // The buffer is pure ASCII at this point, so the output charset loses nothing.
    Writer out = new OutputStreamWriter(new FileOutputStream(dest), "windows-1252");
    try
    {
        out.write(buffer.toString());
    }
    finally
    {
        out.close();
    }
}
catch (IOException e)
{
    throw new BuildException(e.getMessage());
}
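To illustrate why escaping helps, here is a small self-contained sketch (the class `EscapedProps` and its methods are hypothetical, not part of the answer's task). It escapes a property line the same way and then loads it with `java.util.Properties`, which decodes such escapes back into the original characters.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class EscapedProps {
    // Replaces every char above 127 with a four-digit hex escape,
    // mirroring the idea of the encoder above.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c > 127) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Loads a single "key=value" line and returns the decoded value.
    public static String loadValue(String line, String key) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(line));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p.getProperty(key);
    }

    public static void main(String[] args) {
        String original = "\u6211\u5011"; // the two Han characters
        String line = escape("common.app.name=" + original);
        // The escaped line is pure ASCII, so no charset can corrupt it.
        System.out.println(line.matches("\\p{ASCII}*"));                           // prints true
        System.out.println(loadValue(line, "common.app.name").equals(original));  // prints true
    }
}
```

Because the escaped file contains only ASCII, it reads identically under UTF-8, ISO-8859-1, and Cp1252, which is presumably why this approach works on both platforms.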
0

The java.util.Properties class uses the ISO 8859-1 encoding. The following works when tested with Ant 1.8.2.

build.xml

<?xml version="1.0" encoding="UTF-8" ?> 
<project name="utf8test" default="all" basedir="."> 

<target name="all"> 
    <loadproperties encoding="ISO-8859-1" srcfile="./apps.properties.all.txt" /> 

    <echo>No encoding ${common.app.name}</echo> 
    <echo encoding="ISO-8859-1">ISO-8859-1 ${common.app.name}</echo> 
</target> 
</project> 

Output

all: 
    [echo] No encoding æå 
    [echo] ISO-8859-1 我們 

BUILD SUCCESSFUL 
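A likely reading of why this works is that ISO 8859-1 maps every byte to exactly one character and back, so the UTF-8 bytes pass through the build untouched. The sketch below demonstrates that round trip in plain Java; the class `Latin1RoundTrip` and its method names are made up for illustration, and it does not call Ant itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Decodes the UTF-8 bytes of s as ISO-8859-1: one char per byte.
    public static String decodeAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Encodes a string as ISO-8859-1: one byte per char.
    public static byte[] encodeAsLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u6211\u5011"; // the two Han characters
        String garbled = decodeAsLatin1(original);

        // Two 3-byte characters become six Latin-1 characters.
        System.out.println(garbled.length()); // prints 6

        // Writing those six characters as ISO-8859-1 restores the UTF-8 bytes.
        byte[] written = encodeAsLatin1(garbled);
        System.out.println(Arrays.equals(written, original.getBytes(StandardCharsets.UTF_8))); // prints true
    }
}
```

The "No encoding" line shows the intermediate garbled form, while the ISO-8859-1 line emits the original bytes, which a UTF-8 terminal then displays as the Han characters.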
+1

I downloaded Ant 1.8.4, ran your script, and got '[echo] ISO-8859-1??? ??' as output. Could my Java version be affecting my results? – efelton

+0

The problem you are seeing is related to the Windows command line. See [What encoding/code page is cmd.exe using](http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using) and [How to enable more fonts for Windows Command Prompt](http://www.mydigitallife.info/how-to-enable-more-fonts-for-windows-command-prompt/)

+1

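@efelton On Linux, the solution above works "out of the box". When testing on Windows 7, I found that even after setting the code page to UTF-8 with 'chcp 65001', the Chinese characters were still not displayed correctly with the 'Lucida Console' font. The simplest solution I found on Windows is to run the Ant build in IntelliJ IDEA and echo to a file.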
