2016-09-26 67 views

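Setting the character encoding to read multiple character sets in the Cygwin shell

I have a file containing text in various languages (both ASCII and native characters). I want my shell to handle any language: English, Arabic, Chinese, Japanese, and so on.

I have read the Cygwin page on 'Internationalization' and its list of supported character sets (below). I have also read the FAQ entry on weird characters: https://cygwin.com/faq-nochunks.html#faq.using.weirdchars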

Charset              Codepage
-------------------  -------------------------------------------
ASCII                20127 (US_ASCII)

CP437                437 (OEM United States)
CP720                720 (DOS Arabic)
CP737                737 (OEM Greek)
CP775                775 (OEM Baltic)
CP850                850 (OEM Latin 1, Western European)
CP852                852 (OEM Latin 2, Central European)
CP855                855 (OEM Cyrillic)
CP857                857 (OEM Turkish)
CP858                858 (OEM Latin 1 + Euro Symbol)
CP862                862 (OEM Hebrew)
CP866                866 (OEM Russian)
CP874                874 (ANSI/OEM Thai)
CP932                932 (Shift_JIS, not exactly identical to SJIS)
CP1125               1125 (OEM Ukraine)
CP1250               1250 (ANSI Central European)
CP1251               1251 (ANSI Cyrillic)
CP1252               1252 (ANSI Latin 1, Western European)
CP1253               1253 (ANSI Greek)
CP1254               1254 (ANSI Turkish)
CP1255               1255 (ANSI Hebrew)
CP1256               1256 (ANSI Arabic)
CP1257               1257 (ANSI Baltic)
CP1258               1258 (ANSI/OEM Vietnamese)

ISO-8859-1           28591 (ISO-8859-1)
ISO-8859-2           28592 (ISO-8859-2)
ISO-8859-3           28593 (ISO-8859-3)
ISO-8859-4           28594 (ISO-8859-4)
ISO-8859-5           28595 (ISO-8859-5)
ISO-8859-6           28596 (ISO-8859-6)
ISO-8859-7           28597 (ISO-8859-7)
ISO-8859-8           28598 (ISO-8859-8)
ISO-8859-9           28599 (ISO-8859-9)
ISO-8859-10          - (not available)
ISO-8859-11          - (not available)
ISO-8859-13          28603 (ISO-8859-13)
ISO-8859-14          - (not available)
ISO-8859-15          28605 (ISO-8859-15)
ISO-8859-16          - (not available)

Big5                 950 (ANSI/OEM Traditional Chinese)
EUCCN or euc-CN      936 (ANSI/OEM Simplified Chinese)
EUCJP or euc-JP      20932 (EUC Japanese)
EUCKR or euc-KR      949 (EUC Korean)
GB2312               936 (ANSI/OEM Simplified Chinese)
GBK                  936 (ANSI/OEM Simplified Chinese)
GEORGIAN-PS          - (not available)
KOI8-R               20866 (KOI8-R Russian Cyrillic)
KOI8-U               21866 (KOI8-U Ukrainian Cyrillic)
PT154                - (not available)
SJIS                 - (not available, almost, but not exactly CP932)
TIS620 or TIS-620    874 (ANSI/OEM Thai)

UTF-8 or utf8        65001 (UTF-8)

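My main question: is it possible to have the Cygwin shell read multiple languages at the same time? I haven't been able to find much on this. Any pointers are much appreciated.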

By default Cygwin uses UTF-8 as its encoding. You can use iconv to convert from any code page to another. See 'man iconv' for details. – matzeri
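
To illustrate the iconv suggestion, here is a minimal sketch that converts a file from one code page to UTF-8. The file names `sample_cp1251.txt` and `sample_utf8.txt` are made up for the example, and it assumes your iconv build knows the name CP1251 (check with `iconv -l`).

```shell
# Write the Russian word "Привет" as raw CP1251 bytes (octal escapes are portable in printf).
printf '\317\360\350\342\345\362\n' > sample_cp1251.txt

# Convert from CP1251 (-f, "from") to UTF-8 (-t, "to") and save the result.
iconv -f CP1251 -t UTF-8 sample_cp1251.txt > sample_utf8.txt

# In a UTF-8 terminal this should now display as readable Cyrillic.
cat sample_utf8.txt
```

The same pattern should apply to any other source encoding from the table above, as long as iconv recognizes its name.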

Answer

What exactly do you mean?

With a recent Cygwin on modern Windows (Windows 10), I can get Cygwin to display all sorts of characters. For example:

$ env LANG=ru_RU.UTF-8 cp --help 
$ env LANG=zh_CN.UTF-8 cp --help 
$ env LANG=ja_JP.UTF-8 cp --help 

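will display text in Russian, Chinese, Japanese, and so on.

If that doesn't work, you can also do it from Windows PowerShell with an extra iconv step, although that is for post-processing the output: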
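
As a rough illustration of the "multiple languages at the same time" part: UTF-8 can encode all of these scripts, so a single UTF-8 file can mix them. The sketch below assumes a terminal that is set to UTF-8 and has fonts for these scripts; `multilingual.txt` is just a made-up file name.

```shell
# Write one line per language into a single UTF-8 file.
printf '%s\n' \
  'English: hello' \
  'Русский: привет' \
  '中文: 你好' \
  '日本語: こんにちは' \
  'العربية: مرحبا' > multilingual.txt

# cat passes the bytes through unchanged; rendering is up to the terminal.
cat multilingual.txt

# grep can search for non-ASCII text in the same file.
grep '你好' multilingual.txt
```

If the lines show up garbled, the terminal's character set or font is the more likely culprit, not the file.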

PS C:\cygwin\bin> .\env.exe LANG=zh_CN.UTF-8 .\cp.exe --help | .\iconv.exe -f UTF-8 -t UTF-16
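
To see what that iconv step does to the bytes, here is a small sketch run from a Cygwin or other POSIX shell. It uses UTF-16LE rather than plain UTF-16 so that no byte order mark is emitted; that choice is mine, not part of the command above.

```shell
# Convert the two characters 你好 from UTF-8 to UTF-16LE and dump the bytes as hex.
hex=$(printf '你好' | iconv -f UTF-8 -t UTF-16LE | od -An -tx1 | tr -d ' \n')

# 你 is U+4F60 and 好 is U+597D, so the little-endian bytes are 60 4f 7d 59.
echo "$hex"
```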