Python: is there a way to get the language name from a codec name?

I would like a function that does the following:

def get_lang(enc): 
    ... 

>>> get_lang('ascii') 
'English' 
>>> get_lang('big5') 
'Traditional Chinese' 
>>> get_lang('utf-8') 
'All languages'

Is this possible?

Update: I mean something like the table in this manual: http://docs.python.org/library/codecs.html#standard-encodings

Source

2012-07-04 jayven

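Maybe you will find [this article](http://www.joelonsoftware.com/articles/Unicode.html) helpful. – sloth

@BigYellowCactus Yes, thanks for your reply. But I think I already understand that. I know, for example, that English and German (a subset of it) can both use 'ascii', so there is no exact correspondence between codecs and languages. All I need is the '...languages for which the encoding is likely used...', as the manual puts it. – jayven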

No, there is no widely available module that contains this information.

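I suggest parsing the documentation (from http://hg.python.org/cpython/file/tip/Doc/library/codecs.rst#l926 or http://docs.python.org/_sources/library/codecs.txt) to build a lookup dictionary, as @fraxel seems to have done. Since codecs are rarely added to Python (and are increasingly being replaced by Unicode), this approach should be reasonably future-proof.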
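As a rough illustration of that parsing step, here is a minimal sketch. It assumes the standard-encodings table is a reStructuredText grid table with three columns (codec, aliases, languages) and that a long cell wraps onto a continuation row whose first column is empty. The function name `parse_codec_table` and the `sample` text are made up for this example; the real file contains other tables and may be laid out differently, so check the result by hand.

```python
def parse_codec_table(rst_text):
    """Build {codec: languages} from a 3-column RST grid table (a sketch)."""
    langs = {}
    last = None  # codec of the most recent full row, for continuation rows
    for line in rst_text.splitlines():
        line = line.strip()
        # Only rows look like "| a | b | c |"; borders start with "+".
        if not (line.startswith('|') and line.endswith('|')):
            continue
        cells = [cell.strip() for cell in line[1:-1].split('|')]
        if len(cells) != 3:
            continue  # not the table shape this sketch assumes
        codec, _aliases, languages = cells
        if codec == 'Codec':
            continue  # header row
        if codec:
            langs[codec] = languages
            last = codec
        elif last is not None and languages:
            # Continuation row: append the wrapped text to the previous codec.
            langs[last] = (langs[last] + ' ' + languages).strip()
    return langs


# Illustrative input shaped like the assumed table, not copied from the file.
sample = """
+--------+---------------+------------------------------+
| Codec  | Aliases       | Languages                    |
+========+===============+==============================+
| ascii  | 646, us-ascii | English                      |
+--------+---------------+------------------------------+
| cp855  | 855, IBM855   | Bulgarian, Byelorussian,     |
|        |               | Macedonian, Russian, Serbian |
+--------+---------------+------------------------------+
"""

table = parse_codec_table(sample)
```

To use it on the real documentation you would read the downloaded `codecs.txt` into a string and pass that in instead of `sample`.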

Source

2012-07-04 09:11:41 ecatmur

Yes, you can use this dictionary:

codec_langs = {'ascii':'English', 
'big5': 'Traditional Chinese', 
'big5hkscs':'Traditional Chinese', 
'cp037':'English', 
'cp424':'Hebrew', 
'cp437':'English', 
'cp500':'Western Europe', 
'cp720':'Arabic', 
'cp737':'Greek', 
'cp775':'Baltic languages', 
'cp850':'Western Europe', 
'cp852':'Central and Eastern Europe', 
'cp855':'Bulgarian, Byelorussian, Macedonian, Russian, Serbian', 
'cp856':'Hebrew', 
'cp857':'Turkish', 
'cp858':'Western Europe', 
'cp860':'Portuguese', 
'cp861':'Icelandic', 
'cp862':'Hebrew', 
'cp863':'Canadian', 
'cp864':'Arabic', 
'cp865':'Danish, Norwegian', 
'cp866':'Russian', 
'cp869':'Greek', 
'cp874':'Thai', 
'cp875':'Greek', 
'cp932':'Japanese', 
'cp949':'Korean', 
'cp950':'Traditional Chinese', 
'cp1006':'Urdu', 
'cp1026':'Turkish', 
'cp1140':'Western Europe', 
'cp1250':'Central and Eastern Europe', 
'cp1251':'Bulgarian, Byelorussian, Macedonian, Russian, Serbian', 
'cp1252':'Western Europe', 
'cp1253':'Greek', 
'cp1254':'Turkish', 
'cp1255':'Hebrew', 
'cp1256':'Arabic', 
'cp1257':'Baltic languages', 
'cp1258':'Vietnamese', 
'euc_jp':'Japanese', 
'euc_jis_2004':'Japanese', 
'euc_jisx0213':'Japanese', 
'euc_kr':'Korean', 
'gb2312':'Simplified Chinese', 
'gbk':'Unified Chinese', 
'gb18030':'Unified Chinese', 
'hz':'Simplified Chinese', 
'iso2022_jp':'Japanese', 
'iso2022_jp_1':'Japanese', 
'iso2022_jp_2':'Japanese, Korean, Simplified Chinese, Western Europe, Greek', 
'iso2022_jp_2004':'Japanese', 
'iso2022_jp_3':'Japanese', 
'iso2022_jp_ext':'Japanese', 
'iso2022_kr':'Korean', 
'latin_1':'West Europe', 
'iso8859_2':'Central and Eastern Europe', 
'iso8859_3':'Esperanto, Maltese', 
'iso8859_4':'Baltic languages', 
'iso8859_5':'Bulgarian, Byelorussian, Macedonian, Russian, Serbian', 
'iso8859_6':'Arabic', 
'iso8859_7':'Greek', 
'iso8859_8':'Hebrew', 
'iso8859_9':'Turkish', 
'iso8859_10':'Nordic languages', 
'iso8859_13':'Baltic languages', 
'iso8859_14':'Celtic languages', 
'iso8859_15':'Western Europe', 
'iso8859_16':'South-Eastern Europe', 
'johab':'Korean', 
'koi8_r':'Russian', 
'koi8_u':'Ukrainian', 
'mac_cyrillic':'Bulgarian, Byelorussian, Macedonian, Russian, Serbian', 
'mac_greek':'Greek', 
'mac_iceland':'Icelandic', 
'mac_latin2':'Central and Eastern Europe', 
'mac_roman':'Western Europe', 
'mac_turkish':'Turkish', 
'ptcp154':'Kazakh', 
'shift_jis':'Japanese', 
'shift_jis_2004':'Japanese', 
'shift_jisx0213':'Japanese', 
'utf_32':'all languages', 
'utf_32_be':'all languages', 
'utf_32_le':'all languages', 
'utf_16':'all languages', 
'utf_16_be':'all languages (BMP only)', 
'utf_16_le':'all languages (BMP only)', 
'utf_7':'all languages', 
'utf_8':'all languages', 
'utf_8_sig':'all languages'}
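
The dictionary keys use Python's underscore spellings, so a lookup with 'utf-8' or an alias such as 'utf8' would miss. One possible way to get the `get_lang` function from the question is to normalize both the keys and the query with the standard library's `codecs.lookup`, which returns a `CodecInfo` whose `name` attribute is the canonical codec name. This is only a sketch: the helper names `_canonical` and `_langs_by_canonical` and the `default` parameter are assumptions made for illustration, and the short `codec_langs` below is an excerpt repeated so that the block runs on its own.

```python
import codecs

# Excerpt of the dictionary above, repeated only so this sketch is runnable.
codec_langs = {
    'ascii': 'English',
    'latin_1': 'West Europe',
    'utf_8': 'all languages',
}


def _canonical(name):
    """Return Python's canonical codec name, or None if the codec is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# Re-key the table by canonical name so that aliases hit the same entry.
_langs_by_canonical = {}
for key, langs in codec_langs.items():
    canonical = _canonical(key)
    if canonical is not None:
        _langs_by_canonical[canonical] = langs


def get_lang(enc, default=None):
    """Return the languages listed for a codec name or alias (hypothetical helper)."""
    return _langs_by_canonical.get(_canonical(enc), default)
```

With the full dictionary in place of the excerpt, `get_lang('utf-8')`, `get_lang('UTF8')` and `get_lang('utf_8')` should all resolve to the same entry, and an unknown name returns `default` instead of raising.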

Source

2012-07-04 09:02:45 fraxel
