2014-08-28 19 views
3

Python: MySQLdb library encoding problem

I have a MySQL database, and I set its charset to utf8:

... 
    PRIMARY KEY (`username`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8 | 
... 

I connect to the database from Python using MySQLdb:

conn = MySQLdb.connect(host = "localhost", 
           passwd = "12345", 
           db = "db", 
           charset = 'utf8', 
           use_unicode=True) 

When I execute a query, the response comes back decoded as "windows-1254". Example:

curr = conn.cursor(MySQLdb.cursors.DictCursor) 
select_query = 'SELECT * FROM users' 
curr.execute(select_query) 

for ret in curr.fetchall(): 
    username = ret["username"] 
    print "repr-username; ", repr(username) 
    print "username; ", username.encode("utf-8") 
... 

The output is:

repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli' 
username; şükrüçağlüli 

When I encode the username as "windows-1254" before printing, it works fine:

... 
print "repr-username; ", repr(username) 
print "username; ", username.encode("windows-1254") 
... 

The output is:

repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli' 
username; şükrüçağlüli 

When I try it with other characters, such as Cyrillic letters, the decoding changes. How can I prevent this?

To be clear, "şükrüçağlüli" is the output you want? – Optox 2014-08-28 12:55:14

Yes. The text contains some Turkish special characters, such as "şüçğ". – umut 2014-08-28 12:57:18

Is that also the table's character set? – Korem 2014-08-28 13:08:27

Answers

3

I think the items were encoded incorrectly when they were INSERTed into the database.
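One way such a value can arise, as a minimal sketch: the correct text is encoded to UTF-8 bytes, and those bytes are then decoded as windows-1254 somewhere on the way into the table. The thread does not confirm that this is what happened in the asker's setup; the variable names below are only illustrative, and the snippet is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative only: reproduce the garbled value from the question.

# The intended user name, "şükrüçağlüli", written with escapes.
original = u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli'

# Step 1: the text is correctly encoded as UTF-8 bytes.
utf8_bytes = original.encode('utf-8')

# Step 2 (the assumed mistake): the bytes are read as windows-1254.
garbled = utf8_bytes.decode('windows-1254')

# This matches the repr shown in the question.
print(garbled == u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli')
```

If the stored rows really look like this, the connection settings in the question may be fine and the rows themselves would need repairing.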

I recommend python-ftfy (from https://github.com/LuminosoInsight/python-ftfy); it helped me with a similar problem:

import ftfy 

username = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli' 
print ftfy.fix_text(username) # outputs şükrüçağlüli
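
If installing ftfy is not an option, the same repair can be sketched by hand for this specific case: encode the garbled text back to bytes with the wrongly applied codec, then decode those bytes as UTF-8. The helper name `repair_mojibake` and its `wrong_encoding` parameter are hypothetical, and the approach assumes every affected row was garbled the same way (the question suggests this may not hold for Cyrillic data). It runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-

def repair_mojibake(text, wrong_encoding='windows-1254'):
    """Undo a UTF-8-read-as-`wrong_encoding` mix-up (hypothetical helper).

    Returns the input unchanged when the round trip is not possible,
    which usually means the text was not garbled in this particular way.
    """
    try:
        return text.encode(wrong_encoding).decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


username = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
fixed = repair_mojibake(username)

# True: the result is the intended name, "şükrüçağlüli".
print(fixed == u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli')
```

This is only a heuristic: a string that legitimately contains a sequence such as "Ã¼" would also be rewritten, so it is worth checking the results before writing them back to the database.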