Alexander pointed out my main mistake (you need to encode and then force_encoding to find the correct encoding). The string was indeed encoded as CP1252!
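A rough sketch of that search, in case it helps: try every encoding Ruby knows, convert the garbled text into it, relabel the bytes as UTF-8, and keep whatever comes out valid. The helper name repair_candidates and the sample string are made up for illustration, and the result still needs a human to eyeball it, since several encodings can produce valid (but wrong) UTF-8.

```ruby
# Hypothetical helper: returns { encoding name => repaired text } for every
# encoding where encode + force_encoding yields valid UTF-8 that differs
# from the input.
def repair_candidates(garbled)
  Encoding.list.each_with_object({}) do |enc, found|
    begin
      repaired = garbled.encode(enc).force_encoding('UTF-8')
      found[enc.name] = repaired if repaired.valid_encoding? && repaired != garbled
    rescue EncodingError
      # No converter for this encoding, or a character it cannot represent.
    end
  end
end

# Assumed sample: "café" whose UTF-8 bytes were misread as CP1252.
garbled = "caf\u00C3\u00A9"
candidates = repair_candidates(garbled)
candidates.each { |name, fixed| puts "#{name}: #{fixed}" }
```

Expect more than one candidate to show up; the right one is the entry whose text actually reads correctly.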
The best approach is to read the data from MySQL as binary and then force the encoding:
client = Mysql2::Client.new(opts.merge encoding: 'binary')
# ...
text.force_encoding('UTF-8')
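To see why this works without a database at hand, here is a minimal sketch. The raw string is an assumed stand-in for what a binary connection would return; force_encoding only changes the label on the bytes and never converts them.

```ruby
# Simulated bytes from a binary connection; String#b tags them as binary.
raw = "caf\xC3\xA9".b

# Relabel the very same bytes as UTF-8 (no transcoding happens here).
text = raw.dup.force_encoding('UTF-8')

puts raw.encoding
puts text.encoding
puts text.valid_encoding?
```

Because nothing is ever transcoded, there is no step at which an undefined byte could raise an error.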
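Alternatively, if you can't change how you fetch the data, you'll get stuck on an Encoding::UndefinedConversionError when you try encode. As this blog post explains in detail, the solution is to specify a fallback for the five bytes that are undefined in CP1252: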
fallback = {
"\u0081" => "\x81".force_encoding("CP1252"),
"\u008D" => "\x8D".force_encoding("CP1252"),
"\u008F" => "\x8F".force_encoding("CP1252"),
"\u0090" => "\x90".force_encoding("CP1252"),
"\u009D" => "\x9D".force_encoding("CP1252")
}
text.encode('CP1252', fallback: fallback).force_encoding('UTF-8')
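Here is a small worked example of the fallback in action. The sample is an assumption chosen for illustration: a right double quotation mark (U+201D, UTF-8 bytes E2 80 9D) whose last byte, 0x9D, is one of the five undefined ones. The names FALLBACK and repair_mojibake are hypothetical.

```ruby
# Map each stray C1 control character back to its raw byte, tagged CP1252.
FALLBACK = {
  "\u0081" => "\x81".force_encoding("CP1252"),
  "\u008D" => "\x8D".force_encoding("CP1252"),
  "\u008F" => "\x8F".force_encoding("CP1252"),
  "\u0090" => "\x90".force_encoding("CP1252"),
  "\u009D" => "\x9D".force_encoding("CP1252")
}

# Hypothetical wrapper around the one-liner above.
def repair_mojibake(text)
  text.encode('CP1252', fallback: FALLBACK).force_encoding('UTF-8')
end

# U+201D misread byte by byte as CP1252: E2, 80, then the undefined 9D.
garbled = "\u00E2\u20AC\u009D"

# A plain encode fails on the U+009D character.
plain_error = begin
  garbled.encode('CP1252')
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

repaired = repair_mojibake(garbled)
puts plain_error.class
puts repaired
```

The fallback values are already tagged as CP1252, so Ruby inserts their bytes as they are rather than trying to convert them again.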
2014-10-27 22:02:27
Max
Use 'Encoding.list' or 'Encoding.name_list' instead of 'Encoding.constants'. – Stefan 2014-10-27 21:58:38