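Ruby and MySQL UTF-8 characters
I am switching a Sinatra application from SQLite3 to MySQL. For some reason I don't understand, when I pull data out of MySQL using Ruby and Sequel, the strings come back tagged as 8-bit ASCII (ASCII-8BIT) instead of UTF-8.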
The deployment environment is FreeBSD 9.1 with MySQL 5.6.12 and a system-wide ruby19 installed from FreeBSD ports. RVM ruby-2.0p247 produces the same result.
My my.cnf is as follows:
# The following options will be passed to all MySQL clients
[client]
default-character-set=utf8
#password = your_password
port = 3306
socket = /tmp/mysql.sock
# Here follows entries for some specific programs
# The MySQL server
[mysqld]
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 128M
max_allowed_packet = 1M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 2M
myisam_sort_buffer_size = 32M
thread_cache_size = 4
query_cache_size= 8M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 2
# encoding issues
character-set-server=utf8
collation-server=utf8_general_ci
log-bin=mysql-bin
binlog_format=mixed
server-id = 1
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
safe-updates
[myisamchk]
key_buffer_size = 64M
sort_buffer_size = 64M
read_buffer = 1M
write_buffer = 1M
[mysqlhotcopy]
interactive-timeout
All of my files use a shebang together with a UTF-8 encoding comment, like this script, which I use to test entries:
#!/usr/bin/env ruby
# encoding: UTF-8
require 'sequel'
msql = Sequel.connect(adapter: 'mysql', host: 'localhost', database: 'metrosignage', user: 'atma', password: 'toola697', encoding: 'utf8')
b = msql[:drama_addressbook]
b.each do |entry|
p entry
# p entry[:city].force_encoding("utf-8")
end
If I use entry[:city].force_encoding("utf-8"), the output is correct and the Greek UTF-8 characters display fine. But I don't understand why I can't get UTF-8 strings back directly.
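To make the symptom concrete without a database, here is a minimal sketch of what force_encoding seems to be doing in my case. It assumes the driver returns the correct UTF-8 bytes but labels them ASCII-8BIT. The helper name retag_as_utf8 and the sample value are hypothetical, not part of my application.

```ruby
# encoding: UTF-8

# Hypothetical helper: re-tag a binary string whose bytes are already UTF-8.
# force_encoding changes only the encoding label, never the bytes themselves.
def retag_as_utf8(str)
  str.dup.force_encoding(Encoding::UTF_8)
end

# Simulate what the driver appears to hand back: UTF-8 bytes labelled ASCII-8BIT.
# The city name is an example value, not data from the real table.
raw = "Δράμα".dup.force_encoding(Encoding::ASCII_8BIT)
fixed = retag_as_utf8(raw)

p raw                                  # escaped bytes, because the label is binary
p fixed                                # readable Greek text
p raw.bytes.to_a == fixed.bytes.to_a   # true: the underlying bytes are identical
p fixed.valid_encoding?                # true: the bytes form valid UTF-8
```

If this assumption holds, the data coming over the wire is fine and only the encoding label attached on the Ruby side is wrong, which would explain why force_encoding is enough and no transcoding is needed.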
The table I am reading from was created with the following SQL:
CREATE TABLE `drama_addressbook` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`address` varchar(255) DEFAULT NULL,
`address_no` int(11) DEFAULT NULL,
`address_description` varchar(255) DEFAULT NULL,
`phone` varchar(255) DEFAULT NULL,
`city` varchar(255) DEFAULT NULL,
`country` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=utf8;
So the database is UTF-8 and the data is UTF-8. My questions are:
- What am I doing wrong?
- Why does Ruby need force_encoding?
Did you create the database like this? 'CREATE DATABASE metrosignage DEFAULT CHARSET utf8;'
Possible duplicate: [Sequel never returns utf-8, just ascii-8bit](http://stackoverflow.com/questions/14070281/sequel-never-returns-utf-8-just-ascii-8bit)