2010-03-05 48 views
+--------------------------+--------------------------------------------------------+
| Variable_name            | Value                                                  |
+--------------------------+--------------------------------------------------------+
| character_set_client     | utf8                                                   |
| character_set_connection | utf8                                                   |
| character_set_database   | utf8                                                   |
| character_set_filesystem | binary                                                 |
| character_set_results    | utf8                                                   |
| character_set_server     | utf8                                                   |
| character_set_system     | utf8                                                   |
| character_sets_dir       | /usr/local/mysql-5.1.41-osx10.5-x86_64/share/charsets/ |
+--------------------------+--------------------------------------------------------+
8 rows in set (0.00 sec) 

mysql> select version(); 
+-----------+ 
| version() | 
+-----------+ 
| 5.1.41    |
+-----------+ 
1 row in set (0.00 sec) 

mysql> select char(0x00FC); 
+--------------+ 
| char(0x00FC) | 
+--------------+ 
| ?            |
+--------------+ 
1 row in set (0.00 sec)

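I expected the actual UTF-8 character "ü" rather than "?". I also tried char(0x00FC using utf8), but no luck. How do I get UTF-8 output from the MySQL CHAR() function?

I am using MySQL version 5.1.41.

I have searched all over Google and cannot find anything on this. The MySQL documentation simply says that, after MySQL version 5.0.14, multibyte output is expected for values greater than 255.

Thanks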

What character set is your shell using? – thetaiko 2010-03-05 03:26:21

Answer

You are confusing UTF-8 with Unicode.

0x00FC is the Unicode code point for ü:

mysql> select char(0x00FC using ucs2); 
+-------------------------+
| char(0x00FC using ucs2) |
+-------------------------+
| ü                       |
+-------------------------+
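
The same relationship can be checked outside MySQL. The following is a minimal sketch in Python (not part of the original session), assuming Python's "utf-16-be" codec as a stand-in for ucs2; the two produce the same bytes for characters in the Basic Multilingual Plane such as ü.

```python
# U+00FC is the Unicode code point for "ü".
code_point = 0x00FC
char = chr(code_point)

# For a BMP character, UTF-16 big-endian yields the same two bytes
# as UCS-2: simply the code point itself, 00 FC.
ucs2_bytes = char.encode("utf-16-be")

print(char)              # the character for code point U+00FC
print(ucs2_bytes.hex())  # hex form of the two stored bytes
```

This is why the literal 0x00FC gives ü when it is interpreted as ucs2: in that encoding the stored bytes and the code point coincide.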

In the UTF-8 encoding, the code point 0x00FC is represented by two bytes, 0xC3 0xBC:

mysql> select char(0xC3BC using utf8); 
+-------------------------+ 
| char(0xC3BC using utf8) | 
+-------------------------+ 
| ü                       |
+-------------------------+ 
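
As a rough cross-check in Python (an illustration only, not MySQL's own behaviour), the sketch below encodes ü as UTF-8 and also tries to decode a lone 0xFC byte. It assumes that CHAR(0x00FC) without a USING clause yields the single byte 0xFC and that the client shows undecodable bytes as "?"; both are assumptions about the setup in the question, not something the transcript confirms.

```python
char = "\u00fc"  # ü

# Encoding the character as UTF-8 produces two bytes.
utf8_bytes = char.encode("utf-8")
print(utf8_bytes.hex())

# Decoding those two bytes gives the character back.
round_trip = bytes.fromhex("c3bc").decode("utf-8")

# A lone 0xFC byte is not a valid UTF-8 sequence.
lone = bytes([0xFC])
try:
    lone.decode("utf-8")
    lone_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_is_valid_utf8 = False

# A lenient decoder substitutes a replacement character instead of failing,
# which is comparable to a terminal printing "?".
replaced = lone.decode("utf-8", errors="replace")
```

Under those assumptions, the "?" in the question would be the terminal's way of showing a byte that cannot be read as UTF-8.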

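UTF-8 is merely one way of encoding Unicode characters in binary form. It is designed to save space, which is why ASCII characters take only one byte, while iso-8859-1 characters such as ü take two bytes. Some other characters need three or four bytes, but they are less common.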
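
To make the byte counts concrete, here is a small Python sketch. The helper name `utf8_two_byte` and the sample characters are made up for illustration; the helper handles only the two-byte range (U+0080 to U+07FF) and is not a full encoder.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point is outside the two-byte UTF-8 range")
    lead = 0b11000000 | (code_point >> 6)           # 110xxxxx: upper 5 bits
    trail = 0b10000000 | (code_point & 0b00111111)  # 10xxxxxx: lower 6 bits
    return bytes([lead, trail])


# 0x00FC splits into the bits 00011 and 111100, giving 0xC3 and 0xBC.
manual = utf8_two_byte(0x00FC)

# One sample per UTF-8 length: ASCII, Latin-1 range, euro sign, an emoji.
samples = ["A", "\u00fc", "\u20ac", "\U0001F600"]
lengths = [len(s.encode("utf-8")) for s in samples]
print(manual.hex(), lengths)
```

The manual result matches Python's built-in encoder for ü, and the four samples come out at one, two, three and four bytes respectively.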

Thanks, very helpful. – jason 2010-03-05 05:10:17