2013-10-07

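I imported the 2007 dump of the Persian Wikipedia into a local MySQL 5.6 server. Usernames written in non-Latin scripts do not seem to be stored correctly. Is there a way to fix this?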

select DISTINCT rev_user_text from revision where rev_user_text like '%?%'; 

+-------------------------------+ 
| rev_user_text     | 
+-------------------------------+ 
| 1?1?       | 
| ?        | 
| ? ?       | 
| ? ? ?       | 
| ? ????      | 
| ?. ?????????     | 
| ?.????      | 
| ?.???????      | 
| ?.????????     | 
| ??       | 
| ?? ??       | 
| ?? ?? ??      | 
| ?? ???      | 
| ?? ??? ???     | 
| ???       | 
| ??? 110      | 
| ??? ?       | 
| ??? ???      | 
| ??? ??? (?? ???)   | 
| ??? ??? ????? ???    | 
| ??? ????      | 
| ??? ???? ???     | 
| ??? ???? ?????    | 
| ??? ???? ???????    | 
| ??? ?????      | 
| ??? ????? ???     | 
| ??? ????? ????    | 
| ??? ????? ??????    | 
| ??? ?????1984     | 
| ??? ??????     | 
| ??? ???????     | 
| ??? ??????? ???    | 
| ??? ????????     | 
| ??? ??????????    | 
| ???76       | 
| ????       | 
| ???? 32      | 
| ???? ?      | 
| ???? ??      | 
| ???? ?? ? ?????    | 
| ???? ???      | 
| ???? ??? ? ????? ????   | 
| ???? ??? ????     | 
| ???? ??? ?????    | 
| ???? ??? ????? ?????   | 
| ???? ????      | 
| ???? ???? ???     | 
| ???? ???? ??? (??????)  | 
| ???? ???? ????    | 
| ????.???      | 
| ????22      | 
| ????4183      | 
| ????777      | 
| ????808      | 
| ?????       | 
| ????? - ???? ???    | 
| ????? 85 8     | 
| ????? ?      | 
| ????? ???      | 
| ????? ??? ???     | 
| ????? ??? ????    | 
| ????? ????     | 
| ????? ???? (????? ????)  | 
| ????? ???? --????? ????  | 
| ????? ???? -????? ????  | 
| ????? ???? ???    | 
| ????? ???? ????    | 
| ????? ???? ??????    | 
| ????? ?????     | 
| ????? ????? ????    | 
| ????? ????? ?????    | 
| ????? ????? ????????   | 
| ????? ??????     | 
| ????? ?????? ???    | 
……. 

Answer

You may not be using a suitable character set, such as utf8. Try recreating the table with:

CREATE TABLE revision 
(...) 
CHARACTER SET 'utf8'; 

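Or change the character set of the existing table. A plain ALTER TABLE ... CHARACTER SET only changes the default for columns added later, so use CONVERT TO, which also converts the existing character columns:

ALTER TABLE revision 
CONVERT TO CHARACTER SET utf8;

Values that were already stored as ? are lost, so the dump has to be imported again after the change.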
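
If the table definition already uses utf8, the character set of the connection used for the import is another place to look. The statements below are a minimal sketch for checking both. They assume the table is named `revision`, as in the question, and that the import runs through an ordinary client session; how the dump is actually loaded is not stated in the question.

```sql
-- List the character sets of the server, the current database,
-- and the current client connection.
SHOW VARIABLES LIKE 'character_set%';

-- Show the character set and collation of every column in the table
-- (see the Collation column for rev_user_text).
SHOW FULL COLUMNS FROM revision;

-- Assumption: the import is run from this session.
-- Tell the server that this session sends and expects UTF-8.
SET NAMES utf8;
```

If `character_set_client` or `character_set_connection` shows a single-byte set such as latin1, Persian characters may be replaced with ? before they reach the table, whatever the table's own character set is.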