2015-05-18 65 views
1

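Can't get UTF-8 special characters to write correctly to MySQL (PHP)

I am writing a PHP script that runs on the command line on a typical LAMP stack (L = OS X), and I am having a lot of trouble getting special characters recorded correctly in the database.

The script recursively scans a directory and inserts each full path into a MySQL table. I have done a lot of research on how to get special characters written to MySQL, but they still show up as ? characters.

Here is the code: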

<?PHP 
ini_set('default_charset', 'UTF-8'); 


$link = mysql_connect('localhost', '--USER--', '--PASSWORD--'); 
mysql_set_charset('utf8',$link); 

if (!$link) { 
    die('Could not connect: ' . mysql_error()); 
} 

if(!mysql_select_db("files")) { 
    die('Could not connect: ' . mysql_error()); 
} 

mysql_query("SET NAMES utf8"); 
mysql_query("SET CHARACTER SET utf8"); 

function startsWith($haystack, $needle) { 
    return $needle === "" || strrpos($haystack, $needle, -strlen($haystack)) !== FALSE; 
} 

function getDirContents($dir, &$results = array()) { 
    $files = scandir($dir); 
    foreach($files as $key => $value) { 
      $path = realpath($dir.DIRECTORY_SEPARATOR.$value); 
      if(startsWith($path,'/Volumes/Macintosh HD/')) { 
        unset($files[$key]); 
      } else if(!is_dir($path) && !startsWith($value,'.') && startsWith($path,'/Volumes/')) { 
        $results[] = $path; 
        $query="INSERT IGNORE INTO files (path,dir) VALUES ('$path','0')"; 
        mysql_query($query); 
      } else if(is_dir($path) && !startsWith($value,'.') && startsWith($path,'/Volumes/')) { 
        getDirContents($path, $results); 
        $results[] = $path; 
        $query="INSERT IGNORE INTO files (path,dir) VALUES ('$path','1')"; 
        mysql_query($query); 
      } 
    } 
    return $results; 
} 


$directory='/Volumes'; 
$files=getDirContents($directory); 
sort($files); 
print_r($files); 

?> 

The problematic path is:

/Volumes/Mac Stadium Shuttle 1/DIG2008060702/files/Susan-Jürgen.dvdproj/Contents/PkgInfo 

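Note the ü (u with umlaut) in Jürgen. When the script prints all the files in the array, the ü displays correctly.

If I add a line to the PHP script to print the query passed to mysql_query(), I get the following: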

INSERT IGNORE INTO files (path,dir) VALUES ('/Volumes/Mac Stadium Shuttle 1/DIG2008060702/files/Susan-Jürgen.dvdproj/Contents/PkgInfo','0') 

Again, the ü displays correctly.

From the MySQL command-line client, I SELECT this path:

mysql> select * from files where path like '%susan%'; 

...and the response:

+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+-----------+------+---------------+ 
| ID  | path                                         | dir | google_id | md5 | deleted_local | 
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+-----------+------+---------------+ 
| 644990 | /Volumes/Mac Stadium Shuttle 1/DIG2008060702/files/Susan-Ju?rgen.dvdproj/Contents/PkgInfo                    | 0 | NULL  | NULL | 0    | 
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+-----------+------+---------------+ 

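...notice that the ü in Jürgen shows up as u? (Ju?rgen).

I have tried to make sure that:

  • php.ini has the default charset set to UTF-8
  • The table's default character set is utf8
  • The database connection is set to utf8

I added phpinfo(); near the top of this script (after the ini_set() call) and ran it from the CLI. default_charset => UTF-8 => UTF-8 appears in the output.

After connecting to the database in the script, I added echo mysql_client_encoding($link); and the script prints utf8.

I also ran: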

mysql> show variables like 'char%';

Response:

+--------------------------+--------------------------------------------------------+ 
| Variable_name   | Value             | 
+--------------------------+--------------------------------------------------------+ 
| character_set_client  | utf8             | 
| character_set_connection | utf8             | 
| character_set_database | utf8             | 
| character_set_filesystem | binary             | 
| character_set_results | utf8             | 
| character_set_server  | utf8             | 
| character_set_system  | utf8             | 
| character_sets_dir  | /usr/local/mysql-5.6.24-osx10.8-x86_64/share/charsets/ | 
+--------------------------+--------------------------------------------------------+ 
8 rows in set (0.05 sec) 

So, what am I doing wrong?

EDIT: The table structure is:

mysql> DESCRIBE files; 
+---------------+------------------+------+-----+---------+----------------+ 
| Field   | Type    | Null | Key | Default | Extra   | 
+---------------+------------------+------+-----+---------+----------------+ 
| ID   | int(11) unsigned | NO | PRI | NULL | auto_increment | 
| path   | varchar(510)  | YES | UNI | NULL |    | 
| dir   | enum('0','1') | YES |  | 0  |    | 
| google_id  | varchar(255)  | YES |  | NULL |    | 
| md5   | varchar(255)  | YES |  | NULL |    | 
| deleted_local | enum('0','1') | YES |  | 0  |    | 
+---------------+------------------+------+-----+---------+----------------+ 
6 rows in set (0.00 sec) 

ANOTHER EDIT:

mysql> show create table files; 
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
| Table | Create Table                                                                                                                     | 
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
| files | CREATE TABLE `files` (
     `ID` int(11) unsigned NOT NULL AUTO_INCREMENT, 
     `path` varchar(510) CHARACTER SET latin1 DEFAULT NULL, 
     `dir` enum('0','1') CHARACTER SET latin1 DEFAULT '0', 
     `google_id` varchar(255) CHARACTER SET latin1 DEFAULT NULL, 
     `md5` varchar(255) CHARACTER SET latin1 DEFAULT NULL, 
     `deleted_local` enum('0','1') CHARACTER SET latin1 DEFAULT '0', 
     PRIMARY KEY (`ID`), 
     UNIQUE KEY `path` (`path`) 
) ENGINE=InnoDB AUTO_INCREMENT=961879 DEFAULT CHARSET=utf8 | 
    +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
    1 row in set (0.04 sec) 

Can you show the table structure? It should look something like 'CREATE `my_table` (...) ENGINE=MyISAM DEFAULT CHARSET=utf8;' – Axalix

@Axalix I just added the SHOW CREATE TABLE output to the main question above... thanks – andrewniesen

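OK, I think you should split this problem up; it could be MySQL, PHP, or both. Start with MySQL: in the MySQL console, try an INSERT with that character, then a SELECT, and see whether it works. If it doesn't, we'll focus on MySQL first. When I run 'insert into files (`path`) values ('Jürgen'); select * from files;' on my machine, it returns *Jürgen* – Axalix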

Answers

2

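As the second edit shows, the path column has the latin1 character set, even though the table defaults to utf8. Perhaps you got into this state by altering an existing table?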
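
To spot this kind of mismatch without reading the dump by eye, a small helper can compare each column's CHARACTER SET clause with the table default. This is only a sketch: findCharsetOverrides() is a made-up name, the regexes assume the layout that SHOW CREATE TABLE printed in the question, and the sample DDL is a shortened copy of that output.

```php
<?php
// Sketch: list columns in a SHOW CREATE TABLE dump whose explicit
// CHARACTER SET differs from the table's DEFAULT CHARSET.
// findCharsetOverrides() is a hypothetical helper, not a MySQL or PHP API.
function findCharsetOverrides(string $ddl): array
{
    // Table-level default, e.g. "DEFAULT CHARSET=utf8".
    if (!preg_match('/DEFAULT CHARSET=(\w+)/i', $ddl, $m)) {
        return [];
    }
    $tableCharset = strtolower($m[1]);

    // Column definitions that carry their own CHARACTER SET clause.
    preg_match_all(
        '/^\s*`(\w+)`[^\n]*?CHARACTER SET (\w+)/mi',
        $ddl,
        $cols,
        PREG_SET_ORDER
    );

    $overrides = [];
    foreach ($cols as $col) {
        $charset = strtolower($col[2]);
        if ($charset !== $tableCharset) {
            $overrides[$col[1]] = $charset; // column name => column charset
        }
    }
    return $overrides;
}

// Shortened copy of the dump from the question.
$ddl = <<<'SQL'
CREATE TABLE `files` (
  `ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `path` varchar(510) CHARACTER SET latin1 DEFAULT NULL,
  `dir` enum('0','1') CHARACTER SET latin1 DEFAULT '0',
  PRIMARY KEY (`ID`),
  UNIQUE KEY `path` (`path`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
SQL;

print_r(findCharsetOverrides($ddl));
```

On the shortened dump it reports path and dir as latin1, which matches what the second edit shows.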

Try changing the column's character set: ALTER TABLE files MODIFY path VARCHAR(510) CHARACTER SET utf8;
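
One caveat worth checking before widening the character set: path carries a UNIQUE key, and utf8 reserves up to 3 bytes per character where latin1 needs 1. Assuming the server applies the 767-byte InnoDB key-prefix limit (common on MySQL 5.6 with the COMPACT row format and innodb_large_prefix off, but an assumption about this particular setup), the arithmetic looks like this. The constant and function names are illustrative only.

```php
<?php
// Sketch: worst-case byte length of an index on a VARCHAR column.
// The 767-byte figure is an ASSUMED InnoDB key-prefix limit
// (COMPACT row format, innodb_large_prefix off); check your server.
const ASSUMED_INNODB_KEY_LIMIT = 767;

// Maximum bytes per character for the character sets discussed here.
const MAX_BYTES_PER_CHAR = ['latin1' => 1, 'utf8' => 3, 'utf8mb4' => 4];

// Worst-case index size: declared length times bytes per character.
function indexBytes(int $chars, string $charset): int
{
    return $chars * MAX_BYTES_PER_CHAR[$charset];
}

function fitsInIndex(int $chars, string $charset): bool
{
    return indexBytes($chars, $charset) <= ASSUMED_INNODB_KEY_LIMIT;
}

foreach ([[510, 'latin1'], [510, 'utf8'], [255, 'utf8']] as [$chars, $charset]) {
    printf(
        "VARCHAR(%d) %-7s -> %4d bytes, fits: %s\n",
        $chars,
        $charset,
        indexBytes($chars, $charset),
        fitsInIndex($chars, $charset) ? 'yes' : 'no'
    );
}
```

Under that assumption VARCHAR(510) in utf8 would need 1530 bytes, so the ALTER may be rejected with a key-length error, while VARCHAR(255) needs 765 bytes and fits.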

latin1 works for *Jürgen*: https://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html – Axalix

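That did the trick. 'ALTER TABLE files MODIFY COLUMN path varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci UNIQUE;' Strangely, as @Axalix mentioned, latin1 works fine with umlauts. I wonder whether it is because I explicitly defined UTF-8 in php.ini and latin1 in the database? I also had the default character set of the table (not the column) set to utf8. – andrewniesen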
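
A possible explanation, offered as a guess and not something confirmed in this thread: the select output shows u? and not a single ?, which is what you would see if the file name were stored decomposed (u followed by the combining diaeresis U+0308), as HFS+ volumes on OS X generally do. latin1 has a slot for the precomposed ü but none for the combining mark. The sketch below reproduces that with mbstring, using ISO-8859-1 as a stand-in for MySQL's latin1; toLatin1() is an illustrative name.

```php
<?php
// Use '?' for characters the target encoding cannot represent,
// mirroring what shows up in the SELECT output.
mb_substitute_character(0x3F);

// Two spellings of the same visible name.
$precomposed = "J\u{00FC}rgen";   // single code point U+00FC
$decomposed  = "Ju\u{0308}rgen";  // 'u' followed by combining diaeresis U+0308

// Stand-in for storing UTF-8 input in a latin1 column.
function toLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

echo bin2hex($precomposed), "\n"; // UTF-8 bytes of the precomposed form
echo bin2hex($decomposed), "\n";  // UTF-8 bytes of the decomposed form
echo toLatin1($decomposed), "\n"; // prints Ju?rgen

// If the intl extension is loaded, NFC normalization folds the pair
// into the single code point, which latin1 can represent.
if (class_exists('Normalizer')) {
    $nfc = Normalizer::normalize($decomposed, Normalizer::FORM_C);
    var_dump($nfc === $precomposed);
}
```

If that is what is happening here, normalizing paths to NFC before the INSERT would be another way to keep them representable in latin1, though a utf8 column stores either form.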

@EwanMellor Yes, I set the default character set on the table after creating it. – andrewniesen

0

1. Set the collation to utf8_unicode_ci.

2. Change the meta tag:

META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"

  • You can wrap echoed values in utf8_encode($value); in your page.
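
A caution on the last suggestion: utf8_encode() assumes its input is ISO-8859-1, so applying it to text that is already UTF-8 encodes it a second time. The sketch below shows the effect using the equivalent mb_convert_encoding() call (utf8_encode() itself is deprecated as of PHP 8.2); latin1ToUtf8() is an illustrative name.

```php
<?php
// Equivalent of utf8_encode(): interpret every input byte as an
// ISO-8859-1 character and re-encode it as UTF-8.
function latin1ToUtf8(string $s): string
{
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$alreadyUtf8 = "J\u{00FC}rgen";        // valid UTF-8: ü is the bytes c3 bc
$twice = latin1ToUtf8($alreadyUtf8);   // each of those bytes is encoded again

echo bin2hex($alreadyUtf8), "\n"; // 4ac3bc7267656e
echo bin2hex($twice), "\n";       // 4ac383c2bc7267656e

// Checking first avoids re-encoding text that is already valid UTF-8.
$safe = mb_check_encoding($alreadyUtf8, 'UTF-8')
    ? $alreadyUtf8
    : latin1ToUtf8($alreadyUtf8);
var_dump($safe === $alreadyUtf8);
```

Since the paths in the question already print correctly in a UTF-8 terminal, they are presumably UTF-8 to begin with, so this conversion would likely make things worse here.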