2014-07-08 75 views
2

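Table field value from csv file and foreign key - LOAD DATA INFILE

I'm currently using MySQL LOAD DATA INFILE to insert CSV values into a table named test. Everything worked fine until things got more complicated. I have another table, occupation, which contains the occupation_id that I use as a foreign key in the test table. The raw CSV file only comes with the following fields: First Name, Last Name, Age, Date Of Birth, Occupation (see the sample values below). I want to work out the occupation_id from the CSV text field Occupation. How can this be done?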

CSV file column headers with their respective values

+------------+-----------+-----+---------------+------------+
| First Name | Last Name | Age | Date of Birth | Occupation |
+------------+-----------+-----+---------------+------------+
| Lionel     | Messi     |  27 | 6/24/1987     | Soccer     |
| Michael    | Jordan    |  51 | 2/17/1963     | Basketball |
| Lebron     | James     |  30 | 12/30/1984    | Actor      |
+------------+-----------+-----+---------------+------------+

occupation

+---------------+-----------------+
| occupation_id | occupation_name |
+---------------+-----------------+
|             1 | Basketball      |
|             2 | Soccer          |
|             3 | Actor           |
+---------------+-----------------+

Expected result after the CSV is inserted into table test

+------------+-----------+-----+------------+---------------+-----------------+
| first_name | last_name | age | dob        | occupation_id | occupation_name |
+------------+-----------+-----+------------+---------------+-----------------+
| Lionel     | Messi     |  27 | 1987-06-24 |             2 | Soccer          |
| Michael    | Jordan    |  51 | 1963-02-17 |             1 | Basketball      |
| Lebron     | James     |  30 | 1984-12-30 |             3 | Actor           |
+------------+-----------+-----+------------+---------------+-----------------+

PHP/SQL - my query so far

$db_insert = $db_con->prepare("LOAD DATA LOCAL INFILE '".$filename."' 
    INTO TABLE test FIELDS TERMINATED BY ',' 
    OPTIONALLY ENCLOSED BY '\"' 
    LINES TERMINATED BY '\r\n' 
    IGNORE 1 LINES 
    (@column1, @column2, @column3, @column4, @column5) 
    SET first_name=@column1, last_name=@column2, age=@column3, dob = STR_TO_DATE(@column4, '%m/%d/%Y'), occupation_name=@column5
"); 
$db_insert->execute(); 

Answers

1

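I wouldn't try to do this in the LOAD DATA statement. In theory, you could run a subquery inside the LOAD DATA statement to look up the corresponding occupation_id, but even if that works, it would hurt the performance of the bulk load.

Here's how it would look, but I expect performance to be terrible if you're loading more than a trivial number of rows: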

LOAD DATA LOCAL INFILE 't.csv' 
INTO TABLE test FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '\"' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 LINES 
(@column1, @column2, @column3, @column4, @column5) 
SET first_name=@column1, last_name=@column2, age=@column3,
    dob = STR_TO_DATE(@column4, '%m/%d/%Y'), occupation_name=@column5,
    occupation_id=(SELECT occupation_id FROM occupation WHERE occupation_name=@column5 LIMIT 1);

Instead, I would do the LOAD DATA and leave occupation_id NULL. Then, after the LOAD DATA finishes, run an UPDATE that joins to the other table:

UPDATE test JOIN occupation ON test.occupation_name = occupation.occupation_name
SET test.occupation_id = occupation.occupation_id;
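
One thing that may be worth checking afterwards: any CSV occupation without an exact match in the occupation table keeps a NULL occupation_id. A minimal sketch of such a check, assuming occupation_id is nullable in test and that the text column in test is named occupation_name as in the question's result table (adjust the name if yours differs):

```sql
-- List occupation values from the CSV that the UPDATE could not match.
-- Assumption: unmatched rows still have occupation_id = NULL after the UPDATE.
SELECT occupation_name, COUNT(*) AS unmatched_rows
FROM test
WHERE occupation_id IS NULL
GROUP BY occupation_name;
```

If this returns rows, the usual culprits are spelling differences or stray whitespace in the CSV values.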
+0

+1 This is a good approach. When I do everything inside LOAD DATA, I get the following error: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'occupation_name' in 'field list'

+0

I edited the above to hopefully match the column names you listed.

0

First, I would get rid of the field test.occupation_name.
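
Dropping the column could look like the following sketch, assuming test currently has the occupation_name column shown in the question's result table:

```sql
-- Assumption: test has an occupation_name column that duplicates
-- what occupation_id already expresses via the occupation table.
ALTER TABLE test DROP COLUMN occupation_name;
```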

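Then you can do it in two steps:

  1. LOAD DATA the file as-is into a table whose structure mirrors the CSV. Let's call it test_csv, and use field names compatible with those of test.
  2. Run the following: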

INSERT INTO test (first_name, last_name, age, dob, occupation_id)
SELECT tc.first_name, tc.last_name, tc.age, tc.dob, o.occupation_id
FROM test_csv tc
JOIN occupation o ON (tc.occupation_name = o.occupation_name);

You'll end up with the test table referencing the occupation table.

Hope this helps.
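
For step 1, the staging load could be sketched roughly as follows. The column types and the file name t.csv are assumptions, not something taken from the question's schema. The sketch also converts the date while loading so that test_csv.dob is already a DATE when the INSERT ... SELECT runs, which is a small departure from loading strictly "as-is":

```sql
-- Hypothetical staging table; column types are guesses.
CREATE TABLE test_csv (
  first_name      VARCHAR(100),
  last_name       VARCHAR(100),
  age             INT,
  dob             DATE,
  occupation_name VARCHAR(100)
);

-- Same file options as in the question; only the target table
-- and the column list differ.
LOAD DATA LOCAL INFILE 't.csv'
INTO TABLE test_csv
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(first_name, last_name, age, @dob, occupation_name)
SET dob = STR_TO_DATE(@dob, '%m/%d/%Y');
```

Once the INSERT ... SELECT has copied the rows into test, the staging table can be removed with DROP TABLE test_csv.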