2012-03-02 80 views
1

New to MySQL, coming from R: reformat (reshape) a data table from long to wide using MySQL

I have a data table with two columns, something like the one below, holding level-2 IDs and nested IDs:

level2id | nestedid | 
1  | 1  | 
1  | 2  | 
1  | 3  | 
2  | 1  | 
2  | 2  | 
... 

I would like to restructure the data into a new table like this using MySQL:

level2id | nestedid1 | nestedid2 | nestedid3 | 
1  | 1   | 2   | 3   | 
2  | 1   | 2   |   | 
... 

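This is so that I can later perform joins that pull in information about the nested ids and create aggregate values for variables associated with the level2 id. Doing this in R with reshape for "time-varying" data is trivial, but I cannot find an obvious solution for this particular format (i.e. the data are not organized into attribute-name and attribute-value columns).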

+0

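'I would like to restructure the data into a new table like this using MySQL' --- this is a very bad idea. What is the original reason for doing this? – zerkms 2012-03-02 03:15:21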

+2

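I don't think this is easy in MySQL - if you are already using R, I'd suggest doing the reshaping in R. You could try the [`sqldf`](http://code.google.com/p/sqldf/) package to run SQL-like queries on data frames. – 2012-03-02 03:17:23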

+0

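There are many reasons for doing this kind of thing. In my case, I need to collect information about individuals (nestedid) and aggregate it at the household level (level 2). I don't just want a bunch of cross-tabs, though, because the specific relationships between the nestedids matter. – SMM 2012-03-02 03:31:32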

Answers

0

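Although you can't do this as a single SELECT, you can achieve it with INSERT statements. This only works if level2id is the primary key of the new table or has a unique index on it.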

Table structure:

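CREATE TABLE `table2` (
    `level2id` int(11) NOT NULL DEFAULT '0',
    -- nestedid columns are nullable: each INSERT below fills only one of
    -- them, and a level2id with fewer nested ids leaves the rest empty
    `nestedid1` int(11) DEFAULT NULL,
    `nestedid2` int(11) DEFAULT NULL,
    `nestedid3` int(11) DEFAULT NULL,
    PRIMARY KEY (`level2id`)
) ENGINE=InnoDB;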

The INSERT statements (replace table1 with the name of your existing table):

INSERT INTO table2 (level2id, nestedid1) SELECT level2id, nestedid FROM table1 WHERE nestedid = 1 ON DUPLICATE KEY UPDATE nestedid1 = nestedid; 
INSERT INTO table2 (level2id, nestedid2) SELECT level2id, nestedid FROM table1 WHERE nestedid = 2 ON DUPLICATE KEY UPDATE nestedid2 = nestedid; 
INSERT INTO table2 (level2id, nestedid3) SELECT level2id, nestedid FROM table1 WHERE nestedid = 3 ON DUPLICATE KEY UPDATE nestedid3 = nestedid; 

ON DUPLICATE KEY UPDATE is a MySQL extension; more details are here: http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
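To make the mechanism concrete, here is a minimal sketch of how ON DUPLICATE KEY UPDATE behaves when two inserts target the same key. The table `demo_wide` and its columns are hypothetical names used only for illustration. `VALUES(col)` refers to the value the INSERT would have written; it is deprecated in recent MySQL 8.0 releases but still accepted.

```sql
-- Hypothetical demo table; names and values are illustrative only.
CREATE TABLE demo_wide (
    id INT NOT NULL,
    a  INT DEFAULT NULL,
    b  INT DEFAULT NULL,
    PRIMARY KEY (id)
);

-- No row with id = 1 exists yet, so this inserts the row (1, 10, NULL).
INSERT INTO demo_wide (id, a) VALUES (1, 10)
ON DUPLICATE KEY UPDATE a = VALUES(a);

-- id = 1 now collides with the primary key. Instead of failing, the
-- statement updates the existing row, which becomes (1, 10, 20).
INSERT INTO demo_wide (id, b) VALUES (1, 20)
ON DUPLICATE KEY UPDATE b = VALUES(b);

SELECT * FROM demo_wide;
```

Without a primary key or unique index on `id`, the second statement would simply add a second row, which is why the key requirement matters for this approach.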

0

You can use MySQL to generate a MySQL program that solves this problem:

USE test; 

/*Create long input table 'test' with variables of varying length*/ 
DROP TABLE IF EXISTS nums; 
CREATE TABLE nums (id INT(2)); 
INSERT INTO nums 
VALUES 
(0), (1), (2), (3), (4), (5), (6), (7); 

DROP TABLE IF EXISTS test; 
CREATE TABLE test (id INT(2), var VARCHAR(5), attribute VARCHAR(6), PRIMARY KEY (id, var)); 
INSERT INTO test 
SELECT nums3.*, REPEAT(CHAR(97+RAND()*24),CAST(6.*RAND() AS SIGNED)) AS attribute 
FROM (SELECT DISTINCT nums2.id1 as id, CONCAT('var', LPAD(CAST(16.*RAND() AS SIGNED),2,'0')) AS var 
FROM (SELECT DISTINCT nums.id as id1, nums1.id as id2 FROM nums, nums as nums1) AS nums2) AS nums3; 

/*Create SQL program to convert long to wide format (R: reshape)*/ 
SELECT DISTINCT CONCAT('DROP TABLE IF EXISTS result;\nCREATE TABLE result (id INT(2), 
', GROUP_CONCAT(CONCAT(field) SEPARATOR ', '), ');') 
FROM 
(SELECT DISTINCT CONCAT(var, CONCAT(' VARCHAR(', max(length(attribute)), ')')) AS field 
FROM test GROUP BY var) AS fields 

UNION 

SELECT CONCAT("INSERT INTO result \nSELECT DISTINCT test.id, ", GROUP_CONCAT(var SEPARATOR '.attribute, '), 
".attribute FROM (SELECT DISTINCT id FROM test) AS test") 
FROM (SELECT DISTINCT var FROM test ORDER BY var) as vars 

UNION 

SELECT CONCAT("LEFT JOIN test AS ", var, " ON test.id = ", var, ".id AND ", var, ".var=", '"', var, '"') 
FROM (SELECT DISTINCT var FROM test ORDER BY var) as vars 

UNION 

SELECT ";" ; 

/*Copy output to screen editor, delete '|' symbols and superfluous white spaces. 
Then copy to MySQL prompt, run by pressing 'enter' key and view 'result'*/
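
For orientation, the following is a hand-written sketch of the kind of program the generator above emits, shown on a tiny made-up data set. The table names `long_demo` and `wide_demo` and all values are assumptions chosen for illustration; the real generated text uses the tables `test` and `result` and depends on the random data.

```sql
-- Hypothetical miniature of the long table (values are made up).
DROP TABLE IF EXISTS long_demo;
CREATE TABLE long_demo (
    id INT,
    var VARCHAR(5),
    attribute VARCHAR(6),
    PRIMARY KEY (id, var)
);
INSERT INTO long_demo VALUES
    (0, 'var01', 'aa'),
    (0, 'var02', 'b'),
    (1, 'var01', 'ccc');

-- One output column per distinct var, sized to its longest attribute.
DROP TABLE IF EXISTS wide_demo;
CREATE TABLE wide_demo (id INT, var01 VARCHAR(3), var02 VARCHAR(1));

-- One LEFT JOIN per distinct var; ids lacking a given var get NULL.
INSERT INTO wide_demo
SELECT DISTINCT ids.id, var01.attribute, var02.attribute
FROM (SELECT DISTINCT id FROM long_demo) AS ids
LEFT JOIN long_demo AS var01 ON ids.id = var01.id AND var01.var = 'var01'
LEFT JOIN long_demo AS var02 ON ids.id = var02.id AND var02.var = 'var02';

SELECT * FROM wide_demo ORDER BY id;
-- id 0: var01 = 'aa',  var02 = 'b'
-- id 1: var01 = 'ccc', var02 = NULL
```

Because each distinct var adds one join, this pattern suits a modest number of output columns.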