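I've found an odd behavior in Postgres's crosstab function that I can't explain, but I'm hoping someone else might... Why does a null date cause the crosstab function to fail?
The version of the crosstab function I'm using requires a preliminary table to be built first.
This SQL successfully creates the preliminary table: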
SELECT
ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well AS "rowname",
M.marker AS "bucket",
G.allele1||' '||G.allele2 AS "bucket_value"
INTO TABLE ct
FROM
geno.gmarkers M,
geno.genotypes G,
geno.gsamples S,
geno.guploads U,
geno.gibg_studies ST
WHERE
G.markers_id=M.id
AND G.gsamples_id=S.id
AND S.guploads_id=U.id
AND U.ibg_study_id=ST.id
AND (M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3)
AND (S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015')
ORDER BY box,well;
and produces output like this:
         rowname          |  bucket   | bucket_value
--------------------------+-----------+--------------
 LTS CL100001 10011 1 A01 | 5HTTLPR-T | S La
 LTS CL100001 10011 1 A01 | 5HTTLPR-D | 14 16
 LTS CL100001 10011 1 A01 | DAT1      | 440 480
 LTS CL100001 10011 1 A01 | DRD4      | 475 475
 LTS CL100001 10011 1 A01 | Caspi     | 351 351
 LTS CL100009 10420 1 A02 | Caspi     |
 LTS CL100009 10420 1 A02 | 5HTTLPR-T | La Lg
 LTS CL100009 10420 1 A02 | 5HTTLPR-D | 16 16
 LTS CL100009 10420 1 A02 | DAT1      | 440 480
 LTS CL100009 10420 1 A02 | DRD4      | 475 475
...
However, if I try to include the date column, which contains nothing but nulls, like this:
SELECT
ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well||' '||G.run_date::text AS "rowname",
M.marker AS "bucket",
G.allele1||' '||G.allele2 AS "bucket_value"
INTO TABLE ct
FROM
geno.gmarkers M,
geno.genotypes G,
geno.gsamples S,
geno.guploads U,
geno.gibg_studies ST
WHERE
G.markers_id=M.id
AND G.gsamples_id=S.id
AND S.guploads_id=U.id
AND U.ibg_study_id=ST.id
AND (M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3)
AND (S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015')
ORDER BY box,well;
This produces the output:
 rowname |  bucket   | bucket_value
---------+-----------+--------------
         | 5HTTLPR-T | S La
         | 5HTTLPR-D | 14 16
         | DAT1      | 440 480
         | DRD4      | 475 475
         | Caspi     | 351 351
         | Caspi     |
         | 5HTTLPR-T | La Lg
         | 5HTTLPR-D | 16 16
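As you can see, appending the run_date column to the end of the composite "rowname" column blanks out the entire composite value... which is crazy. If I populate run_date with dummy data, the value shows up.... but if run_date is null or empty, "rowname" comes out blank.
I can't say whether this is a bug in Postgres, but it's an odd result that I'd like to work around if possible.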
TIA, rixter
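The result described above is consistent with standard SQL NULL handling rather than with anything specific to crosstab: when any operand of || is NULL, the whole concatenation is NULL, and psql prints NULL as blank by default. A minimal sketch, with literal values standing in for the real columns (the literals are illustrative, not taken from the actual tables):

```sql
-- Literal values stand in for the real columns (illustrative data only).
SELECT 'LTS' || ' ' || 'CL100001'                      AS without_null,  -- 'LTS CL100001'
       'LTS' || ' ' || 'CL100001' || ' ' || NULL::text AS with_null;     -- NULL
```

The second column is NULL even though only the last operand is NULL, which matches the blank "rowname" values shown in the question.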
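Thanks! Will look into the coalesce function. – rixter 2012-04-04 19:26:29
You're welcome! Just don't report this as a bug in PostgreSQL :P – 2012-04-04 19:27:49
This worked, using a dummy value that the users agreed to: || coalesce(G.run_date,'01/01/1900')::text – rixter 2012-04-04 19:33:19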
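For anyone who would rather not show a dummy date, here is a rough sketch of two alternatives. It assumes run_date is a date column (the question does not state its type) and uses an inline VALUES list with made-up rows in place of the real geno tables. Coalescing to an empty string keeps the rest of the row name intact, and concat_ws (available in PostgreSQL 9.1 and later) skips NULL arguments altogether:

```sql
-- Hypothetical stand-in rows; run_date is assumed to be of type date.
SELECT
    -- Option 1: replace a NULL date with an empty string before concatenating.
    studyabrv || ' ' || labid || ' ' || coalesce(run_date::text, '') AS rowname_coalesce,
    -- Option 2: concat_ws ignores NULL arguments, so no separator is left dangling.
    concat_ws(' ', studyabrv, labid, run_date::text)                 AS rowname_concat_ws
FROM (VALUES
        ('LTS', 'CL100001', NULL::date),
        ('LTS', 'CL100009', DATE '2012-04-04')
     ) AS t(studyabrv, labid, run_date);
```

Note that option 1 leaves a trailing space when run_date is NULL, while option 2 does not; which one is preferable depends on how the row names are matched later in the crosstab step.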