2016-10-31

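PostgreSQL: display correlations across columns

I have several tables, and I want to find out how to present the correlation of one column with each of the other columns as rows in a result table.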

For example, suppose I have two tables:

  • Batting (key: Team, battingAverage, slugging, etc.)
  • Record (key: Team, wins, losses)

If I wanted the output to be a table like

attribute  | correlation 
battingAverage | .025 
slugging  | .005 
... 

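How would I go about achieving this? I know I can use the CORR function to find the correlation between two columns, but I'm confused about how to apply it to a whole set of columns, and how to display each column and its correlation together in one row.

Right now I'm trying to hard-code it using (values (), (), ...), but I get an error saying that more than one row was returned by a subquery used as an expression. Yet "SELECT" appears only once in my query, and I can't see any such expression either.

Here is the query I have mocked up for now (my project has nothing to do with baseball, but I made this up for the sake of an example):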

SELECT attributes.attribute, (values 
    (CORR(Record.wins,Batting.BattingAverage)), 
    (CORR(Record.wins,Batting.slugging)), 
    (CORR(Record.wins,batting.OBP)), 
    (CORR(Record.wins,batting.HomeRuns))) 
AS correlation 
FROM Batting LEFT JOIN Record ON Batting.Team = Record.Team,(values 
    ('Batting Average'), 
    ('Slugging'), 
    ('OBP'), 
    ('Home Runs')) attributes(attribute) 
GROUP BY attributes.attribute; 

Answer


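If you want one row per column, you have to generate those rows somehow. You are trying to do that with a cross join (which would be slightly more readable if you wrote it as CROSS JOIN instead of a comma). But you did not tie the select clause to the desired attribute. The parenthesized VALUES list in your select list is itself a subquery, and it returns four rows where a single value is expected, which is where the error comes from.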

SELECT 
    attributes.attribute, 
    CORR(Record.wins, 
     case attributes.attribute 
     when 'Batting Average' then Batting.BattingAverage 
     when 'Slugging' then Batting.slugging 
     when 'OBP' then Batting.OBP 
     when 'Home Runs' then Batting.HomeRuns 
     end 
    ) AS correlation 
FROM Batting 
JOIN Record ON Batting.Team = Record.Team 
CROSS JOIN (values 
    ('Batting Average'), 
    ('Slugging'), 
    ('OBP'), 
    ('Home Runs')) attributes(attribute) 
GROUP BY attributes.attribute; 

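However, I'm not quite sure whether building the Cartesian product affects the correlation. I don't think it does, because all rows are multiplied by the same factor, but I'm not familiar with statistics or with how the correlation is calculated.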
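On the Cartesian product worry: because the query groups by attributes.attribute, each group contains every joined Batting/Record row exactly once, so CORR sees the same rows it would see without the cross join. Even if every row were repeated the same number of times, the Pearson coefficient should not move, since the repeat factor cancels out of the covariance and both variances. A minimal sketch to check this in PostgreSQL, using made-up numbers in a hypothetical CTE named pts rather than the real tables:

```sql
-- Hypothetical sample data; pts, x, y and copies are illustrative names only.
WITH pts(x, y) AS (
    VALUES (1.0, 2.0), (2.0, 3.5), (3.0, 3.0), (4.0, 6.0)
)
SELECT
    -- correlation over the four original rows
    (SELECT CORR(x, y) FROM pts) AS plain,
    -- the same rows, each repeated four times through a cross join
    (SELECT CORR(x, y)
       FROM pts CROSS JOIN generate_series(1, 4) AS copies(n)) AS replicated;
```

Both columns should come back equal, up to floating-point rounding.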

I would rather keep it simple and safe:

SELECT 'Batting Average' AS attribute, CORR(r.wins, b.BattingAverage) AS correlation 
    FROM Batting b JOIN Record r ON b.Team = r.Team 
UNION ALL 
SELECT 'Slugging' AS attribute, CORR(r.wins, b.slugging) AS correlation 
    FROM Batting b JOIN Record r ON b.Team = r.Team 
UNION ALL 
SELECT 'OBP' AS attribute, CORR(r.wins, b.OBP) AS correlation 
    FROM Batting b JOIN Record r ON b.Team = r.Team 
UNION ALL 
SELECT 'Home Runs' AS attribute, CORR(r.wins, b.HomeRuns) AS correlation 
    FROM Batting b JOIN Record r ON b.Team = r.Team 

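A simple alternative should be UNPIVOT. I have never used it myself, but the syntax should be easy to look up. I think UNPIVOT is actually the most suitable solution.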
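PostgreSQL has no UNPIVOT keyword (it exists in SQL Server and Oracle), so there the same idea is commonly written with a LATERAL VALUES list that turns each joined row into one (attribute, value) pair per column. A sketch using the question's table and column names, assuming PostgreSQL 9.3 or later for LATERAL and that the four columns share a compatible numeric type; the aliases a, attribute and val are illustrative:

```sql
SELECT a.attribute,
       CORR(r.wins, a.val) AS correlation
FROM Batting b
JOIN Record r ON b.Team = r.Team
-- unpivot: emit one (attribute, val) row per column of interest
CROSS JOIN LATERAL (VALUES
    ('Batting Average', b.BattingAverage),
    ('Slugging',        b.slugging),
    ('OBP',             b.OBP),
    ('Home Runs',       b.HomeRuns)
) AS a(attribute, val)
GROUP BY a.attribute;
```

Compared with the CASE version, the label and its column sit side by side in one list, so adding an attribute means adding a single line.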


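The second code block is very straightforward. Since there aren't that many attributes, I hadn't considered simply generating one row at a time. Thanks! – user3311613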