2016-09-26
1

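Select grouped rows that have a certain attribute

I want to select only the specific rows that contain a certain attribute. Here is a sample of the data I am working with: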

 
src_id            cand_source 
------            ----------- 
201609-004d7bgNDFXuIrQPXwsXrOptt2PdTdeXsjV5RJ6_mEQ mcp 
201609-004d7bgNDFXuIrQPXwsXrOptt2PdTdeXsjV5RJ6_mEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-00WmbmuIp3cwAcTNTbrgb9tTVR0AKNf-RvjXcHWPEEQ mc2 
201609-01My_orS795Hmomry3-JiCiBVimarRzRGQ9Cnornp8Q mcp 
201609-01My_orS795Hmomry3-JiCiBVimarRzRGQ9Cnornp8Q mcp 
201609-01My_orS795Hmomry3-JiCiBVimarRzRGQ9Cnornp8Q mc2 
201609-01My_orS795Hmomry3-JiCiBVimarRzRGQ9Cnornp8Q mcp 
201609-01My_orS795Hmomry3-JiCiBVimarRzRGQ9Cnornp8Q mc2 
201609-01noPFGBCqbH9jUB9MHNqPynjqW8cr24LJY917vSGTs mc2 
201609-01noPFGBCqbH9jUB9MHNqPynjqW8cr24LJY917vSGTs mc2 
201609-02ISoPEX0VVkQ0ogot49Q-e7K39Zyk2vdN1rB4Q-kl0 mc2 
201609-02ISoPEX0VVkQ0ogot49Q-e7K39Zyk2vdN1rB4Q-kl0 mc2 
201609-02LVZ8UqAaz7JCp3RAOTiIE7zH2mveiSQPBo6I6dHDc mc2 
201609-02LVZ8UqAaz7JCp3RAOTiIE7zH2mveiSQPBo6I6dHDc mc2 
201609-03dLH32kaKYVwIj4HiT1tZjCNgqgXiG-fvezX3S9QI4 mc2 
201609-03dLH32kaKYVwIj4HiT1tZjCNgqgXiG-fvezX3S9QI4 mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-0421Jatpsk9T8GOD1M_GvDrnyV4dA41IL5tDeuTxGwU mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mcp 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04HzM6NBIx_6QN91xzF9_p0RGfAQcRMeEhVFEPFZ8p4 mc2 
201609-04JzR3AMxsfQvAeq1MAgjCtMhcaqt2Z_WNmuUlYLrLM mc2 
201609-04JzR3AMxsfQvAeq1MAgjCtMhcaqt2Z_WNmuUlYLrLM mcp 

What I want to do is select only the src_ids that have at least one cand_source equal to mcp. Here is what I have tried:

SELECT * 
FROM schema.table 
WHERE src_id IN (
    SELECT src_id 
    FROM schema.table 
    WHERE batch_id = ? 
    GROUP BY src_id 
    HAVING count(cand_source = 'mcp') > 1 
) 
ORDER BY src_id, 
    match_score DESC 

However, this keeps giving me back groups of src_ids that have no cand_source equal to mcp.


As was pointed out, I was simply overcomplicating things. Here is the solution:

SELECT * 
FROM schema.table 
WHERE src_id IN (
    SELECT DISTINCT src_id 
    FROM schema.table 
    WHERE batch_id = ? 
     AND cand_source = 'mcp' 
) 
ORDER BY src_id, 
    match_score DESC 
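
To check the solution against a small data set, here is a minimal sketch that runs the same query shape in SQLite through Python's sqlite3 module. The table name candidates, the short src_id values A, B and C, and the match_score numbers are made-up stand-ins, not the real data; the database engine in the question is not stated, so SQLite is an assumption as well.

```python
import sqlite3

# Hypothetical stand-in for schema.table; short ids replace the long src_id values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidates "
    "(src_id TEXT, cand_source TEXT, batch_id INTEGER, match_score REAL)"
)
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?, ?)",
    [
        ("A", "mcp", 1, 0.9),
        ("A", "mc2", 1, 0.7),
        ("B", "mc2", 1, 0.8),  # B never has an mcp row, so it must be excluded
        ("B", "mc2", 1, 0.6),
        ("C", "mc2", 1, 0.5),
        ("C", "mcp", 1, 0.4),
    ],
)

# Same shape as the solution: keep every row whose src_id has an mcp row.
query = """
SELECT src_id, cand_source, match_score
FROM candidates
WHERE src_id IN (
    SELECT src_id
    FROM candidates
    WHERE batch_id = ?
      AND cand_source = 'mcp'
)
ORDER BY src_id, match_score DESC
"""
rows = conn.execute(query, (1,)).fetchall()
for row in rows:
    print(row)
```

Both rows of A and both rows of C come back, and B is dropped. Note that only the subquery filters on batch_id; if the same src_id can appear in several batches, the outer query may need the same filter.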

Answers

1

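If you just want the src_ids that have mcp, a straight query with a WHERE clause is enough; there is no need for conditional aggregation or anything else.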

SELECT DISTINCT 
    src_id 
FROM 
    Table 
WHERE 
    cand_source = 'mcp' 
    AND batch_id = ? 

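If you want all of the records for every src_id that has at least one cand_source of mcp, you can join that result back to the table to get all of the records.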

SELECT t.* 
FROM 
    Table t 
INNER JOIN 
    (SELECT DISTINCT src_id 
    FROM Table 
    WHERE cand_source = 'mcp' 
     AND batch_id = ?) d ON t.src_id = d.src_id 
          AND t.batch_id = ? 

Or you can use a common table expression with the awesome window functions to do this.

WITH cte AS 
(
    SELECT *, COUNT(CASE WHEN cand_source = 'mcp' THEN cand_source END) OVER (PARTITION BY src_id) as McpCount 
    FROM 
     Table 
    WHERE 
     batch_id = ? 

) 
SELECT * 
FROM 
    cte 
WHERE 
    McpCount > 0; 
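
As a rough illustration of the window-function variant, the sketch below runs it in SQLite through Python's sqlite3 module. The table name candidates and the ids A and B are hypothetical, and window functions need SQLite 3.25 or newer, which is assumed here.

```python
import sqlite3

# Hypothetical stand-in table with two groups: A has an mcp row, B does not.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidates (src_id TEXT, cand_source TEXT, batch_id INTEGER)"
)
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?)",
    [("A", "mcp", 1), ("A", "mc2", 1), ("B", "mc2", 1), ("B", "mc2", 1)],
)

# The CASE yields NULL for non-mcp rows, so COUNT only counts the mcp rows,
# and the OVER clause repeats that per-group count on every row of the group.
query = """
WITH cte AS (
    SELECT src_id,
           cand_source,
           COUNT(CASE WHEN cand_source = 'mcp' THEN 1 END)
               OVER (PARTITION BY src_id) AS mcp_count
    FROM candidates
    WHERE batch_id = ?
)
SELECT src_id, cand_source, mcp_count
FROM cte
WHERE mcp_count > 0
ORDER BY src_id, cand_source
"""
rows = conn.execute(query, (1,)).fetchall()
for row in rows:
    print(row)
```

Every row of group A carries mcp_count = 1 and is kept; the rows of group B carry 0 and are filtered out.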
+1

Doh! I was overcomplicating things. Thanks for simplifying it for me. – liltitus27

1

If you only want the src_ids, then your subquery is all you need. However, you want to count the number of matching values. Here is the verbose logic:

SELECT src_id 
FROM schema.table 
WHERE batch_id = ? 
GROUP BY src_id 
HAVING SUM(case when cand_source = 'mcp' then 1 else 0 end) > 0 

A more concise version is:

HAVING SUM((cand_source = 'mcp')::int) > 0 
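
To see how the threshold in the HAVING clause changes the result, here is a minimal sketch in SQLite via Python's sqlite3 module. The table candidates, the ids A, B and C, and the helper src_ids_with_more_than are hypothetical names introduced for illustration. Since the question asks for "at least one" mcp row, the sketch compares a threshold of 0 with a threshold of 1.

```python
import sqlite3

# Hypothetical data: A has one mcp row, B has none, C has two.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidates (src_id TEXT, cand_source TEXT, batch_id INTEGER)"
)
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?)",
    [
        ("A", "mcp", 1), ("A", "mc2", 1),
        ("B", "mc2", 1), ("B", "mc2", 1),
        ("C", "mcp", 1), ("C", "mcp", 1), ("C", "mc2", 1),
    ],
)

# Conditional aggregation: each mcp row contributes 1, every other row 0.
QUERY = """
SELECT src_id,
       SUM(CASE WHEN cand_source = 'mcp' THEN 1 ELSE 0 END) AS mcp_rows
FROM candidates
WHERE batch_id = ?
GROUP BY src_id
HAVING SUM(CASE WHEN cand_source = 'mcp' THEN 1 ELSE 0 END) > ?
ORDER BY src_id
"""


def src_ids_with_more_than(threshold):
    """Return (src_id, mcp_rows) for groups with more than `threshold` mcp rows."""
    return conn.execute(QUERY, (1, threshold)).fetchall()


at_least_one = src_ids_with_more_than(0)  # "> 0": one or more mcp rows
at_least_two = src_ids_with_more_than(1)  # "> 1": two or more mcp rows
print(at_least_one)
print(at_least_two)
```

With "> 0" both A and C qualify; with "> 1" group A, which has exactly one mcp row, is dropped.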
+0

Isn't that the same as my HAVING count(cand_source = 'mcp') > 1? – liltitus27

+1

@liltitus27 . . . No. Your count is the same as count(cand_source). The comparison is irrelevant, because count() counts the number of non-NULL values. –
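
The difference described in this comment can be checked directly. The sketch below is an illustration in SQLite through Python's sqlite3 module, with a hypothetical table candidates and made-up ids A and B. SQLite evaluates a comparison to the integer 0 or 1, so SUM works on it without a cast; other engines may need one.

```python
import sqlite3

# Hypothetical data: A has one mcp row out of two, B has three rows and no mcp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidates (src_id TEXT, cand_source TEXT)")
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?)",
    [("A", "mcp"), ("A", "mc2"), ("B", "mc2"), ("B", "mc2"), ("B", "mc2")],
)

# COUNT(expr) counts rows where expr is not NULL. The comparison is never
# NULL here, so it counts every row, exactly like COUNT(cand_source).
# SUM adds up the 0/1 results, so it counts only the matching rows.
query = """
SELECT src_id,
       COUNT(cand_source = 'mcp') AS count_of_comparison,
       COUNT(cand_source)         AS count_of_column,
       SUM(cand_source = 'mcp')   AS sum_of_comparison
FROM candidates
GROUP BY src_id
ORDER BY src_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Group B has a count of 3 even though none of its rows is mcp, which is why a HAVING count(...) > 1 filter lets it through, while its sum is 0.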