2017-02-13 32 views
2

(SQL) How do I select the right row for each group?

I have the following data:

+------------+-----------+-----------+------------+--------------+
| first_name | last_name | family_id | is_primary | is_secondary |
+------------+-----------+-----------+------------+--------------+
| a          | b         |         1 |          1 |            0 |
| aa         | bb        |         1 |          0 |            0 |
| c          | d         |         1 |          0 |            0 |
| cc         | dd        |         1 |          0 |            0 |
| e          | f         |        10 |          0 |            0 |
| e          | f         |        10 |          0 |            1 |
| gg         | hh        |        10 |          0 |            1 |
| gg         | hh        |        10 |          0 |            0 |
| gg         | hh        |        10 |          0 |            0 |
| gg         | hh        |        10 |          0 |            0 |
+------------+-----------+-----------+------------+--------------+

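What I want to do is:

  • Group by family_id (so there will be two groups).
  • For each group, if some rows have is_primary equal to 1, pick one of them at random and output its first_name and last_name as the group's two columns.
  • For each group, if no row has is_primary equal to 1, find a row with is_secondary equal to 1 (any such row is fine) and output its first_name and last_name as the group's two columns.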

So, based on the logic and data described above, the correct result should be:

+-----------+------------+-----------+
| family_id | first_name | last_name |
+-----------+------------+-----------+
|         1 | a          | b         |
|        10 | e          | f         |
+-----------+------------+-----------+

or

+-----------+------------+-----------+
| family_id | first_name | last_name |
+-----------+------------+-----------+
|         1 | a          | b         |
|        10 | gg         | hh        |
+-----------+------------+-----------+

How can I write a query that returns the correct result?

Below is the script that creates the test table.

USE tempdb 
GO 
IF OBJECT_ID('dbo.mytable') IS NOT NULL DROP TABLE dbo.mytable; 
CREATE TABLE mytable (
    first_name VARCHAR(2) NOT NULL, 
    last_name VARCHAR(2) NOT NULL, 
    family_id INTEGER NOT NULL, 
    is_primary INTEGER NOT NULL, 
    is_secondary INTEGER NOT NULL); 

INSERT INTO mytable VALUES ('a','b',1,1,0); 
INSERT INTO mytable VALUES ('aa','bb',1,0,0); 
INSERT INTO mytable VALUES ('c','d',1,0,0); 
INSERT INTO mytable VALUES ('cc','dd',1,0,0); 
INSERT INTO mytable VALUES ('e','f',10,0,0); 
INSERT INTO mytable VALUES ('e','f',10,0,1); 
INSERT INTO mytable VALUES ('gg','hh',10,0,1); 
INSERT INTO mytable VALUES ('gg','hh',10,0,0); 
INSERT INTO mytable VALUES ('gg','hh',10,0,0); 
INSERT INTO mytable VALUES ('gg','hh',10,0,0); 
GO 

SELECT * FROM dbo.mytable; 

What have you tried?

Yes, I tried to solve it but failed. Let me update the question.

If you want the first result, it takes no effort at all. Simply use: select family_id, min(first_name), min(last_name) from mytable group by family_id

Answers

2

Try this approach:

;with x as (
    select *, row_number() over(partition by family_id order by is_primary desc, is_secondary desc) rn 
    from mytable 
    where is_primary+is_secondary = 1 
) 
select * from x where rn = 1 

(Thanks for the create and insert script.)

Edit: Per the OP's comment (both flags may be 1), change the WHERE clause as follows:

where is_primary = 1 or (is_primary = 0 and is_secondary = 1) 
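
To check the revised filter against the sample data without a SQL Server instance, here is a minimal sketch that runs the same ROW_NUMBER query through Python's sqlite3 module. SQLite is an assumption made purely for convenience (the answer targets SQL Server), it needs SQLite 3.25 or later for window functions, and the function name pick_representatives is hypothetical.

```python
import sqlite3

# Sample rows from the question:
# (first_name, last_name, family_id, is_primary, is_secondary)
ROWS = [
    ("a", "b", 1, 1, 0),
    ("aa", "bb", 1, 0, 0),
    ("c", "d", 1, 0, 0),
    ("cc", "dd", 1, 0, 0),
    ("e", "f", 10, 0, 0),
    ("e", "f", 10, 0, 1),
    ("gg", "hh", 10, 0, 1),
    ("gg", "hh", 10, 0, 0),
    ("gg", "hh", 10, 0, 0),
    ("gg", "hh", 10, 0, 0),
]

# The answer's query with the edited WHERE clause applied.
QUERY = """
WITH x AS (
    SELECT first_name, last_name, family_id,
           ROW_NUMBER() OVER (
               PARTITION BY family_id
               ORDER BY is_primary DESC, is_secondary DESC
           ) AS rn
    FROM mytable
    WHERE is_primary = 1 OR (is_primary = 0 AND is_secondary = 1)
)
SELECT family_id, first_name, last_name
FROM x
WHERE rn = 1
ORDER BY family_id
"""


def pick_representatives(rows):
    """Load rows into an in-memory table and return one row per family."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE mytable ("
            "first_name TEXT NOT NULL, last_name TEXT NOT NULL, "
            "family_id INTEGER NOT NULL, is_primary INTEGER NOT NULL, "
            "is_secondary INTEGER NOT NULL)"
        )
        conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pick_representatives(ROWS):
        print(row)
```

Family 1 always yields a/b. Family 10 yields either e/f or gg/hh, because the two is_secondary rows tie in the ORDER BY and the engine is free to number either one first.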
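
Since the OP mentioned that is_primary and is_secondary can both be 1, the where condition needs to be changed to '>= 1' – Eric

It is also not a random selection (you could order by something non-deterministic, such as RAND) – Caleth

@Caleth Any select without an explicit ORDER BY clause is non-deterministic, don't you agree? Keep in mind that there is "random" and there is "random": different levels of randomness, with different associated costs. By the way, RAND() is not random here; CHECKSUM(NEWID()) would be a better choice. – dean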

1

If the selected row must be random, then use the following:

-- One random row per family: a primary row if the family has one,
-- otherwise a secondary row.
WITH primary_families AS (
    SELECT family_id
      ,first_name
      ,last_name
      -- PARTITION BY restarts the numbering for each family;
      -- ORDER BY NEWID() shuffles the rows within it
      ,ROW_NUMBER() OVER(PARTITION BY family_id ORDER BY NEWID()) AS r
    FROM mytable
    WHERE is_primary = 1
),
secondary_families AS (
    SELECT family_id
      ,first_name
      ,last_name
      ,ROW_NUMBER() OVER(PARTITION BY family_id ORDER BY NEWID()) AS r
    FROM mytable f
    WHERE is_secondary = 1
    -- only families that have no primary row at all
    AND NOT EXISTS (
     SELECT 1
     FROM mytable
     WHERE family_id = f.family_id
     AND is_primary = 1
    )
)

SELECT f.family_id
     ,f.first_name
     ,f.last_name
FROM primary_families f
WHERE f.r = 1

UNION

SELECT f.family_id
     ,f.first_name
     ,f.last_name
FROM secondary_families f
WHERE f.r = 1
0

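This is not an answer to your specific question, just an observation. If I had to develop a software or web application with logic like this, I would move it out of SQL and into whatever programming language is available. Retrieve the data set of interest, scan it, group it, and sort it.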
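
As a rough illustration of that idea, the sketch below does the grouping in Python after the rows have been fetched. The function name pick_per_family and the tuple layout are assumptions for the example, not part of the original post; families with no flagged row are simply left out, which matches what the SQL answers return.

```python
import random
from collections import defaultdict

# Sample rows from the question:
# (first_name, last_name, family_id, is_primary, is_secondary)
SAMPLE_ROWS = [
    ("a", "b", 1, 1, 0),
    ("aa", "bb", 1, 0, 0),
    ("c", "d", 1, 0, 0),
    ("cc", "dd", 1, 0, 0),
    ("e", "f", 10, 0, 0),
    ("e", "f", 10, 0, 1),
    ("gg", "hh", 10, 0, 1),
    ("gg", "hh", 10, 0, 0),
    ("gg", "hh", 10, 0, 0),
    ("gg", "hh", 10, 0, 0),
]


def pick_per_family(rows, rng=random):
    """Return {family_id: (first_name, last_name)} with one random pick each.

    A primary row is preferred; a secondary row is used only when the
    family has no primary row.
    """
    primaries = defaultdict(list)
    secondaries = defaultdict(list)
    for first_name, last_name, family_id, is_primary, is_secondary in rows:
        if is_primary == 1:
            primaries[family_id].append((first_name, last_name))
        elif is_secondary == 1:
            secondaries[family_id].append((first_name, last_name))

    result = {}
    for family_id in sorted(set(primaries) | set(secondaries)):
        # Fall back to secondary rows only if there are no primary rows.
        candidates = primaries.get(family_id) or secondaries[family_id]
        result[family_id] = rng.choice(candidates)
    return result


if __name__ == "__main__":
    for family_id, (first_name, last_name) in pick_per_family(SAMPLE_ROWS).items():
        print(family_id, first_name, last_name)
```

Passing a seeded random.Random instance as rng makes the pick reproducible, which can help in tests.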
