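Unexpected results when using a CTE to perform a random one-to-many join across all rows of two tables

I am trying to randomly join the rows of two tables (TableA and TableB) so that every row in TableA is joined to exactly one row in TableB, and every row in TableB is joined to at least one row in TableA.

For example, randomly joining TableA (5 distinct rows) to TableB (3 distinct rows) should produce something like this: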

TableA TableB 
1  3 
2  1 
3  1 
4  2 
5  1 

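However, sometimes not all rows from TableB are included in the final result. In the example above, row 2 of TableB might be missing because row 1 or row 3 was joined to row 4 of TableA in its place. You can see this happen by executing the script several times and checking the results. It seems to be necessary to use a temporary table (@Q) to guarantee a correct result that contains all rows from both TableA and TableB.

Can someone explain why this happens?

Also, can someone suggest a better way to get the desired result?

I understand that sometimes no results are returned at all, because of some kind of failure in the cross apply and the ordering that I have yet to pin down, and I am sure there is a better way to do this. I hope this makes sense. Thanks in advance!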

declare @TableA table (
     ID int 
     ); 
    declare @TableB table (
     ID int 
     ); 
    declare @Q table (
     RN int, 
     TableAID int, 
     TableBID int 
     ); 

    with cte as (
     select 
      1 as ID 
     union all 
     select 
      ID + 1 
     from cte 
     where ID < 5 
     ) 
    insert @TableA (ID) 
    select ID from cte; 

    with cte as (
     select 
      1 as ID 
     union all 
     select 
      ID + 1 
     from cte 
     where ID < 3 
     ) 
    insert @TableB (ID) 
    select ID from cte; 

    select * from @TableA; 
    select * from @TableB; 

    with cte as (
     select 
      row_number() over (partition by TableAID order by newid()) as RN, 
      TableAID, 
      TableBID 
     from (
      select 
       a.ID as TableAID, 
       b.ID as TableBID 
      from @TableA as a 
      cross apply @TableB as b 
      ) as M 
     ) 
    select --All rows from TableB not always included 
     TableAID, 
     TableBID 
    from cte 
    where RN in (
     select 
      top 1 
       iCTE.RN 
     from cte as iCTE 
     group by iCTE.RN 
     having count(distinct iCTE.TableBID) = (
      select count(1) from @TableB 
      ) 
     ) 
    order by TableAID; 

    with cte as (
     select 
      row_number() over (partition by TableAID order by newid()) as RN, 
      TableAID, 
      TableBID 
     from (
      select 
       a.ID as TableAID, 
       b.ID as TableBID 
      from @TableA as a 
      cross apply @TableB as b 
      ) as M 
     ) 
    insert @Q 
    select 
     RN, 
     TableAID, 
     TableBID 
    from cte; 

    select * from @Q; 

    select --All rows from both TableA and TableB included 
     TableAID, 
     TableBID 
    from @Q 
    where RN in (
     select 
      top 1 
       iQ.RN 
     from @Q as iQ 
     group by iQ.RN 
     having count(distinct iQ.TableBID) = (
      select count(1) from @TableB 
      ) 
     ) 
    order by TableAID; 
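
A likely explanation for the difference between the two queries above, offered as a sketch rather than something confirmed in this thread: SQL Server does not materialize a CTE. Each reference to `cte` is expanded and evaluated separately, so `newid()` produces a fresh random ordering for the outer `select` and another one for the `RN in (...)` subquery. The `RN` that covered every TableBID in the subquery is then applied to a different random assignment in the outer query. The table `@T` and the CTE name `r` below are hypothetical and exist only for this demonstration:

```sql
-- Hypothetical demo table; not part of the original script.
declare @T table (ID int);
insert @T (ID) values (1), (2), (3);

-- The CTE is referenced twice (as a and as b). If it were evaluated once,
-- FirstRef and SecondRef would always be equal on each row.
with r as (
    select ID, newid() as G
    from @T
)
select
    a.ID,
    a.G as FirstRef,
    b.G as SecondRef,
    case when a.G = b.G then 'same' else 'different' end as Compare
from r as a
join r as b
    on a.ID = b.ID;
```

On a typical run the two GUID columns differ, which is consistent with the CTE being evaluated once per reference. Writing the rows to @Q first fixes the random values in place, so both references then see the same data.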

Answers

See if this gives you what you're looking for.

DECLARE 
    @CountA INT = (SELECT COUNT(*) FROM @TableA ta), 
    @CountB INT = (SELECT COUNT(*) FROM @TableB tb), 
    @MinCount INT; 

SELECT @MinCount = CASE WHEN @CountA < @CountB THEN @CountA ELSE @CountB END; 

WITH 
    cte_A1 AS (
     SELECT 
      *, 
      rn = ROW_NUMBER() OVER (ORDER BY NEWID()) 
     FROM 
      @TableA ta 
     ), 
    cte_B1 AS (
     SELECT 
      *, 
      rn = ROW_NUMBER() OVER (ORDER BY NEWID()) 
     FROM 
      @TableB tb 
     ), 
    cte_A2 AS (
     SELECT 
      a1.ID, 
      rn = CASE WHEN a1.rn > @MinCount THEN a1.rn - @MinCount ELSE a1.rn end 
     FROM 
      cte_A1 a1 
     ), 
    cte_B2 AS (
     SELECT 
      b1.ID, 
      rn = CASE WHEN b1.rn > @MinCount THEN b1.rn - @MinCount ELSE b1.rn end 
     FROM 
      cte_B1 b1 
     ) 
SELECT 
    A = a.ID, 
    B = b.ID 
FROM 
    cte_A2 a 
    JOIN cte_B2 b 
     ON a.rn = b.rn; 

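This looks like a good solution for avoiding the case where no results are returned because a full set of TableBID values was never generated. However, on my machine (I am using MSSMS 2008 R2, version 10.50.4000.0) a temporary table is still needed to ensure that the complete set from each table is returned (otherwise TableAID is missing a random value in 8 out of 10 executions): insert @Q select rn, ID, NULL from cte_A2 union select rn, NULL, ID from cte_B2; select * from @Q; select q1.TableAID, q2.TableBID from @Q as q1 join @Q as q2 on q1.RN = q2.RN where q1.TableAID is not NULL and q2.TableBID is not NULL; – Erg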


I don't have access to a 2008 R2 instance to verify this, but I don't see anything in the code that is incompatible with 2008 R2. –


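OK, this looks like the best answer for me, so I will accept it. I have just reconfirmed that, for whatever reason, a temporary table is needed to ensure that the complete set from each table is returned, as described in my previous comment. – Erg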
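
The approach described in the comments (store the random row numbers first, then join) could be sketched as follows. This is a variation, not the code from the answer or the comment: the names @RA, @RB and @CountB are hypothetical, and the modulo join is a swapped-in technique so that the pairing still works when TableA is more than twice the size of TableB. It assumes it runs in the same batch as the @TableA and @TableB declarations from the question, that @TableB is not empty, and that @TableA has at least as many rows as @TableB.

```sql
-- Hypothetical staging tables: the random numbering is written once,
-- so every later reference sees the same values.
declare @RA table (RN int primary key, ID int);
declare @RB table (RN int primary key, ID int);

insert @RA (RN, ID)
select row_number() over (order by newid()), ID
from @TableA;

insert @RB (RN, ID)
select row_number() over (order by newid()), ID
from @TableB;

declare @CountB int = (select count(*) from @RB);

-- Wrap the TableA row numbers around the TableB row numbers:
-- A rows 1..@CountB cover every B row once, and the remaining A rows
-- cycle through B again, so each A row gets exactly one B row.
select
    a.ID as TableAID,
    b.ID as TableBID
from @RA as a
join @RB as b
    on b.RN = ((a.RN - 1) % @CountB) + 1
order by a.ID;
```

Because RN is unique in @RB, each TableA row matches exactly one TableB row, and because the first @CountB values of a.RN map to each b.RN once, every TableB row appears at least once under the stated assumptions.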
