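Challenging Firebird recursive CTE problem

This may not be a simple Firebird question, but I'm hoping there is a feature I don't know about that can help me go beyond plain vanilla SQL.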

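I have two tables. The first is a list of the names of "critical parameters", and the second relates a specific object ID, the name of a critical parameter, and the value of that critical parameter: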

CREATE TABLE CRITICALPARAMS 
(
PARAM Varchar(32) NOT NULL, 
INDX INTEGER NOT NULL, 
CONSTRAINT PK_CRITICALPARAMS_1 PRIMARY KEY (PARAM), 
CONSTRAINT UNQ_CRITICALPARAMS_1 UNIQUE (INDX) 
); 

CREATE TABLE CRITICALPARAMVALS 
(
ID INTEGER NOT NULL, 
PARAM Varchar(32) NOT NULL, 
VAL Float NOT NULL, 
CONSTRAINT PK_CRITICALPARAMVALS_1 PRIMARY KEY (ID,PARAM) 
); 

Let's assume we have a four-dimensional space:

insert into CRITICALPARAMS values ('a', 1); 
insert into CRITICALPARAMS values ('b', 2); 
insert into CRITICALPARAMS values ('c', 3); 
insert into CRITICALPARAMS values ('foo', 4); 

...and a handful of objects in that space:

insert into CRITICALPARAMVALS values (1, 'a', 0.0); 
insert into CRITICALPARAMVALS values (1, 'b', 0.0); 
insert into CRITICALPARAMVALS values (1, 'c', 2.0); 
insert into CRITICALPARAMVALS values (1, 'foo', 99.0); 
insert into CRITICALPARAMVALS values (2, 'a', 0.0); 
insert into CRITICALPARAMVALS values (2, 'b', 0.0); 
insert into CRITICALPARAMVALS values (2, 'c', 2.0); 
insert into CRITICALPARAMVALS values (2, 'foo', 99.0); 
insert into CRITICALPARAMVALS values (3, 'a', 0.0); 
insert into CRITICALPARAMVALS values (3, 'b', 0.0); 
insert into CRITICALPARAMVALS values (3, 'c', 1.0); 
insert into CRITICALPARAMVALS values (3, 'foo', 98.0); 
insert into CRITICALPARAMVALS values (4, 'a', 0.0); 
insert into CRITICALPARAMVALS values (4, 'b', 0.0); 
insert into CRITICALPARAMVALS values (4, 'c', 1.0); 
insert into CRITICALPARAMVALS values (4, 'foo', 98.0); 
insert into CRITICALPARAMVALS values (5, 'a', 0.0); 
insert into CRITICALPARAMVALS values (5, 'b', 0.0); 
insert into CRITICALPARAMVALS values (5, 'c', 2.0); 
insert into CRITICALPARAMVALS values (5, 'foo', 98.0); 

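The problem is to partition the critical-parameter space, grouping together all object IDs that have the same parameter values. We can think of using a "seed" object ID and asking which other IDs belong to the same partition as the seed object.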

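In our example, objects 1 and 2 form one partition, objects 3 and 4 form another, and object 5 forms a third on its own. All five objects are equal in critical parameters a and b, but differ in parameters c and foo.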
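To make the expected result concrete, here is a minimal Python sketch (an illustration only, not part of any Firebird solution) that groups the sample objects by their full parameter vector. The names `objects`, `params`, and `partitions` are made up for this example, and it uses exact float equality rather than a tolerance.

```python
from collections import defaultdict

# Sample data from the question: object ID -> {critical parameter: value}.
objects = {
    1: {"a": 0.0, "b": 0.0, "c": 2.0, "foo": 99.0},
    2: {"a": 0.0, "b": 0.0, "c": 2.0, "foo": 99.0},
    3: {"a": 0.0, "b": 0.0, "c": 1.0, "foo": 98.0},
    4: {"a": 0.0, "b": 0.0, "c": 1.0, "foo": 98.0},
    5: {"a": 0.0, "b": 0.0, "c": 2.0, "foo": 98.0},
}
params = ["a", "b", "c", "foo"]  # critical parameters, ordered by INDX

# Objects with an identical value for every parameter land in the same group.
groups = defaultdict(list)
for obj_id, vals in sorted(objects.items()):
    key = tuple(vals[p] for p in params)
    groups[key].append(obj_id)

partitions = sorted(groups.values())
print(partitions)  # [[1, 2], [3, 4], [5]]
```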

Is there any way to solve this using plain vanilla SQL? How about a recursive CTE?

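I have already solved this in a stored procedure using EXECUTE STATEMENT: I loop over the seed's critical parameter values and manually build one big SQL statement with as many WHERE clauses as there are critical parameters. That solution does not scale once I reach about 500-1000 critical parameters (or more!). I first defined a view that gives me a partition along a single critical parameter (TEST_FLOAT_EQ is a selectable stored procedure that compares two floats for "good enough!" equality):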

CREATE VIEW VGROUPIDBYPARAM (SEEDID, GROUPMEMBERID, CRITPARAMINDX) 
AS 
select a.id as seedid, b.id as groupmemberid, c.INDX as critparamindx 
from CRITICALPARAMVALS a 
join CRITICALPARAMVALS b on a.PARAM=b.param and (exists (select isequal from TEST_FLOAT_EQ(a.val, b.val, 1e-5) where ISEQUAL=1)) 
join CRITICALPARAMS c on b.param=c.PARAM; 

...and then I wanted to apply VGROUPIDBYPARAM inductively, something like the following partially completed select:

SELECT a1.SEEDID, a6.GROUPMEMBERID 
FROM VGROUPIDBYPARAM a1 
join VGROUPIDBYPARAM a2 on a1.SEEDID=a2.SEEDID and a1.GROUPMEMBERID=a2.GROUPMEMBERID 
join VGROUPIDBYPARAM a3 on a1.SEEDID=a3.SEEDID and a2.GROUPMEMBERID=a3.GROUPMEMBERID 
join VGROUPIDBYPARAM a4 on a1.SEEDID=a4.SEEDID and a3.GROUPMEMBERID=a4.GROUPMEMBERID 
join VGROUPIDBYPARAM a5 on a1.SEEDID=a5.SEEDID and a4.GROUPMEMBERID=a5.GROUPMEMBERID 
join VGROUPIDBYPARAM a6 on a1.SEEDID=a6.SEEDID and a5.GROUPMEMBERID=a6.GROUPMEMBERID 
... 
where a1.CRITPARAMINDX=1 
and a2.CRITPARAMINDX=2 
and a3.CRITPARAMINDX=3 
and a4.CRITPARAMINDX=4 
and a5.CRITPARAMINDX=5 
and a6.CRITPARAMINDX=6 
... 

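At the end of this inductive process (which I hoped a recursive CTE could imitate), the only records that survive the pile of JOINs are those whose group member ID belongs to the same partition as the seed ID.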
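The induction described above might be sketched as a recursive CTE along the following lines. This is only a sketch under several assumptions: SQLite (through Python's `sqlite3`) stands in for Firebird, so the syntax may need adjusting; `ABS(a.VAL - b.VAL) <= 1e-5` stands in for TEST_FLOAT_EQ, whose exact semantics are not shown in the question; INDX is assumed to run contiguously from 1; and the names `chain` and `same_partition` are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CRITICALPARAMS (
    PARAM VARCHAR(32) NOT NULL PRIMARY KEY,
    INDX  INTEGER NOT NULL UNIQUE
);
CREATE TABLE CRITICALPARAMVALS (
    ID    INTEGER NOT NULL,
    PARAM VARCHAR(32) NOT NULL,
    VAL   FLOAT NOT NULL,
    PRIMARY KEY (ID, PARAM)
);
-- Stand-in for the question's view; ABS(...) replaces TEST_FLOAT_EQ (assumption).
CREATE VIEW VGROUPIDBYPARAM AS
SELECT a.ID AS SEEDID, b.ID AS GROUPMEMBERID, c.INDX AS CRITPARAMINDX
FROM CRITICALPARAMVALS a
JOIN CRITICALPARAMVALS b ON a.PARAM = b.PARAM AND ABS(a.VAL - b.VAL) <= 1e-5
JOIN CRITICALPARAMS c ON b.PARAM = c.PARAM;
""")

PARAMS = ("a", "b", "c", "foo")
OBJECTS = {  # sample data from the question
    1: (0.0, 0.0, 2.0, 99.0),
    2: (0.0, 0.0, 2.0, 99.0),
    3: (0.0, 0.0, 1.0, 98.0),
    4: (0.0, 0.0, 1.0, 98.0),
    5: (0.0, 0.0, 2.0, 98.0),
}
con.executemany("INSERT INTO CRITICALPARAMS VALUES (?, ?)",
                [(p, i) for i, p in enumerate(PARAMS, start=1)])
con.executemany(
    "INSERT INTO CRITICALPARAMVALS VALUES (?, ?, ?)",
    [(oid, p, v) for oid, vals in OBJECTS.items() for p, v in zip(PARAMS, vals)],
)

# Each recursion step keeps only the members that also match the seed on the
# next parameter index; rows that reach the last index matched on every one.
RECURSIVE_SQL = """
WITH RECURSIVE chain (SEEDID, GROUPMEMBERID, CRITPARAMINDX) AS (
    SELECT SEEDID, GROUPMEMBERID, CRITPARAMINDX
    FROM VGROUPIDBYPARAM
    WHERE SEEDID = :SEED AND CRITPARAMINDX = 1
    UNION ALL
    SELECT v.SEEDID, v.GROUPMEMBERID, v.CRITPARAMINDX
    FROM chain ch
    JOIN VGROUPIDBYPARAM v
      ON v.SEEDID = ch.SEEDID
     AND v.GROUPMEMBERID = ch.GROUPMEMBERID
     AND v.CRITPARAMINDX = ch.CRITPARAMINDX + 1
)
SELECT GROUPMEMBERID
FROM chain
WHERE CRITPARAMINDX = (SELECT MAX(INDX) FROM CRITICALPARAMS)
ORDER BY GROUPMEMBERID
"""

def same_partition(seed):
    """Return the IDs (seed included) that match the seed on every parameter."""
    return [row[0] for row in con.execute(RECURSIVE_SQL, {"SEED": seed})]

print(same_partition(1))  # [1, 2]
print(same_partition(5))  # [5]
```

Because the view also pairs every object with itself, the seed appears in its own partition here. A recursion that is as deep as the number of critical parameters may or may not be practical for hundreds of parameters; that would need to be measured on the real data.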

Many thanks to anyone who can help me solve this efficiently!

Answer

To solve this, I would start with this simple query, which counts the dimensions in which each pair of objects matches:

SELECT 
    CPV1.ID AS ID1, 
    CPV2.ID AS ID2, 
    COUNT(*) 
FROM 
    CRITICALPARAMVALS CPV1 
    INNER JOIN CRITICALPARAMVALS CPV2 ON CPV1.ID <> CPV2.ID 
      AND CPV1.PARAM = CPV2.PARAM 
      AND CPV1.VAL = CPV2.VAL 
GROUP BY 
    CPV1.ID, CPV2.ID 

Its output was shown as an image in the original post, with the interesting rows highlighted in yellow.
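Since the output image is not available, the pair-counting query above can be run against the sample data to see which rows stand out. The sketch below uses Python's built-in `sqlite3` as a stand-in for Firebird (an assumption; the query itself is copied from the answer with only an ORDER BY added), and the names `pair_counts` and `full_matches` are made up for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CRITICALPARAMS (
    PARAM VARCHAR(32) NOT NULL PRIMARY KEY,
    INDX  INTEGER NOT NULL UNIQUE
);
CREATE TABLE CRITICALPARAMVALS (
    ID    INTEGER NOT NULL,
    PARAM VARCHAR(32) NOT NULL,
    VAL   FLOAT NOT NULL,
    PRIMARY KEY (ID, PARAM)
);
""")

PARAMS = ("a", "b", "c", "foo")
OBJECTS = {  # sample data from the question
    1: (0.0, 0.0, 2.0, 99.0),
    2: (0.0, 0.0, 2.0, 99.0),
    3: (0.0, 0.0, 1.0, 98.0),
    4: (0.0, 0.0, 1.0, 98.0),
    5: (0.0, 0.0, 2.0, 98.0),
}
con.executemany("INSERT INTO CRITICALPARAMS VALUES (?, ?)",
                [(p, i) for i, p in enumerate(PARAMS, start=1)])
con.executemany(
    "INSERT INTO CRITICALPARAMVALS VALUES (?, ?, ?)",
    [(oid, p, v) for oid, vals in OBJECTS.items() for p, v in zip(PARAMS, vals)],
)

# The answer's query: for each ordered pair of distinct objects, count the
# parameters on which both objects have the same value.
pair_counts = con.execute("""
SELECT CPV1.ID AS ID1, CPV2.ID AS ID2, COUNT(*)
FROM CRITICALPARAMVALS CPV1
INNER JOIN CRITICALPARAMVALS CPV2 ON CPV1.ID <> CPV2.ID
  AND CPV1.PARAM = CPV2.PARAM
  AND CPV1.VAL = CPV2.VAL
GROUP BY CPV1.ID, CPV2.ID
ORDER BY CPV1.ID, CPV2.ID
""").fetchall()

for row in pair_counts:
    print(row)

# Pairs whose count equals the number of critical parameters match everywhere.
n_params = con.execute("SELECT COUNT(*) FROM CRITICALPARAMS").fetchone()[0]
full_matches = [(id1, id2) for id1, id2, n in pair_counts if n == n_params]
print(full_matches)  # [(1, 2), (2, 1), (3, 4), (4, 3)]
```

On this data every pair matches on at least a and b, so all 20 ordered pairs appear; only the pairs that reach a count of 4 match on every parameter.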

To keep only those rows, we add this condition:

HAVING COUNT(*) = (SELECT COUNT(*) FROM CRITICALPARAMS) 

From the question: "We can think of using a 'seed' object ID and asking which other IDs belong to the same partition as the seed object."

The final query answers that question, taking a :SEED parameter, as follows:

SELECT 
    CPV2.ID 
FROM 
    CRITICALPARAMVALS CPV1 
    INNER JOIN CRITICALPARAMVALS CPV2 ON CPV1.ID <> CPV2.ID 
      AND CPV1.PARAM = CPV2.PARAM 
      AND CPV1.VAL = CPV2.VAL 
WHERE CPV1.ID = :SEED 
GROUP BY 
    CPV1.ID, CPV2.ID 
HAVING COUNT(*) = (SELECT COUNT(*) FROM CRITICALPARAMS) 

It should perform well even on large data sets.
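As a quick check, the seed query can be exercised end to end. The sketch below again uses Python's `sqlite3` as a stand-in for Firebird (an assumption). It also tries a tolerance variant, replacing `CPV1.VAL = CPV2.VAL` with `ABS(CPV1.VAL - CPV2.VAL) <= :TOL` as a guess at what TEST_FLOAT_EQ from the question does. Object 6 is a hypothetical extra row, not part of the question's data, added only to show where exact and approximate equality differ; `partition_members` is likewise a made-up helper name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CRITICALPARAMS (
    PARAM VARCHAR(32) NOT NULL PRIMARY KEY,
    INDX  INTEGER NOT NULL UNIQUE
);
CREATE TABLE CRITICALPARAMVALS (
    ID    INTEGER NOT NULL,
    PARAM VARCHAR(32) NOT NULL,
    VAL   FLOAT NOT NULL,
    PRIMARY KEY (ID, PARAM)
);
""")

PARAMS = ("a", "b", "c", "foo")
OBJECTS = {
    1: (0.0, 0.0, 2.0, 99.0),
    2: (0.0, 0.0, 2.0, 99.0),
    3: (0.0, 0.0, 1.0, 98.0),
    4: (0.0, 0.0, 1.0, 98.0),
    5: (0.0, 0.0, 2.0, 98.0),
    6: (0.0, 0.0, 2.000001, 99.0),  # HYPOTHETICAL row, not in the question
}
con.executemany("INSERT INTO CRITICALPARAMS VALUES (?, ?)",
                [(p, i) for i, p in enumerate(PARAMS, start=1)])
con.executemany(
    "INSERT INTO CRITICALPARAMVALS VALUES (?, ?, ?)",
    [(oid, p, v) for oid, vals in OBJECTS.items() for p, v in zip(PARAMS, vals)],
)

# The answer's final query, with the value comparison left as a slot so that
# either exact equality or a tolerance test can be plugged in.
QUERY = """
SELECT CPV2.ID
FROM CRITICALPARAMVALS CPV1
INNER JOIN CRITICALPARAMVALS CPV2 ON CPV1.ID <> CPV2.ID
  AND CPV1.PARAM = CPV2.PARAM
  AND {val_match}
WHERE CPV1.ID = :SEED
GROUP BY CPV1.ID, CPV2.ID
HAVING COUNT(*) = (SELECT COUNT(*) FROM CRITICALPARAMS)
ORDER BY CPV2.ID
"""

def partition_members(seed, tol=None):
    """Other IDs in the seed's partition; tol=None means exact equality."""
    if tol is None:
        sql = QUERY.format(val_match="CPV1.VAL = CPV2.VAL")
        args = {"SEED": seed}
    else:
        # Assumed stand-in for the question's TEST_FLOAT_EQ procedure.
        sql = QUERY.format(val_match="ABS(CPV1.VAL - CPV2.VAL) <= :TOL")
        args = {"SEED": seed, "TOL": tol}
    return [row[0] for row in con.execute(sql, args)]

print(partition_members(1))             # [2]
print(partition_members(1, tol=1e-5))   # [2, 6]
print(partition_members(5))             # []
```

Note that the `CPV1.ID <> CPV2.ID` condition excludes the seed itself, so an object that is alone in its partition (object 5 here) returns no rows.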

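This is great: it shows that what we're really interested in is the number of matches, not the intermediate details of each matching parameter, which is what I was chasing in my attempt... So the title of my question is misleading, and shows that I was really confused!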