9

我在我的關係數據庫(Firebird)中有一個DAG,它有兩個表edgenode(鄰接列表模型)。我想遞歸查詢它們,但發現遞歸查詢效率很低。所以我試圖實施觸發器來保持Dong et.al之後的傳遞閉包。紙http://homepages.inf.ed.ac.uk/libkin/papers/tc-sql.pdf如何有效維護傳遞閉包表?

SELECT s現在非常快,但是DELETE s非常慢,因爲幾乎整個圖都被複製爲單個刪除。更糟的是,併發更新似乎不可能。

有沒有更好的實現方法?

編輯

我做了一些實驗,並引入了引用計數器的TC表。因此,刪除速度很快。我寫了一些簡單的測試用例,但我不確定我是否正確。這是我到目前爲止:

CREATE GENERATOR graph_tc_seq; 

CREATE TABLE EDGE (
    parent DECIMAL(10, 0) NOT NULL, 
    child DECIMAL(10, 0) NOT NULL, 
    PRIMARY KEY (parent, child) 
); 

CREATE TABLE GRAPH_TC (
    parent DECIMAL(10, 0) NOT NULL, 
    child DECIMAL(10, 0) NOT NULL, 
    refcount DECIMAL(9, 0), 
    PRIMARY KEY (parent, child) 
); 

CREATE TABLE GRAPH_TC_TEMP (
    session_id DECIMAL(9, 0), 
    parent DECIMAL(10, 0), 
    child DECIMAL(10, 0) 
); 

CREATE PROCEDURE GRAPH_TC_CREATE (p_parent DECIMAL(10, 0), c_child DECIMAL(10, 0)) 
AS 
    declare variable tp_parent DECIMAL(10,0); 
    declare variable tc_child DECIMAL(10,0); 
    declare variable session_id DECIMAL(9,0); 
    declare variable refs DECIMAL(9,0); 
begin 
    session_id = gen_id(graph_tc_seq,1); 
    insert into graph_tc_temp (parent, child, session_id, refcount) values (:p_parent, :p_parent, :session_id, 1); 
    insert into graph_tc_temp (parent, child, session_id, refcount) values (:c_child, :c_child, :session_id, 1); 
    insert into graph_tc_temp (parent, child, session_id, refcount) values (:p_parent, :c_child, :session_id, 1); 
    insert into graph_tc_temp (parent, child, session_id, refcount) select distinct :p_parent, child, :session_id, refcount from graph_tc where parent = :c_child and not parent = child; 
    insert into graph_tc_temp (child, parent, session_id, refcount) select distinct :c_child, parent, :session_id, refcount from graph_tc where child = :p_parent and not parent = child; 
    insert into graph_tc_temp (parent, child, session_id, refcount) select distinct a.parent, b.child, :session_id, a.refcount*b.refcount from graph_tc a, graph_tc b where a.child = :p_parent and b.parent = :c_child and not a.parent = a.child and not b.parent = b.child; 
    for select parent, child, refcount from graph_tc_temp e where session_id= :session_id and exists (select * from graph_tc t where t.parent = e.parent and t.child = e.child) into :tp_parent, :tc_child, :refs do begin 
     update graph_tc set refcount=refcount+ :refs where parent = :tp_parent and child = :tc_child; 
    end 
    insert into graph_tc (parent, child, refcount) select parent, child, refcount from graph_tc_temp e where session_id = :session_id and not exists (select * from graph_tc t where t.parent = e.parent and t.child = e.child); 
    delete from graph_tc_temp where session_id = :session_id; 
end^

CREATE PROCEDURE GRAPH_TC_DELETE (p_parent DECIMAL(10, 0), c_child DECIMAL(10, 0)) 
AS 
    declare variable tp_parent DECIMAL(10,0); 
    declare variable tc_child DECIMAL(10,0); 
    declare variable refs DECIMAL(9,0); 
begin 
    delete from graph_tc where parent = :p_parent and child = :p_parent and refcount <= 1; 
    update graph_tc set refcount = refcount - 1 where parent = :p_parent and child = :p_parent and refcount > 1; 
    delete from graph_tc where parent = :c_child and child = :c_child and refcount <= 1; 
    update graph_tc set refcount = refcount - 1 where parent = :c_child and child = :c_child and refcount > 1; 
    delete from graph_tc where parent = :p_parent and child = :c_child and refcount <= 1; 
    update graph_tc set refcount = refcount - 1 where parent = :p_parent and child = :c_child and refcount > 1; 
    for select distinct :p_parent, b.child, refcount from graph_tc b where b.parent = :c_child and not b.parent = b.child into :tp_parent, :tc_child, :refs do begin 
     delete from graph_tc where parent = :tp_parent and child = :tc_child and refcount <= :refs; 
     update graph_tc set refcount = refcount - :refs where parent = :tp_parent and child = :tc_child and refcount > :refs; 
    end 
    for select distinct :c_child, b.parent, refcount from graph_tc b where b.child = :p_parent and not b.parent = b.child into :tc_child, :tp_parent, :refs do begin 
     delete from graph_tc where child = :tc_child and parent = :tp_parent and refcount <= :refs; 
     update graph_tc set refcount = refcount - :refs where child = :tc_child and parent = :tp_parent and refcount > :refs; 
    end 
    for select distinct a.parent, b.child, a.refcount*b.refcount from graph_tc a, graph_tc b where not a.parent = a.child and not b.parent = b.child and a.child = :p_parent and b.parent = :c_child into :tp_parent, :tc_child, :refs do begin 
     delete from graph_tc where parent = :tp_parent and child = :tc_child and refcount <= :refs; 
     update graph_tc set refcount = refcount - :refs where parent = :tp_parent and child = :tc_child and refcount > :refs; 
    end 
end^

CREATE TRIGGER GRAPH_TC_AFTER_INSERT FOR EDGE AFTER INSERT as 
begin 
    execute procedure graph_tc_create(new.parent,new.child); 
end^

CREATE TRIGGER GRAPH_TC_AFTER_UPDATE FOR EDGE AFTER UPDATE as 
begin 
    if ((new.parent <> old.parent) or (new.child <> old.child)) then begin 
    execute procedure graph_tc_delete(old.parent,old.child); 
    execute procedure graph_tc_create(new.parent,new.child); 
    end 
end^

CREATE TRIGGER GRAPH_TC_AFTER_DELETE FOR EDGE AFTER DELETE as 
begin 
    execute procedure graph_tc_delete(old.parent,old.child); 
end^

這是我自己的想法,但我認爲其他人已經實施了TC。他們做同樣的事情嗎?

我有一些測試用例,但我不確定是否可能與較大的圖形有不一致。

併發性如何,我認爲這種方法會失敗,當兩個同時事務想更新圖表,對不對?

編輯

,我發現了一些錯誤在我的代碼,我想與大家分享了固定的版本。

我發現了一篇很棒的文章:http://www.codeproject.com/Articles/22824/A-Model-to-Represent-Directed-Acyclic-Graphs-DAG-o。是否有更多有趣的文章或科學論文,有不同的方法?

+0

你可以包含DDL和觸發器定義的相關部分嗎? –

+0

@MarkRotteveel請參閱我的編輯 –

+2

考慮使用[GTT(全局臨時表)](http://www.firebirdsql.org/file/documentation/reference_manuals/reference_material/html/langrefupd25-ddl-table.html)而不是GRAPH_TC_TEMP「的正常表格 –

回答

1

我只是通過擴展到這裏描述的傳遞反射閉合表模型來修復慢速刪除操作: http://www.dba-oracle.com/t_sql_patterns_incremental_eval.htm。花費更多的工作來完全維護其中的路徑數量,但是當刪除從每個單獨刪除操作的6秒鐘變爲可忽略的時候,它付出了很大的代價(我現在可以刪除圖表中的每個關係,然後將它們全部添加回去在4,000個關係中總共需要14秒)。

+0

爲了加分,最短路徑長度可以保持與路徑總數相似http://www.tjhsst.edu/~rlatimer/acm/DatabaseSystems/ShortestDistanceinFirstOrderLogicSQLp698-pangTODS-Oct05.pdf – nclu

4

SQL不是處理圖形的正確工具。使用以下之一:

http://en.wikipedia.org/wiki/Graph_database

我很喜歡ArangoDB,至極有syntaxe接近的MongoDB。

+0

我知道圖形數據庫將是理想的解決方案;但對於每個<100k邊的兩個圖,我不添加新的數據庫。 –

0

嘗試創建相關的索引,其中子句(例如:child,parent)。

我對Firebird並不熟悉,但看看「firebird describe」是如何工作的,並檢查它是否正確使用索引以加快您在程序中的選擇速度。

即使在創建更新/刪除/插入性能時丟失的索引,在您的情況下,它可以改善結果。

+0

真正的實現有索引;我只是沒有將CREATE INDEX語句複製到上面的代碼中。 –