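Deadlock in a PostgreSQL trigger function

Using postgres 9.3, I have a table called regression_runs that stores some counters. When a row in this table is updated, inserted, or deleted, a trigger function is called to update a row in the nightly_runs table, so that it keeps a running total of those counters for all regression_runs with a given nightly_run_id. The approach I've taken is fairly widely documented. My problem, however, is that I run into deadlocks when multiple processes try to insert new rows into regression_runs with the same nightly_run_id at the same time.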
The regression_runs table looks like this:
regression=> \d regression_runs
Table "public.regression_runs"
Column | Type | Modifiers
-----------------+--------------------------+--------------------------------------------------------------
id | integer | not null default nextval('regression_runs_id_seq'::regclass)
username | character varying(16) | not null
nightly_run_id | integer |
nightly_run_pid | integer |
passes | integer | not null default 0
failures | integer | not null default 0
errors | integer | not null default 0
skips | integer | not null default 0
Indexes:
"regression_runs_pkey" PRIMARY KEY, btree (id)
"regression_runs_nightly_run_id_idx" btree (nightly_run_id)
Foreign-key constraints:
"regression_runs_nightly_run_id_fkey" FOREIGN KEY (nightly_run_id) REFERENCES nightly_runs(id) ON UPDATE CASCADE ON DELETE CASCADE
Triggers:
regression_run_update_trigger AFTER INSERT OR DELETE OR UPDATE ON regression_runs FOR EACH ROW EXECUTE PROCEDURE regression_run_update()
The nightly_runs table looks like this:
regression=> \d nightly_runs
Table "public.nightly_runs"
Column | Type | Modifiers
------------+--------------------------+-----------------------------------------------------------
id | integer | not null default nextval('nightly_runs_id_seq'::regclass)
passes | integer | not null default 0
failures | integer | not null default 0
errors | integer | not null default 0
skips | integer | not null default 0
Indexes:
"nightly_runs_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "regression_runs" CONSTRAINT "regression_runs_nightly_run_id_fkey" FOREIGN KEY (nightly_run_id) REFERENCES nightly_runs(id) ON UPDATE CASCADE ON DELETE CASCADE
The trigger function regression_run_update looks like this:
CREATE OR REPLACE FUNCTION regression_run_update() RETURNS "trigger"
AS $$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        IF (NEW.nightly_run_id IS NOT NULL) and (NEW.nightly_run_id = OLD.nightly_run_id) THEN
            UPDATE nightly_runs SET passes = passes + (NEW.passes - OLD.passes), failures = failures + (NEW.failures - OLD.failures), errors = errors + (NEW.errors - OLD.errors), skips = skips + (NEW.skips - OLD.skips) WHERE id = NEW.nightly_run_id;
        ELSE
            IF NEW.nightly_run_id IS NOT NULL THEN
                UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id;
            END IF;
            IF OLD.nightly_run_id IS NOT NULL THEN
                UPDATE nightly_runs SET passes = passes - OLD.passes, failures = failures - OLD.failures, errors = errors - OLD.errors, skips = skips - OLD.skips WHERE id = OLD.nightly_run_id;
            END IF;
        END IF;
    ELSIF TG_OP = 'INSERT' THEN
        IF NEW.nightly_run_id IS NOT NULL THEN
            UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id;
        END IF;
    ELSIF TG_OP = 'DELETE' THEN
        IF OLD.nightly_run_id IS NOT NULL THEN
            UPDATE nightly_runs SET passes = passes - OLD.passes, failures = failures - OLD.failures, errors = errors - OLD.errors, skips = skips - OLD.skips WHERE id = OLD.nightly_run_id;
        END IF;
    END IF;
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;
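To make the bookkeeping explicit, here is a minimal Python model of what the trigger is meant to compute. It is only a sketch: the names COUNTERS, apply_change and totals are made up for illustration, rows are plain dicts, and a missing totals entry is created on the fly (in the database the nightly_runs row already exists because of the foreign key). An UPDATE is modelled as "subtract the old row, add the new row", which nets out to the same totals as the single delta UPDATE in the trigger.

```python
COUNTERS = ("passes", "failures", "errors", "skips")


def apply_change(totals, op, old=None, new=None):
    """Adjust per-nightly-run totals in place, mirroring the trigger's arithmetic.

    totals maps nightly_run_id -> dict of counters; old/new are row dicts.
    """
    def adjust(row, sign):
        nid = row.get("nightly_run_id")
        if nid is None:
            # Rows that belong to no nightly run are not counted anywhere.
            return
        bucket = totals.setdefault(nid, dict.fromkeys(COUNTERS, 0))
        for name in COUNTERS:
            bucket[name] += sign * row.get(name, 0)

    if op in ("UPDATE", "DELETE"):
        adjust(old, -1)  # take the old row's contribution out
    if op in ("UPDATE", "INSERT"):
        adjust(new, +1)  # add the new row's contribution in
    return totals


totals = {}
row_v1 = {"nightly_run_id": 135, "passes": 3, "failures": 1}
row_v2 = {"nightly_run_id": 135, "passes": 5, "failures": 0}
apply_change(totals, "INSERT", new=row_v1)
apply_change(totals, "UPDATE", old=row_v1, new=row_v2)
```

The model has no concurrency at all, so it says nothing about the locking. It only pins down the totals the trigger should produce.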
This is what I see in the postgres log file:
ERROR: deadlock detected
DETAIL: Process 20266 waits for ShareLock on transaction 7520; blocked by process 20263.
Process 20263 waits for ExclusiveLock on tuple (1,70) of relation 18469 of database 18354; blocked by process 20266.
Process 20266: insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20262);
Process 20263: insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20260);
HINT: See server log for query details.
CONTEXT: SQL statement "UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id"
PL/pgSQL function regression_run_update() line 16 at SQL statement
STATEMENT: insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20262);
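The DETAIL lines describe a cycle: each process is blocked by the other. As an illustration of that reading (not of how postgres detects deadlocks internally), the wait-for relationships can be written down as a small graph and checked for a cycle. The function name find_wait_cycle is hypothetical, and the sketch assumes each process waits on at most one other process.

```python
def find_wait_cycle(waits_for):
    """Return the processes forming a cycle in a wait-for graph, or None.

    waits_for maps a blocked process to the process it is blocked by.
    """
    for start in waits_for:
        path = []
        seen = set()
        node = start
        # Follow "blocked by" edges until we leave the graph or revisit a node.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            # The walk came back to a node on the path: that suffix is the cycle.
            return path[path.index(node):]
    return None


# Edges taken from the two DETAIL lines of the log above:
# 20266 is blocked by 20263, and 20263 is blocked by 20266.
waits_for = {20266: 20263, 20263: 20266}
cycle = find_wait_cycle(waits_for)
```

Note that the two processes wait on different things (one on a transaction, the other on a tuple of relation 18469), so the cycle involves two separate locks rather than a single contended one.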
I can reproduce the problem with this script:
#!/usr/bin/env python
import os
import multiprocessing
import psycopg2


class Foo(object):
    def child(self):
        pid = os.getpid()
        conn = psycopg2.connect(
            'dbname=regression host=localhost user=regression')
        cur = conn.cursor()
        for i in xrange(100):
            cur.execute(
                "insert into regression_runs "
                "(username, nightly_run_id, nightly_run_pid) "
                "values "
                "('tbeadle', %s, %s);", (self.nid, pid))
            conn.commit()
        return

    def start(self):
        conn = psycopg2.connect(
            'dbname=regression host=localhost user=regression')
        cur = conn.cursor()
        cur.execute('insert into nightly_runs default values returning id;')
        row = cur.fetchone()
        conn.commit()
        self.nid = row[0]
        procs = []
        for child in xrange(5):
            procs.append(multiprocessing.Process(target=self.child))
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()


Foo().start()
I can't figure out why the deadlock is happening or what I can do about it. Please help!
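For reference, a common client-side stopgap while the root cause is unknown is to roll back and retry the transaction that lost the deadlock. The sketch below is deliberately generic and does not import psycopg2: run_with_retry, is_retryable and rollback are hypothetical names. With psycopg2, one would presumably pass conn.rollback as rollback and a check such as isinstance(exc, psycopg2.extensions.TransactionRollbackError) (or exc.pgcode == '40P01', the SQLSTATE for deadlock_detected) as is_retryable. This only masks the deadlock and does not explain it.

```python
import random
import time


def run_with_retry(txn, is_retryable, rollback, attempts=5, base_delay=0.05):
    """Call txn(); on a retryable error, roll back, back off, and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except Exception as exc:
            # Give up on unrelated errors, or once the attempts are used up.
            if not is_retryable(exc) or attempt == attempts:
                raise
            # The aborted transaction has to be rolled back before reuse.
            rollback()
            # Jittered exponential backoff so competing writers desynchronise.
            time.sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

In the reproduction script, txn would wrap the cur.execute(...) and conn.commit() pair for a single insert, so that each retry re-runs the whole transaction.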
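IMHO, updating fields inside a trigger is a bad idea. The trigger handlers often try to write to the same row, and that turns into a deadlock. Maybe an architecture change is needed. For difficult cases, I create a buffer queue table and dispatch it through a stored procedure, with an external tool regulating the queue, of course. – corvinusz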
@corvinusz: Nonsense. Triggers are the ideal tool for what the OP is doing. He just isn't aware of a few pitfalls. –