1) собрать все логи с бд по локам
[postgres@p38rmisdb01 postgres]$ grep -i 'lock:' postgresql-2018-04-06.log
2) made a with query enclosing every row with ‘ and with ; at the end
—-best of all do this on different server
with b as ( select unnest(array[ '2018-06-05 00:03:33 KRAT [22266]: [17-1] db=lsd,appname=[unknown],user=app_group_master,client=192.168.5.111 DETAIL: Process holding the lock: 24957. Wait queue: 22266.', '2018-06-05 09:00:36 KRAT [5922]: [6-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 17343. Wait queue: 5922.', '2018-06-05 09:01:10 KRAT [5922]: [12-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 13309. Wait queue: 5922.', '2018-06-05 09:06:56 KRAT [9196]: [6-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 10136. Wait queue: .', '2018-06-05 09:09:55 KRAT [28905]: [4-1] db=lsd,appname=192.168.4.22,user=lsd,client=192.168.5.111 DETAIL: Process holding the lock: 24188. Wait queue: 28905.', '2018-06-05 09:10:08 KRAT [28905]: [11-1] db=lsd,appname=192.168.4.22,user=lsd,client=192.168.5.111 DETAIL: Process holding the lock: 24188. Wait queue: 28905.', '2018-06-05 09:11:07 KRAT [28905]: [15-1] db=lsd,appname=192.168.4.22,user=lsd,client=192.168.5.111 DETAIL: Process holding the lock: 24188. Wait queue: 28905.', '2018-06-05 10:00:06 KRAT [342]: [5-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 23155. Wait queue: 342.', '2018-06-05 10:00:41 KRAT [342]: [11-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 9430. Wait queue: 342.', '2018-06-05 10:01:21 KRAT [342]: [17-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 1941. Wait queue: 342.', '2018-06-05 10:09:23 KRAT [27511]: [411-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 18429. Wait queue: .', '2018-06-05 10:09:24 KRAT [18429]: [6-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 30049. Wait queue: .', '2018-06-05 10:09:24 KRAT [30049]: [4-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 18429. Wait queue: 30049.', '2018-06-05 11:00:43 KRAT [29866]: [6-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Processes holding the lock: 2667, 27024. Wait queue: 29866.', '2018-06-05 11:00:56 KRAT [29866]: [12-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 29718. Wait queue: 29866.', '2018-06-05 11:14:49 KRAT [13709]: [902-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 4172. Wait queue: .', '2018-06-05 11:14:49 KRAT [4172]: [4-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 13719. Wait queue: .', '2018-06-05 12:00:07 KRAT [24431]: [5-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 26119. Wait queue: 24431.', '2018-06-05 12:00:41 KRAT [24431]: [11-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 23751. Wait queue: 24431.', '2018-06-05 12:01:02 KRAT [24431]: [17-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 11215. Wait queue: 24431.', '2018-06-05 13:00:43 KRAT [19265]: [6-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 7261. Wait queue: 19265.', '2018-06-05 13:00:46 KRAT [19265]: [12-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 22841. Wait queue: 19265.', '2018-06-05 14:00:02 KRAT [13762]: [4-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 32362. Wait queue: 13762.', '2018-06-05 14:00:06 KRAT [13762]: [10-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 17264. Wait queue: 13762.', '2018-06-05 14:00:34 KRAT [13762]: [16-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 28253. Wait queue: 13762.', '2018-06-05 14:00:37 KRAT [13762]: [22-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 14070. Wait queue: 13762.', '2018-06-05 14:12:13 KRAT [30079]: [702-1] db=lsd,appname=192.168.4.72,user=n2o,client=192.168.5.111 DETAIL: Process holding the lock: 4418. Wait queue: .', '2018-06-05 15:00:02 KRAT [8188]: [4-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 17264. Wait queue: 8188.', '2018-06-05 15:00:07 KRAT [8188]: [10-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 1981. Wait queue: 8188.', '2018-06-05 15:00:40 KRAT [8188]: [16-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Processes holding the lock: 13534, 27378. Wait queue: 8188.', '2018-06-05 15:01:16 KRAT [8188]: [22-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 16850. Wait queue: 8188.', '2018-06-05 16:00:02 KRAT [2756]: [4-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Processes holding the lock: 1981, 25953. Wait queue: 2756.', '2018-06-05 16:00:07 KRAT [2756]: [10-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 23839. Wait queue: 2756.', '2018-06-05 16:00:34 KRAT [2756]: [16-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 8448. Wait queue: 2756.', '2018-06-05 16:01:07 KRAT [2756]: [22-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 6456. Wait queue: 2756.', '2018-06-05 17:00:02 KRAT [29439]: [4-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 23839. Wait queue: 29439.', '2018-06-05 17:00:29 KRAT [29439]: [11-1] db=lsd,appname=psql,user=postgres,client=[local] DETAIL: Process holding the lock: 3378. Wait queue: 29439.' ]) as txt), analyze_lock as ( select substring(txt, 1,19)::timestamp without time zone as ttime, split_part(split_part(txt, 'holding the lock:', 2), '. Wait', 1) as holding_lock, split_part(split_part(txt, 'Wait queue: ', 2), '.', 1) as wait_queue, txt from b ---where txt like '%3103%' --- здесь можно оганичить по пользователю, по процессу ), analyze_lock_normalize as ( --нормализованное предсталвение блокировок (все перечисленные через запятую разбиты на отдельные строки) SELECT ttime, unnest(string_to_array(regexp_replace(regexp_replace(cast((holding_lock) AS CHARACTER VARYING), '^\(',''), '\)$', ''), ','))::int AS holding_lock, unnest(string_to_array(regexp_replace(regexp_replace(cast((wait_queue) AS CHARACTER VARYING), '^\(',''), '\)$', ''), ','))::int AS wait_queue, txt from analyze_lock --WHERE ttime::time between '14:45:00'::time and '15:15:00'::time --- здесь можно оганичить по времени ) , --- обычная паррент таблица pp as ( select distinct holding_lock::int as id, null::int as parent_id, null::timestamp without time zone as ttime from analyze_lock_normalize t where not exists (select from analyze_lock_normalize t2 where t2.wait_queue = t.holding_lock ) union all select distinct wait_queue::int as h_id, holding_lock::int as parrent_id, ttime from analyze_lock_normalize t --where not exists (select from analyze_lock t2 where t2.wait_queue like '%'||t.holding_lock||'%' ) ), ltr as ( WITH RECURSIVE ltr_p AS ( SELECT id, parent_id, ttime, id::text::ltree as ltree_lock, id::text::ltree as main_lock, 1 as llevel FROM pp WHERE parent_id IS NULL UNION ALL SELECT n.id, n.parent_id, n.ttime, r.ltree_lock || (n.id::text::ltree), r.main_lock, r.llevel + 1 FROM pp n JOIN ltr_p r ON n.parent_id = r.id where r.llevel < 10 -- ограничиваем глубину погружения в локи (главное понять от кого они идут а не кого блокируют) ) SELECT *, min(ttime) over (partition by main_lock) as main_lock_time, max(ttime) over (partition by main_lock) as main_lock_time_max FROM ltr_p ORDER BY ltree_lock ) select replace(ltree_lock::text,'.',' --> ') as ltree_lock, min(ttime)/*-INTERVAL '5h'*/ as time_lock, main_lock, main_lock_time/*-INTERVAL '5h'*/, main_lock_time_max/*-INTERVAL '5h'*/ from ltr group by 1, 3, 4, 5 order by 4, 3, min(ttime) nulls first, 1 --- you can change " -INTERVAL '5h'" to any suitable value
Result is:
10040 10040 06.04.2018 4:38 06.04.2018 4:38 10040 --> 10018 06.04.2018 4:38 10040 06.04.2018 4:38 06.04.2018 4:38 10040 --> 10100 06.04.2018 4:38 10040 06.04.2018 4:38 06.04.2018 4:38 10040 --> 27005 06.04.2018 4:38 10040 06.04.2018 4:38 06.04.2018 4:38 20930 20930 06.04.2018 5:07 06.04.2018 5:07 20930 --> 21075 06.04.2018 5:07 20930 06.04.2018 5:07 06.04.2018 5:07 8548 8548 06.04.2018 6:41 06.04.2018 6:41 8548 --> 30636 06.04.2018 6:41 8548 06.04.2018 6:41 06.04.2018 6:41 27429 27429 06.04.2018 7:01 06.04.2018 7:01 27429 --> 3837 06.04.2018 7:01 27429 06.04.2018 7:01 06.04.2018 7:01 30857 30857 06.04.2018 7:02 06.04.2018 7:02 30857 --> 3837 06.04.2018 7:02 30857 06.04.2018 7:02 06.04.2018 7:02 10219 10219 06.04.2018 7:49 06.04.2018 7:57 10219 --> 31116 06.04.2018 7:49 10219 06.04.2018 7:49 06.04.2018 7:57 10219 --> 31116 --> 9138 06.04.2018 7:53 10219 06.04.2018 7:49 06.04.2018 7:57 10219 --> 31116 --> 31119 06.04.2018 7:55 10219 06.04.2018 7:49 06.04.2018 7:57 10219 --> 31116 --> 31119 --> 3410 06.04.2018 7:56 10219 06.04.2018 7:49 06.04.2018 7:57 10219 --> 31116 --> 9138 --> 31118 06.04.2018 7:57 10219 06.04.2018 7:49 06.04.2018 7:57 30944 30944 06.04.2018 7:56 06.04.2018 7:56 30944 --> 31114 06.04.2018 7:56 30944 06.04.2018 7:56 06.04.2018 7:56 31110 31110 06.04.2018 7:56 06.04.2018 7:56 31110 --> 12204 06.04.2018 7:56 31110 06.04.2018 7:56 06.04.2018 7:56 31110 --> 12204 --> 3411 06.04.2018 7:56 31110 06.04.2018 7:56 06.04.2018 7:56 15286 15286 06.04.2018 8:03 06.04.2018 8:03 15286 --> 13436 06.04.2018 8:03 15286 06.04.2018 8:03 06.04.2018 8:03 18662 18662 06.04.2018 8:04 06.04.2018 8:04 18662 --> 13436 06.04.2018 8:04 18662 06.04.2018 8:04 06.04.2018 8:04 22066 22066 06.04.2018 8:04 06.04.2018 8:04 22066 --> 13436 06.04.2018 8:04 22066 06.04.2018 8:04 06.04.2018 8:04 28885 28885 06.04.2018 9:02 06.04.2018 9:02 28885 --> 22676 06.04.2018 9:02 28885 06.04.2018 9:02 06.04.2018 9:02
as you can see the PID 10219 is the cause of cascade locks
4) also you can check what that PID do before locking
[postgres@p38rmisdb01 postgres]$ grep -i ‘\[10219\]’ postgresql-2018-04-06.log
ps: thx to Amir for material !