开发反馈一个简单的count语句接口压测超时,查看到该表上原本作为条件的索引被disable了,加上表中数据倾斜严重,导致不论传参在表中分布情况如何都会走全表扫
测试表
taria=
CREATE TABLE
taria=
CREATE INDEX
taria=
Table "public.skewtest"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
id | integer | | | | plain | |
record | character varying(50) | | | | extended | |
Indexes:
"idx_record" btree (record)
Access method: heap
造数据
ALTER TABLE skewtest ALTER COLUMN record TYPE varchar(50);
taria=
INSERT 0 100
taria=
id | record
0 | 515e2288a1f4472c24b1a5374c9ddc9a
1 | 4c4d0a0c541f4fc2ff017e621a0c291b
2 | a74a4a39a08649578319cc7066e404b5
3 | 68bd3b4c42b2143d88204e663022021c
4 | 1d81e8daf0f0e6c605a1e80aa271a41f
5 | f047ab023c79015541e4a4c5a35f915b
6 | 3878cf3b8774689868e04dbbe269e7c3
7 | 5635dc3b9306943b1711c20492ee132d
8 | 575677864bb988f1d3520ddc9d65348c
9 | c651c43aacf7fbece4be09bb733bf3c1
(10 rows)
taria=
INSERT 0 3350010
taria=
count
3350110
(1 row)
普通数据走索引
Aggregate (cost=4.58..4.58 rows=1 width=8) (actual time=0.148..0.148 rows=1 loops=1)
-> Index Only Scan using idx_record on skewtest (cost=0.56..4.57 rows=1 width=0) (actual time=0.139..0.141 rows=1 loops=1)
Index Cond: (record = '3878cf3b8774689868e04dbbe269e7c3'::text)
Heap Fetches: 0
Planning Time: 0.096 ms
Execution Time: 0.182 ms
(6 rows)
update pg_index set indisvalid = true where indexrelid ='idx_record':: regclass ;
强制普通数据走全表
Aggregate (cost=46366.59..46366.60 rows=1 width=8) (actual time=220.071..220.071 rows=1 loops=1)
-> Gather (cost=1000.00..46366.59 rows=1 width=0) (actual time=0.189..223.421 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on skewtest (cost=0.00..45366.49 rows=1 width=0) (actual time=142.571..215.831 rows=0 loops=3)
Filter: ((record)::text = '3878cf3b8774689868e04dbbe269e7c3'::text)
Rows Removed by Filter: 1116703
Planning Time: 0.077 ms
Execution Time: 223.464 ms
倾斜数据走全表
explain analyze select count(*) from skewtest where record ='f047ab023c79015541e4a4c5a35f915b';
Finalize Aggregate (cost=49856.40..49856.41 rows=1 width=8) (actual time=402.832..402.832 rows=1 loops=1)
-> Gather (cost=49856.19..49856.40 rows=2 width=8) (actual time=402.714..405.983 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=48856.19..48856.20 rows=1 width=8) (actual time=398.518..398.518 rows=1 loops=3)
-> Parallel Seq Scan on skewtest (cost=0.00..45366.49 rows=1395879 width=0) (actual time=0.062..284.255 rows=1116670 loops=3)
Filter: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Rows Removed by Filter: 33
Planning Time: 0.103 ms
Execution Time: 406.033 ms
(10 rows)
taria=
SET
这里不是不走全表扫,是走了个不带并发聚合的全表 更慢了
倾斜数据强制走索引
Aggregate (cost=10000078169.65..10000078169.66 rows=1 width=8) (actual time=1160.122..1160.122 rows=1 loops=1)
-> Seq Scan on skewtest (cost=10000000000.00..10000069794.38 rows=3350110 width=0) (actual time=0.049..820.449 rows=3350011 loops=1)
Filter: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Rows Removed by Filter: 99
Planning Time: 0.089 ms
Execution Time: 1160.159 ms
(6 rows)
差异在Partial Aggregate 因为count带来的并行聚合,可以用set max_parallel_workers控制并发
为了测试安个pg_hint_plan
大数据量3000w+
explain analyze select count(*) from skewtest where record ='f047ab023c79015541e4a4c5a35f915b';
Finalize Aggregate (cost=489543.90..489543.91 rows=1 width=8) (actual time=2038.704..2039.813 rows=1 loops=1)
-> Gather (cost=489543.69..489543.90 rows=2 width=8) (actual time=2038.634..2039.807 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=488543.69..488543.70 rows=1 width=8) (actual time=2034.316..2034.316 rows=1 loops=3)
-> Parallel Seq Scan on skewtest (cost=0.00..453647.74 rows=13958379 width=0) (actual time=0.056..1412.934 rows=11166670 loops=3)
Filter: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Rows Removed by Filter: 33
Planning Time: 0.314 ms
Execution Time: 2039.909 ms
(10 rows)
explain analyze select count(*) from skewtest where record ='f047ab023c79015541e4a4c5a35f915b';
QUERY PLAN
Finalize Aggregate (cost=825342.34..825342.35 rows=1 width=8) (actual time=2220.391..2227.868 rows=1 loops=1)
-> Gather (cost=825342.13..825342.34 rows=2 width=8) (actual time=2220.386..2227.864 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=824342.13..824342.14 rows=1 width=8) (actual time=2215.918..2215.919 rows=1 loops=3)
-> Parallel Index Scan using idx_record on skewtest (cost=0.56..789446.18 rows=13958379 width=0) (actual time=0.129..1593.646 rows=11166670 loops=3)
Index Cond: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Planning Time: 0.097 ms
Execution Time: 2227.973 ms
(9 rows)
小数据量33w+
taria=
count
335110
(1 row)
1 | c2876aa9a0642bbc86b6955731b35a19
1 | c3a7199d80054e67b798e889b0618af4
1 | c40cd2b1e0dcb8a6aae65a81e3ccfe58
1 | c433904380e4ef6075b8cd1e5d8536f8
1 | c64c7c448294aec005e8d00b4113e96b
1 | c7fbf598f403f62ca704da0059d4c31a
1 | c8c6b566ff4ccd52ae523c303b4f50be
1 | cd28ad7edb887b46d29c576cae60a012
1 | d0026ad8f3753e5c52e1e1dbb7746b56
1 | d582ed9a85d618e60251989e91820179
1 | d76ad7e27cc0fa261554167c4e358fa0
1 | d7c555e9e731b724de2e7a07d861339c
1 | dec495dc37e497e92b3ac7b38993737f
1 | e4101c5bcd989a9b0af539184d477643
1 | e933bf160e13a94c8f66101e4949b5f1
1 | e9b75293969bf33e70a0571b9cbcc208
1 | ecb4306a179958491a9d6b94d22c31ab
1 | ef5c7f2f06ecd9f9316d20af59ce81f3
335010 | f047ab023c79015541e4a4c5a35f915b
1 | f255cf5361158e5847d843ac200d6442
1 | fe78bb64b36914d41fc92f0b18245bcf
set max_parallel_workers =0;
explain analyze select count(*) from skewtest2 where record ='f047ab023c79015541e4a4c5a35f915b';
QUERY PLAN
Finalize Aggregate (cost=6749.80..6749.81 rows=1 width=8) (actual time=72.599..72.623 rows=1 loops=1)
-> Gather (cost=6749.69..6749.80 rows=1 width=8) (actual time=72.595..72.619 rows=1 loops=1)
Workers Planned: 1
Workers Launched: 0
-> Partial Aggregate (cost=5749.69..5749.70 rows=1 width=8) (actual time=72.477..72.478 rows=1 loops=1)
-> Parallel Seq Scan on skewtest2 (cost=0.00..5257.04 rows=197058 width=0) (actual time=0.015..49.267 rows=335010 loops=1)
Filter: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Rows Removed by Filter: 100
Planning Time: 0.119 ms
Execution Time: 72.644 ms
(10 rows)
explain analyze select count(*) from skewtest2 where record ='f047ab023c79015541e4a4c5a35f915b';
QUERY PLAN
Finalize Aggregate (cost=9883.59..9883.60 rows=1 width=8) (actual time=75.539..75.570 rows=1 loops=1)
-> Gather (cost=9883.48..9883.59 rows=1 width=8) (actual time=75.535..75.567 rows=1 loops=1)
Workers Planned: 1
Workers Launched: 0
-> Partial Aggregate (cost=8883.48..8883.49 rows=1 width=8) (actual time=75.417..75.417 rows=1 loops=1)
-> Parallel Index Scan using idx_record2 on skewtest2 (cost=0.42..8390.83 rows=197058 width=0) (actual time=0.019..51.857 rows=335010 loops=1)
Index Cond: ((record)::text = 'f047ab023c79015541e4a4c5a35f915b'::text)
Planning Time: 0.092 ms
Execution Time: 75.597 ms
没什么好总结,感觉就是全表好像没比走索引快多少(这个小数据集),根据表情况不同差距应该越明显,简单记录下。
|