A Subquery Optimization

Background

A colleague wrote an API endpoint that involves the following three tables:

User
- id
- username
- created_utc
- ...

UserLoginInfo
- id
- user_id
- created_utc
- ...

Project
- id
- user_id
- name
- ...

The requirement: for each user, return the user itself, the most recent login time, and the number of projects the user created. A UserLoginInfo row represents one login; Project rows are the projects a user created, and a user may have many of them.

My colleague implemented it with subqueries. The Python (SQLAlchemy) version looks roughly like this:

from sqlalchemy import func

# Correlated scalar subquery: count of the user's non-deleted projects
user_projects_query = (
    db_session.query(func.count(Project.id).label("cnt"))
    .filter(
        Project.user_id == User.id,
        Project.deleted == 0,
    )
    .label("user_projects_cnt")
)
# Correlated scalar subquery: the user's most recent login time
login_utc_query = (
    db_session.query(UserLoginInfo.created_utc)
    .filter(UserLoginInfo.user_id == User.id)
    .order_by(UserLoginInfo.id.desc())
    .limit(1)
    .label("latest_login_time")
)
# Fall back to the account creation time for users who have never logged in
default_login_query = func.coalesce(login_utc_query, User.created_utc).label("default_login_time")

order_mapping = {
    "login_time": lambda desc: (default_login_query.desc(), User.id.desc()) if desc else (default_login_query, User.id),
    "user_projects": lambda desc: (user_projects_query.desc(), User.id.desc()) if desc else (user_projects_query, User.id),
    "created_utc": lambda desc: (User.created_utc.desc(), User.id.desc()) if desc else (User.created_utc, User.id),
}
query = db_session.query(User, user_projects_query, default_login_query).filter_by(deleted=0).order_by(*condition)

Here condition is the tuple chosen from order_mapping according to the requested sort field and the value of desc.
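For completeness, a minimal sketch of how condition might be picked; the sort_field and desc request parameters are hypothetical names, not part of the original code:

# Hypothetical request parameters, for illustration only.
sort_field = request.args.get("sort", "login_time")
desc = request.args.get("desc", "1") == "1"

# Look up the ORDER BY tuple; fall back to created_utc for unknown fields.
condition = order_mapping.get(sort_field, order_mapping["created_utc"])(desc)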

Translated into raw SQL, the Python code above becomes:

SELECT
    "user".id AS user_id, "user".username AS user_name,
    (SELECT count(project.id) AS cnt FROM project WHERE project.user_id = "user".id AND project.deleted = 0) AS user_projects_cnt,
    coalesce((SELECT user_login_info.created_utc FROM user_login_info WHERE user_login_info.user_id = "user".id ORDER BY user_login_info.id DESC LIMIT 1), "user".created_utc) AS default_login_time
FROM "user"
WHERE "user".deleted = 0 ORDER BY default_login_time DESC, "user".id DESC
LIMIT 10 OFFSET 0

Analysis

One glance at this implementation is enough to know it will cause trouble: it is built on correlated subqueries. With a small data set the cost stays hidden, but as soon as the data grows the query slows down dramatically. Correlated subqueries are generally hard for the query optimizer to rewrite, so the resulting plan is very likely a Nested Loop that re-runs the subquery for every outer row.

Below is the output of EXPLAIN ANALYSE in the test environment. Sure enough, a Nested Loop accounts for most of the time (note that the captured plans keep the real table names, which is why they mention document where the anonymized SQL above says project):

Sort  (cost=750000347885.10..750000347885.29 rows=75 width=566)
  Sort Key: ((SubPlan 2)) DESC, "user".id DESC
  ->  Bitmap Heap Scan on "user"  (cost=8.75..750000347882.77 rows=75 width=566)
        Recheck Cond: (ext_sys = 1)
        Filter: (deleted = 0)
        ->  Bitmap Index Scan on idx_user_ext_sys  (cost=0.00..8.73 rows=78 width=0)
              Index Cond: (ext_sys = 1)
        SubPlan 1
          ->  Aggregate  (cost=19.27..19.28 rows=1 width=8)
                ->  Bitmap Heap Scan on document  (cost=4.21..19.25 rows=5 width=4)
                      Recheck Cond: (user_id = "user".ext_id)
                      Filter: ((user_sys = 1) AND (deleted = 0))
                      ->  Bitmap Index Scan on idx_user_id  (cost=0.00..4.21 rows=8 width=0)
                            Index Cond: (user_id = "user".ext_id)
        SubPlan 2
          ->  Limit  (cost=10000004618.93..10000004618.94 rows=1 width=4)
                ->  Sort  (cost=10000004618.93..10000005141.43 rows=209000 width=4)
                      Sort Key: user_login_info.created_utc DESC
                      ->  Nested Loop  (cost=10000000000.15..10000003573.93 rows=209000 width=4)
                            ->  Seq Scan on user_login_info  (cost=10000000000.00..10000000942.95 rows=1045 width=4)
                                  Filter: (user_id = "user".id)
                            ->  Materialize  (cost=0.15..18.98 rows=200 width=0)
                                  ->  Index Only Scan using idx_document_user_sys on document document_1  (cost=0.15..17.98 rows=200 width=0)
                                        Index Cond: (user_sys = 1)

Planning time: 0.212 ms
Execution time: 229.674 ms

Solution

The first task, then, is to get rid of the correlated subqueries. The requirement needs two derived values per user: the most recent login time and the project count. A natural idea is to compute those two result sets first, then join them to User for pagination, so the subqueries no longer have to be nested inside the parent query. This is a classic subquery optimization: pull the correlated subquery up to the same level as the parent query so the planner can use a join instead of a Nested Loop.

To compute some data first and then reuse it later in the same statement, the tool is the CTE (Common Table Expression)! A CTE essentially lets you define a temporary, named result set with a subquery and then reference it anywhere in the statement that follows.
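As a minimal, self-contained illustration of SQLAlchemy's .cte() API (the Employee/Dept models here are hypothetical placeholders, not part of the original code):

from sqlalchemy import func

# Define the temporary result set: one row per department with a headcount.
# SQLAlchemy renders this as: WITH dept_counts AS (SELECT ... GROUP BY ...)
dept_counts = (
    db_session.query(Employee.dept_id, func.count(Employee.id).label("cnt"))
    .group_by(Employee.dept_id)
    .cte("dept_counts")
)

# The CTE can then be referenced like an ordinary table via its .c collection.
rows = (
    db_session.query(Dept.name, func.coalesce(dept_counts.c.cnt, 0))
    .outerjoin(dept_counts, dept_counts.c.dept_id == Dept.id)
    .all()
)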

The implementation then looks like this:

# CTE 1: per-user count of non-deleted projects
user_projects_query = db_session.query(
    Project.user_id, func.count(Project.id).label("cnt")
).filter(
    Project.deleted == 0,
).group_by(Project.user_id).cte("user_projects_query")
project_count = func.coalesce(user_projects_query.c.cnt, 0).label("user_projects_cnt")

# CTE 2: per-user most recent login time
login_utc_query = db_session.query(
    UserLoginInfo.user_id, func.max(UserLoginInfo.created_utc).label("login_utc")
).group_by(UserLoginInfo.user_id).cte("login_utc_query")
# Fall back to the account creation time for users who have never logged in
login_utc = func.coalesce(login_utc_query.c.login_utc, User.created_utc).label("login_utc")

order_mapping = {
    "login_time": lambda desc: (login_utc.desc(), User.id.desc()) if desc else (login_utc, User.id),
    "user_projects": lambda desc: (project_count.desc(), User.id.desc()) if desc else (project_count, User.id),
    "created_utc": lambda desc: (User.created_utc.desc(), User.id.desc()) if desc else (User.created_utc, User.id),
}

# LEFT OUTER JOIN both CTEs onto User so users without projects or logins are kept
query = db_session.query(
    User, project_count, login_utc
).join(
    user_projects_query, user_projects_query.c.user_id == User.id, isouter=True
).join(
    login_utc_query, login_utc_query.c.user_id == User.id, isouter=True
).filter(
    User.deleted == 0,
).order_by(*condition)
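The LIMIT/OFFSET in the raw SQL below comes from pagination applied on top of query; a rough sketch, with hypothetical page/page_size parameters:

# Hypothetical pagination parameters, for illustration only.
page, page_size = 1, 20

rows = query.limit(page_size).offset((page - 1) * page_size).all()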

The corresponding raw SQL:

WITH user_projects_query AS
(SELECT project.user_id AS user_id, count(project.id) AS cnt
FROM project
WHERE project.deleted = 0 GROUP BY project.user_id
),
login_utc_query AS
(SELECT user_login_info.user_id AS user_id, max(user_login_info.created_utc) AS login_utc
FROM user_login_info GROUP BY user_login_info.user_id
)
SELECT "user".id AS user_id, "user".username AS username, coalesce(user_projects_query.cnt, 0) AS user_projects_cnt, coalesce(login_utc_query.login_utc, "user".created_utc) AS login_utc
FROM "user" LEFT OUTER JOIN user_projects_query ON user_projects_query.user_id = "user".id LEFT OUTER JOIN login_utc_query ON login_utc_query.user_id = "user".id
WHERE "user".deleted = 0
ORDER BY login_utc DESC, "user".id DESC
LIMIT 20 OFFSET 0

Running EXPLAIN ANALYSE in the test environment:

Limit  (cost=2832.33..2832.38 rows=20 width=28) (actual time=11.156..11.162 rows=20 loops=1)
  CTE user_projects_query
    ->  HashAggregate  (cost=31.53..31.88 rows=35 width=12) (actual time=0.103..0.108 rows=26 loops=1)
          Group Key: document.user_id
          ->  Bitmap Heap Scan on document  (cost=9.69..30.69 rows=168 width=8) (actual time=0.018..0.074 rows=181 loops=1)
                Recheck Cond: (user_sys = 1)
                Filter: (deleted = 0)
                Rows Removed by Filter: 29
                Heap Blocks: exact=18
                ->  Bitmap Index Scan on idx_document_user_sys  (cost=0.00..9.65 rows=200 width=0) (actual time=0.013..0.013 rows=211 loops=1)
                      Index Cond: (user_sys = 1)
  CTE login_utc_query
    ->  GroupAggregate  (cost=0.29..2760.73 rows=46 width=8) (actual time=1.135..10.853 rows=70 loops=1)
          Group Key: user_login_info.user_id
          ->  Index Scan using idx_user_login_info_user_id on user_login_info  (cost=0.29..2515.61 rows=48932 width=8) (actual time=0.005..5.915 rows=48804 loops=1)
  ->  Sort  (cost=39.73..39.99 rows=107 width=28) (actual time=11.155..11.158 rows=20 loops=1)
        Sort Key: (COALESCE(login_utc_query.login_utc, "user".created_utc)) DESC, "user".ext_id DESC
        Sort Method: top-N heapsort  Memory: 27kB
        ->  Hash Left Join  (cost=2.78..36.88 rows=107 width=28) (actual time=11.028..11.130 rows=107 loops=1)
              Hash Cond: ("user".id = login_utc_query.user_id)
              ->  Hash Left Join  (cost=1.28..34.54 rows=107 width=28) (actual time=0.137..0.210 rows=107 loops=1)
                    Hash Cond: ("user".id = user_projects_query.user_id)
                    ->  Index Scan using user_pkey on "user"  (cost=0.14..32.67 rows=107 width=20) (actual time=0.014..0.066 rows=107 loops=1)
                          Filter: (deleted = 0)
                          Rows Removed by Filter: 5
                    ->  Hash  (cost=0.70..0.70 rows=35 width=12) (actual time=0.119..0.119 rows=26 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 10kB
                          ->  CTE Scan on user_projects_query  (cost=0.00..0.70 rows=35 width=12) (actual time=0.104..0.115 rows=26 loops=1)
              ->  Hash  (cost=0.92..0.92 rows=46 width=8) (actual time=10.887..10.887 rows=70 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 11kB
                    ->  CTE Scan on login_utc_query  (cost=0.00..0.92 rows=46 width=8) (actual time=1.136..10.875 rows=70 loops=1)
Planning time: 0.218 ms
Execution time: 11.211 ms

Compared with the 229 ms of the subquery version, the CTE version now runs in about 11 ms.

Summary

Don't overuse subqueries in SELECT statements.

Use a subquery only when it is genuinely needed, and when you do, think about how it merges into the main body of the SQL statement. A subquery is not an independent black box of data; it should be designed together with the enclosing query.

Where a subquery appears, a CTE can often replace it. The core idea is to lift the subquery up to the same level as the parent query so the planner can use a join instead of a Nested Loop; as a bonus, the CTE version is usually more readable.

In short, be careful whenever a single SQL statement contains several subqueries, and remember to EXPLAIN ANALYSE.
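As a practical aside, one way (among others) to feed a SQLAlchemy query to EXPLAIN ANALYSE during development is to compile it with literal bind values first; a rough sketch:

from sqlalchemy import text

# Render the query with inlined bind values so it can be run or pasted as-is.
compiled = query.statement.compile(compile_kwargs={"literal_binds": True})

# Note: EXPLAIN ANALYZE actually executes the statement, so only run it
# against data you can afford to touch.
for row in db_session.execute(text(f"EXPLAIN ANALYZE {compiled}")):
    print(row[0])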