Optimize the PostgreSQL query


I have a table like the one below.

id    session_id    start_time             answer_time
1     111           2022-12-06 13:40:50    2022-12-06 13:40:55
2     111           2022-12-06 13:40:51    Null
3     111           2022-12-06 13:40:57    Null
4     222           2022-12-06 13:40:58    Null
5     222           2022-12-06 13:41:10    Null
6     222           2022-12-06 13:41:10    Null
7     333           2022-12-06 13:46:10    2022-12-06 13:46:15
8     333           2022-12-06 13:46:18    2022-12-06 13:46:20

There are three sessions in the table, with session IDs 111, 222, and 333. Each session has multiple records that share the same session_id. Whether a session is successful or unsuccessful depends on whether answer_time is Null in the record with the smallest id for that session.

In the sample table above, the records with id 1, id 4, and id 7 determine whether each session is successful or unsuccessful.
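
For reproducing the example, here is a minimal sketch of the table and the sample rows above (the column types are assumptions, since the real DDL isn't shown):

CREATE TABLE logs (
    id          bigserial PRIMARY KEY,
    session_id  bigint    NOT NULL,
    start_time  timestamp NOT NULL,
    answer_time timestamp
);

INSERT INTO logs (session_id, start_time, answer_time) VALUES
    (111, '2022-12-06 13:40:50', '2022-12-06 13:40:55'),  -- smallest id of 111: success
    (111, '2022-12-06 13:40:51', NULL),
    (111, '2022-12-06 13:40:57', NULL),
    (222, '2022-12-06 13:40:58', NULL),                   -- smallest id of 222: fail
    (222, '2022-12-06 13:41:10', NULL),
    (222, '2022-12-06 13:41:10', NULL),
    (333, '2022-12-06 13:46:10', '2022-12-06 13:46:15'),  -- smallest id of 333: success
    (333, '2022-12-06 13:46:18', '2022-12-06 13:46:20');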

I have the SQL below to query it, and it works well.

WITH t AS
(
    SELECT DISTINCT ON (session_id) start_time, answer_time
    FROM logs
    WHERE (SELECT NOW() AT TIME ZONE 'UTC') - start_time < interval '24 HOURS'
    ORDER BY logs.session_id, id
)
SELECT
       COUNT(*) FILTER (WHERE answer_time IS NOT NULL) AS success_count,
       COUNT(*) FILTER (WHERE answer_time IS NULL) AS fail_count
FROM t;
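
On the sample rows above (assuming they all fall inside the 24-hour window), the expected result is:

 success_count | fail_count
---------------+------------
             2 |          1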

But if the DB table has about 50M records, the query takes 20 seconds, which is unacceptable in the production environment. How can I optimize it? My goal is less than 1 second for 50M records.
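
One direction that seems worth trying (not verified against the 50M-row table; index names are arbitrary, and INCLUDE requires PostgreSQL 11+) is to make the time filter compare start_time against a constant so an index can be used, and to add indexes matching both the filter and the DISTINCT ON ordering. Whether this reaches the 1-second target would need to be checked with EXPLAIN (ANALYZE, BUFFERS):

-- Index for the 24-hour window filter on start_time.
CREATE INDEX IF NOT EXISTS logs_start_time_idx ON logs (start_time);

-- Index matching DISTINCT ON (session_id) ... ORDER BY session_id, id;
-- INCLUDE lets the per-session lookup be answered from the index alone.
CREATE INDEX IF NOT EXISTS logs_session_first_idx
    ON logs (session_id, id) INCLUDE (start_time, answer_time);

-- Equivalent filter written so start_time is compared to a constant.
EXPLAIN (ANALYZE, BUFFERS)
WITH t AS
(
    SELECT DISTINCT ON (session_id) start_time, answer_time
    FROM logs
    WHERE start_time > NOW() AT TIME ZONE 'UTC' - interval '24 hours'
    ORDER BY logs.session_id, id
)
SELECT
       COUNT(*) FILTER (WHERE answer_time IS NOT NULL) AS success_count,
       COUNT(*) FILTER (WHERE answer_time IS NULL) AS fail_count
FROM t;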
