Selecting every Nth row per user in Postgres

Developer https://www.devze.com 2023-04-07 08:59 Source: Internet
I was using this SQL statement:

SELECT "dateId", "userId", "Salary"
FROM (
   SELECT *, 
          (row_number() OVER (ORDER BY "userId", "dateId"))%2 AS rn 
   FROM user_table
 ) sa 
 WHERE sa.rn=1 
   AND "userId" = 789 
   AND "Salary" > 0;

But every time the table gets new rows the result of the query is different.

Am I missing something?


I am assuming that ("dateId", "userId") is unique and that new rows always have a bigger (later) dateId.

After some comments:

What I think you need:

SELECT "dateId", "userId", "Salary"
FROM (
   SELECT "dateId", "userId", "Salary"
         ,(row_number() OVER (PARTITION BY "userId"   -- either this
                              ORDER BY "dateId")) % 2 AS rn
   FROM   user_table
   WHERE  "userId" = 789                              -- ... or that
   ) sub
WHERE  sub.rn = 1
AND    "Salary" > 0;

Notice the PARTITION BY. This way you skip every second dateId for each userId, and additional (later) rows don't change the selection so far.

Alternatively, as long as you are selecting rows for a single userId (WHERE "userId" = 789), you can pull that predicate into the subquery, which achieves the same effect (a stable selection for a single user). You don't need both.

The WHERE clause in the subquery only works for a single user; PARTITION BY works for any number of users in one query.
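
To illustrate the multi-user case, here is a minimal, self-contained sketch. The inline VALUES list standing in for user_table is made-up sample data for illustration only; just the column names come from the question.

```sql
-- Hypothetical sample data; this CTE shadows the real user_table for the demo.
WITH user_table ("dateId", "userId", "Salary") AS (
   VALUES (1, 789, 100)
        , (2, 789, 200)
        , (3, 789, 300)
        , (1, 790, 400)
        , (2, 790, 500)
   )
SELECT "dateId", "userId", "Salary"
FROM  (
   SELECT "dateId", "userId", "Salary"
        , (row_number() OVER (PARTITION BY "userId"   -- numbering restarts per user
                              ORDER BY "dateId")) % 2 AS rn
   FROM   user_table
   ) sub
WHERE  sub.rn = 1
AND    "Salary" > 0
ORDER  BY "userId", "dateId";
```

With this sample data the query keeps dateId 1 and 3 for user 789 and dateId 1 for user 790. Adding a later row such as (4, 789, ...) gives that row rn = 0 and leaves the rows already selected untouched.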

Is that it? Is it?
They should give me a "detective" badge for this.
Seriously.


No, that seems to be OK. When new rows arrive, they push the old rows to different positions after sorting.


If someone inserts a new row with a userId below 789, the order will change. For example, if you have:

userId rn
 1      1
 4      0
 5      1
 6      0

and you insert a row with userId = 2, the rn will change:

userId rn
 1      1
 2      0
 4      1
 5      0
 6      1
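
The shift shown in the two tables can be reproduced with a small self-contained query. This is only a sketch: the two CTEs (before_insert, after_insert) and the phase column are hypothetical names introduced here, holding the userId values from the tables above.

```sql
-- Hypothetical CTEs holding the userId values before and after inserting userId = 2.
WITH before_insert ("userId") AS (VALUES (1), (4), (5), (6))
   , after_insert  ("userId") AS (VALUES (1), (2), (4), (5), (6))
SELECT 'before' AS phase, "userId"
     , (row_number() OVER (ORDER BY "userId")) % 2 AS rn   -- no PARTITION BY: one global numbering
FROM   before_insert
UNION ALL
SELECT 'after', "userId"
     , (row_number() OVER (ORDER BY "userId")) % 2
FROM   after_insert
ORDER  BY phase DESC, "userId";   -- 'before' rows first, then 'after'
```

Every row that sorts after the inserted userId = 2 flips its rn value, which is why the original query returns different rows once the table grows.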

In order to select every Nth row you need a column with a sequence or a timestamp.
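
As a rough sketch of that idea for a general N, assuming "dateId" is the ever-increasing column and using made-up sample rows (the VALUES list and N = 3 are illustrative choices, not taken from the question):

```sql
-- Hypothetical sample data; "dateId" plays the role of the sequence/timestamp column.
WITH user_table ("dateId", "userId", "Salary") AS (
   VALUES (1, 789, 10), (2, 789, 20), (3, 789, 30)
        , (4, 789, 40), (5, 789, 50), (6, 789, 60), (7, 789, 70)
   )
SELECT "dateId", "userId", "Salary"
FROM  (
   SELECT "dateId", "userId", "Salary"
        , row_number() OVER (PARTITION BY "userId"
                             ORDER BY "dateId") AS rn
   FROM   user_table
   ) sub
WHERE  (sub.rn - 1) % 3 = 0      -- N = 3: keeps rows 1, 4, 7, ... per user
ORDER  BY "userId", "dateId";
```

With this sample data the query keeps dateId 1, 4 and 7. Because the numbering follows a column that only grows, rows added later get higher numbers and do not renumber the existing ones.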
