开发者

Are you able to use a custom Postgres comparison function for ORDER BY clauses?

开发者 https://www.devze.com 2023-01-15 13:23 出处:网络
In Python, I can write a sort comparison function which returns an item in the set {-1, 0, 1} and pass it to a sort function like so:

In Python, I can write a sort comparison function which returns an item in the set {-1, 0, 1} and pass it to a sort function like so:

sorted(["some","data","with","a","nonconventional","sort"], custom_function)

This code will sort the sequence 开发者_JAVA百科according to the collation order I define in the function.

Can I do the equivalent in Postgres?

e.g.

SELECT widget FROM items ORDER BY custom_function(widget)

Edit: Examples and/or pointers to documentation are welcome.


Yes you can, you can even create an functional index to speed up the sorting.

Edit: Simple example:

CREATE TABLE foo(
    id serial primary key,
    bar int
);
-- create some data
INSERT INTO foo(bar) SELECT i FROM generate_series(50,70) i;
-- show the result
SELECT * FROM foo;

CREATE OR REPLACE FUNCTION my_sort(int) RETURNS int 
LANGUAGE sql 
AS
$$
    SELECT $1 % 5; -- get the modulo (remainder)
$$;
-- lets sort!
SELECT *, my_sort(bar) FROM foo ORDER BY my_sort(bar) ASC;

-- make an index as well:
CREATE INDEX idx_my_sort ON foo ((my_sort(bar)));

The manual is full of examples how to use your own functions, just start playing with it.

  • SQL: http://www.postgresql.org/docs/current/static/xfunc-sql.html
  • PL/pgSQL: http://www.postgresql.org/docs/current/static/plpgsql.html


We can avoid confusion about ordering methods using names:

  • "score function" of standard SQL select * from t order by f(x) clauses, and
  • "compare function" ("sort function" in the question text) of the Python's sort array method.

The ORDER BY clause of PostgreSQL have 3 mechanisms to sort:

  1. Standard, using an "score function", that you can use also with INDEX.
  2. Special "standard string-comparison alternatives", by collation configuration
    (only for text, varchar, etc. datatypes).
  3. ORDER BY ... USING clause. See this question or docs example.
    Example: SELECT * FROM mytable ORDER BY somecol USING ~<~ where ~<~ is an operator, that is embedding a compare function.

Perhaps "standard way" in a RDBMS (as PostgreSQL) is not like Python's standard because indexing is the aim of a RDBMS, and it's easier to index score functions.

Answers to the question:

  • Direct solution. There are no direct way to use an user-defined function as compare function, like in the sort method of languages like Python or Javascript.

  • Indirect solution. You can use a user-defined compare function in an user-defined operator, and an user-defined operator class to index it. See at PostgreSQL docs:

    • CREATE OPERATOR with the compare function;

    • CREATE OPERATOR CLASS, to be indexable.


Explaining compare functions

In Python, the compare function looks like this:

def compare(a, b):
    return 1 if a > b else 0 if a == b else -1

The compare function use less CPU tham a score function.
It is usefull also to express order when score funcion is unknown.

See a complete description at

  • for C language see https://www.gnu.org/software/libc/manual/html_node/Comparison-Functions.html
  • for Javascript see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#Description

Other typical compare functions

Wikipedia's example to compare tuples:

function tupleCompare((lefta, leftb, leftc), (righta, rightb, rightc))
    if lefta ≠ righta
        return compare(lefta, righta)
    else if leftb ≠ rightb
        return compare(leftb, rightb)
    else
        return compare(leftc, rightc)

In Javascript:

function compare(a, b) {
  if (a is less than b by some ordering criterion) {
    return -1;
  }
  if (a is greater than b by the ordering criterion) {
    return 1;
  }
  // a must be equal to b
  return 0;
}

C++ example of PostgreSQL docs:

complex_abs_cmp_internal(Complex *a, Complex *b)
{
    double      amag = Mag(a),
                bmag = Mag(b);

    if (amag < bmag)
        return -1;
    if (amag > bmag)
        return 1;
    return 0;
}


You could do something like this

SELECT DISTINCT ON (interval_alias) *,
  to_timestamp(floor((extract('epoch' FROM index.created_at) / 10)) * 10) AT
  TIME ZONE 'UTC' AS interval_alias
  FROM index
  WHERE index.created_at >= '{start_date}'
  AND index.created_at <= '{end_date}'
  AND product = '{product_id}'
  GROUP BY id, interval_alias
ORDER BY interval_alias;

Firstly you define the parameter that will be your ordering column with AS. It could be function or any SQL expression. Then set it to ORDER BY expression and you're done!

In my opinion, this is the smoothest way to do such an ordering.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号