开发者

How I do a select a limited number of rows with the same attribute?

开发者 https://www.devze.com 2023-03-17 06:15 出处:网络
I have a table of URLs and one attribute is the domain. Supposing I have 100 URLs from Google, 100 from Facebook, 100 from Ebay and the same for others domains, but I want to retrieve the first 30 UR

I have a table of URLs and one attribute is the domain.

Supposing I have 100 URLs from Google, 100 from Facebook, 100 from Ebay and the same for others domains, but I want to retrieve the first 30 URLs from Google, 30 from Faceboo开发者_高级运维k, 30 from Ebay and 30 from the others limiting to a max of 500 URLs.

How I do this?


The following SQL resolves my case, but the urls are out of order because the row_number don't follow the orders. I think this SQL need some improvement.

SELECT url,row_number FROM(
     SELECT url,row_number() OVER (PARTITION BY domain) FROM website
     WHERE domain IN
    (SELECT DISTINCT domain FROM link)  
) AS links 
WHERE row_number <= 10
LIMIT 25


How about something like this:

SELECT url FROM link WHERE domain='Google' LIMIT 30
UNION
SELECT url FROM link WHERE domain='Facebook' LIMIT 30
UNION
SELECT ...

etc.


My solution in PL/pgSQL based. I found a better way, that you don't need additional temporary table (using INSERT to that table in function's body), that is you can use that function to get result directly:

CREATE OR REPLACE FUNCTION getURLs(singleLimit integer, totalLimit integer)
RETURNS SETOF RECORD AS $$
DECLARE
    dom text;
    nrOfDomains integer;
    i integer;
    lim integer;
    remainder integer;
BEGIN
    nrOfDomains := totalLimit/singleLimit; -- integer division (truncate)
    remainder := totalLimit%singleLimit;

    IF remainder <> 0 THEN
        nrOfDomains := nrOfDomains + 1;
    END IF;

    i := 1;
    FOR dom IN SELECT DISTINCT domain FROM website LIMIT nrOfDomains
    LOOP
        IF i = nrOfDomains AND remainder <> 0 THEN
            lim := remainder;
        ELSE
            lim := singleLimit;
        END IF;

        RETURN QUERY SELECT * FROM website WHERE domain = dom LIMIT lim;

        i := i + 1;
    END LOOP;
    RETURN;
END $$
LANGUAGE 'plpgsql';

Here is some test-drive:

postgres=> CREATE TABLE website(url text, domain text);
CREATE TABLE

postgres=> INSERT INTO website
    SELECT 'http://' || d.column1  ||'/' || n, d.column1
    FROM generate_series(1, 100) n CROSS JOIN
    (VALUES ('google'), ('facebook'), ('ebay')) d;
INSERT 0 300

postgres=> SELECT * FROM getURLs(10, 25) website(url text, domain text);

Result:

        url         |  domain
--------------------+----------
 http://google/1    | google
 http://google/2    | google
 http://google/3    | google
 http://google/4    | google
 http://google/5    | google
 http://google/6    | google
 http://google/7    | google
 http://google/8    | google
 http://google/9    | google
 http://google/10   | google
 http://facebook/1  | facebook
 http://facebook/2  | facebook
 http://facebook/3  | facebook
 http://facebook/4  | facebook
 http://facebook/5  | facebook
 http://facebook/6  | facebook
 http://facebook/7  | facebook
 http://facebook/8  | facebook
 http://facebook/9  | facebook
 http://facebook/10 | facebook
 http://ebay/1      | ebay
 http://ebay/2      | ebay
 http://ebay/3      | ebay
 http://ebay/4      | ebay
 http://ebay/5      | ebay
(25 rows)
0

精彩评论

暂无评论...
验证码 换一张
取 消