Omitting Wikipedia maintenance categories from SQL query

Developer https://www.devze.com 2023-02-07 04:21 Source: Internet
I am retrieving a list of categories for a given page_title, but the results include categories such as:

"All_articles_to_be_split"

"Articles_with_unsourced_statements_from_July_2008"

"All_articles_with_specifically-marked_weasel-worded_phrases"

...etc. I want to omit these kinds of categories, which exist only for maintenance.

Here is an example SQL query I am running:

SELECT categorylinks.cl_to 
  FROM categorylinks 
  JOIN page ON categorylinks.cl_from = page.page_id 
           AND page.page_namespace = 0 
           AND page.page_title = "Ice_hockey";

What am I missing in my query to omit the maintenance categories? Or will I have to manually parse these out of my results? Thanks.


I just did it manually like this:

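-- Join condition stays in ON; row filters move to WHERE (same result for an inner join).
SELECT categorylinks.cl_to
FROM categorylinks
JOIN page ON categorylinks.cl_from = page.page_id
WHERE page.page_namespace = 0
  AND page.page_title = 'Ice_hockey'
  -- Crude substring filter for maintenance categories; both capitalizations are listed.
  AND cl_to NOT LIKE '%Article%'
  AND cl_to NOT LIKE '%article%'
  AND cl_to NOT LIKE '%Wikipedia%'
  AND cl_to NOT LIKE '%redirect%'
  AND cl_to NOT LIKE '%Redirect%'
  AND cl_to NOT LIKE '%page%';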
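
A pattern-based filter like the one above can also drop legitimate categories whose names happen to contain "page" or "article". As a possible alternative, here is a minimal sketch that relies on MediaWiki's hidden-category flag instead of name matching. It assumes that the page_props table has been imported alongside page and categorylinks, and that the maintenance categories in question are marked as hidden (stored as pp_propname = 'hiddencat'). Not every maintenance category is necessarily hidden, so the result may still need a small manual filter. The aliases cl, p, cat, and pp are only for illustration.

```sql
-- Sketch: list the categories of an article, skipping hidden categories.
-- Assumes the page_props table is available in the imported dump.
SELECT cl.cl_to
FROM categorylinks AS cl
JOIN page AS p
  ON cl.cl_from = p.page_id
-- Look up the category's own page (namespace 14 is "Category").
LEFT JOIN page AS cat
  ON cat.page_namespace = 14
 AND cat.page_title = cl.cl_to
-- A matching 'hiddencat' row means the category is flagged as hidden.
LEFT JOIN page_props AS pp
  ON pp.pp_page = cat.page_id
 AND pp.pp_propname = 'hiddencat'
WHERE p.page_namespace = 0
  AND p.page_title = 'Ice_hockey'
  AND pp.pp_page IS NULL;  -- keep only categories without the hidden flag
```

The LEFT JOINs keep categories that have no category page of their own; only categories with an explicit hidden flag are excluded.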