开发者

mysql where IN on large dataset or Looping?

开发者 https://www.devze.com 2023-02-06 11:44 出处:网络
I have the following scenario: Table 1: articles id article_textcategoryauthor_id 1\"hello world\"41 2 \"hi\"52

I have the following scenario:

 Table 1:

 articles

 id article_text  category  author_id
 1  "hello world"    4                1
 2 "hi"              5                2
 3 "wasup"           4                3


 Table 2

 authors

 id   name     friends_with
 1    "Joe"      "Bob"
 2    "Sue"      "Joe"
 3    "Fred"     "Bob"

I want to know the total number of authors that are friends with "Bob" for a given category.

So for example, for category 4 how many authors are there that are friends with "Bob".

The authors table is quite large, in some cases I have a million authors that are friends with "Bob"

So I have tried:

Get list of authors that are f开发者_开发问答riends with bob, and then loop through them and get the count for each of them of that given category and sum all those together in my code.

The issue with this approach is it can generate a million queries, even though they are very fast, it seems there should be a better way.

I was thinking of trying to get a list of authors that are friends with bob and then building an IN clause with that list, but I fear that would blow out the amt of memory allowed in the query set.

Seems like this is a common problem. Any ideas?

thanks


SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
WHERE friends_with = 'bob' AND art.category = 4

Count(Distinct a.id) is required as articles might hit multiple rows for each author.

But if you have any control over the database I would use a link table for friends_with as your cussrent solution either have to use a comma seperated list of names which will be disastrous for performance and require a completly different query or each author can only have one friend.

Friends

id friend_id

then the query would look like this

SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
INNER JOIN friends f ON auth.id = f.id 
INNER JOIN authors fauth ON fauth.id = f.friend_id
WHERE fauth.name = 'bob' AND art.category = 4

Its more complex but will allow for many friends, just remeber, this construct calls for 2 rows in friends for each pair, one from joe to bob and one from bob to joe.

You could build it differently but that would make the query even more complex.


Maybe something like

select fr.name, 
         fr.id, 
         au.name,
         ar.article_text,
         ar.category, 
         ar.author_id 
from authors fr, authors au, articles ar 
where fr.id = ar.author_id 
and au.friends_with = fr.name 
and ar.category = 4 ;

Just the count...

select count(distinct fr.name)  
from authors fr, authors au, articles ar 
where fr.id = ar.author_id 
and au.friends_with = fr.name 
and ar.category = 4 ;


A version without using joins (hopefully will work!)

SELECT count(distinct id) from authors where friends_with = 'Bob' and id in(select author_id from articles where category = 4)

I found it is easier to understand statements with 'IN' in when I started out with SQL.

0

精彩评论

暂无评论...
验证码 换一张
取 消