开发者

How do I write a named scope to filter by all of an array passed in, and not just by matching one element (using IN)

开发者 https://www.devze.com 2023-01-07 06:20 出处:网络
I have two models, Project and Category, which have a many-to-many relationship between them. The Project model is very simple:

I have two models, Project and Category, which have a many-to-many relationship between them. The Project model is very simple:

class Project < ActiveRecord::Base
  has_and_belongs_to_many :categories

  scope :in_categories, lambda { |categories|
    joins(:categories).
    where("categories.id in (?)", categories.collect(&:to_i))
  }
end

The :in_categories scope takes an array of Category IDs (as strings), so using this scope I can get back every project that belongs to at least one of the categories passed in.

But what I'm actua开发者_开发技巧lly trying to do is filter (a better name would be :has_categories). I want to just get the projects that belong to all of the categories passed in. So if I pass in ["1", "3", "4"] I only want to get the projects that belong to all of the categories.


There are two common solutions in SQL to do what you're describing.

Self-join:

SELECT ...
FROM Projects p
JOIN Categories c1 ON c1.project_id = p.id
JOIN Categories c3 ON c3.project_id = p.id
JOIN Categories c4 ON c4.project_id = p.id
WHERE (c1.id, c3.id, c4.id) = (1, 3, 4);

Note I'm using syntax to compare tuples. This is equivalent to:

WHERE c1.id = 1 AND c3.id = 3 AND c4.id = 4;

In general, the self-join solution has very good performance if you have a covering index. Probably Categories.(project_id,id) would be the right index, but analyze the SQL with EXPLAIN to be sure.

The disadvantage of this method is that you need four joins if you're searching for projects that match four different categories. Five joins for five categories, etc.

Group-by:

SELECT ...
FROM Projects p
JOIN Categories cc ON c.project_id = p.id
WHERE c.id IN (1, 3, 4)
GROUP BY p.id
HAVING COUNT(*) = 3;

If you're using MySQL (I assume you are), most GROUP BY queries invoke a temp table and this kills performance.

I'll leave it as an exercise for you to adapt one of these SQL solutions to equivalent Rails ActiveRecord API.


It seems like in ActiveRecord you would do it like so:

scope :has_categories, lambda { |categories|
  joins(:categories).
  where("categories.id in (?)", categories.collect(&:to_i)).
  group("projects.id HAVING COUNT(projects.id) = #{categories.count}")
}
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号