
How to find posts with multiple tags

Developer https://www.devze.com 2023-03-22 04:22 Source: Internet

I have a very simple tag model on Rails with postgresql:

class Tag < ActiveRecord::Base
  has_many :taggings
  has_many :posts, :through => :taggings,
                   :source => :tagged, :source_type => 'Post'
end

class Tagging < ActiveRecord::Base
  belongs_to :tag 
  belongs_to :tagged, :polymorphic   => true  
end

class Post < ActiveRecord::Base
  has_many :taggings, :as => :tagged
  has_many :tags, :through => :taggings 
end

Is there an easy way to find all posts that have two or more specified tags? For example, let's say there are the tags "style", "men", "women", and "sale". I'd like to create a generic find statement that takes an array of tags. If the input is ["style"], it should return all posts with that tag (easy). If the input is ["style", "men"], it should return all posts tagged with both "style" AND "men".


The classic way is to use a pivot table: posts <-> posts_tags <-> tags

You should still store your tags like this, because it is the easiest way (it maintains integrity and foreign keys, gives you an easy-to-scan list of tags, etc.).

This approach performs decently for a small number of posts and a small number of tags, but it is cumbersome to query (you'll need some aggregation, INTERSECT, or one JOIN per tag) and extremely slow if the tags are not very selective.
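For reference, the aggregation variant could look roughly like the sketch below. It only builds the SQL and its bind values, so it runs without a database. The helper name posts_with_all_tags_sql and the tags.name column are assumptions made for illustration; the taggings columns follow the usual Rails naming for the models in the question.

```ruby
# Sketch: build a "posts having ALL of these tags" query via GROUP BY / HAVING.
# Assumes tags has a `name` column (not shown in the question).
def posts_with_all_tags_sql(tag_names)
  names = tag_names.uniq
  raise ArgumentError, 'need at least one tag' if names.empty?

  # One "?" placeholder per distinct tag name
  placeholders = (['?'] * names.size).join(', ')

  sql = <<-SQL
    SELECT posts.*
    FROM posts
    JOIN taggings ON taggings.tagged_id = posts.id
                 AND taggings.tagged_type = 'Post'
    JOIN tags ON tags.id = taggings.tag_id
    WHERE tags.name IN (#{placeholders})
    GROUP BY posts.id
    HAVING COUNT(DISTINCT tags.id) = #{names.size}
  SQL

  # Array form: SQL first, then the bind values
  [sql, *names]
end
```

In a Rails app the result could be handed to Post.find_by_sql(posts_with_all_tags_sql(["style", "men"])). The HAVING clause is what turns "any of these tags" into "all of these tags": a post only survives if it matched as many distinct tags as were asked for.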

Obviously, for the kind of searches you want to perform, this sucks. So you have two choices:

1- Materialize the list of tag ids of a post inside an INTEGER[] column in your posts table, put a GiST (or GIN) index on it, and use the integer array containment operators (@> "contains", <@ "is contained by"), which are indexed, extremely fast, and trivial to query.

2- Just store your tags as text and put a full-text index on them.

Both are extremely fast, with an advantage to the integer array.
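A minimal sketch of option 1 from the Rails side, assuming a hypothetical integer[] column named tag_ids on posts (with something like CREATE INDEX ... USING gin (tag_ids) behind it). The helper name is made up for illustration. It only builds the condition string, so it can be tried without a database.

```ruby
# Sketch: condition matching posts whose tag_ids array contains ALL given ids.
# `tag_ids` is an assumed integer[] column; it is not part of the question's schema.
def tag_ids_contain_sql(tag_ids)
  # Integer() rejects anything that is not a whole number, so the
  # interpolated array literal can only ever contain digits and commas.
  ids = tag_ids.map { |id| Integer(id) }.uniq
  "tag_ids @> '{#{ids.join(',')}}'::integer[]"
end
```

The string could then be used as a plain condition, e.g. Post.all(:conditions => tag_ids_contain_sql([3, 7])). Note that an empty input produces '{}', which every row contains, so callers may want to handle that case separately.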


I could write some really ugly SQL here that does JOINs and GROUP BYs, but this is Rails, so you can do better. First, your Post model should be defined like this:

class Post < ActiveRecord::Base
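  # NOTE: Rails maintains the counter from the belongs_to side, so
  # Tagging's `belongs_to :tagged` needs :counter_cache => true as well.
  has_many :taggings, :as => :tagged, :counter_cache => true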
  has_many :tags, :through => :taggings 
end

And you'll need a migration to add the taggings_count column to your posts table:

add_column :posts, :taggings_count, :integer, :default => 0
add_index :posts, :taggings_count

With that in place, whenever a Tagging is created for a Post, the taggings_count value is incremented, and you can use it in your queries to efficiently find posts with two or more tags:

Post.all(:conditions => ['taggings_count >= ?', 2])