Background
My program is storing a series of objects, a set of tags, and the many-to-many associations between tags and objects, in a MySQL database. To give you an idea of the structure:
CREATE TABLE objects (
object_id INT PRIMARY KEY,
...
);
CREATE TABLE tags (
tag_name VARCHAR(32) NOT NULL
);
CREATE TABLE object_tags (
object_id INT NOT NULL,
tag_name VARCHAR(32) NOT NULL,
PRIMARY KEY (object_id, tag_name)
);
Problem
I want to be able to query for all objects that are tagged with all of the tags in a given set. As an example, let's say I have a live tree, a dead flower, an orangutan, and a ship as my objects, and I want to query all of those tagged living and plant. I expect to receive a list containing only the tree, assuming the tags match the characteristics of the objects.
Current Solution
Presently, given a list of tags T1, T2, ..., Tn, I am solving the problem as follows:
- Select all
object_id
columns from theobject_tags
table wheretag_name
is T1. - Join the result of (1) with the
object_tags
table, and select allobject_id
columns wheretag_name
is T2. - Join the result of (2) with the
object_tags
table again, and select allobject_id
columns wheretag_name
is T3. - Repeat as necessary for T4, ..., Tn.
- Join t开发者_StackOverflow社区he result of (4) with the
objects
table and select the additional columns of the objects that are needed.
In practice (using Java), I start with the query string for the first tag, then prepend/append the string parts for the second tag, and so on in a loop, before finally prepending/appending the string parts that make up the overall query. Only then does the string actually get passed into a PreparedStatement
and get executed on the server.
Edit: Expanding on my example from above, using this solution I would issue the following query:
SELECT object_id FROM object_tags JOIN (
SELECT object_id FROM object_tags WHERE tag_name='living'
) AS _temp USING (object_id) WHERE tag_name='plant';
Question
Is there a better solution to this problem? Although the number of tags is not likely to be large, I am concerned about the performance of this solution, especially as the database grows in size. Furthermore, it is very difficult to read and maintain the code, especially when the additional concerns/constraints of the application are thrown in.
I am open to suggestions at any level, although the languages (MySQL and Java) are not variables at this point.
I don't know about the performance of this solution, but you can simplify by using pattern matching in MySql to match a set of pipe-delimited tags (or any delimiter). This is a solution I've used before for similar applications with tag tables (@match would be a variable passed in by your Java code, I've harded coded a value for demonstration):
set @match = 'living|plant';
set @numtags =
length(@match) - length(replace(@match, '|', '')) + 1;
select * from objects o
where @numtags =
(
select count(*) from object_tags ot
where concat('|',@match,'|')
like concat('%|',ot.tag_name,'|%')
and ot.object_id = o.object_id
)
Here is a working demo: http://sqlize.com/0vP6DgQh0j
精彩评论