Although some major systems like Joomla store tags as comma-separated text in the main article database, normalized system of three tables as article, tags and tag-relationship is preferred (as others like Wordpress uses). There are lots of discussions and questions about structure and reading; but I was unable to find the best INSERT command, as we need to insert into three tables. How to quickly run this process through one SQL run? Or we need to first insert article, then each tags, and finally writing the relationships?
Another question is about the uniqueness of the tags. The main advantage of this system is that we only need to store each term only once (then connecting to corresponding articles). Is it practical to use mysql UNIQUE to avoid duplication? Or (as I read somewhere) we need to read the entire list of tags as an array to find any duplication to catch the tag ID and avoid storing the term?
Will the whole process as three individual steps:
- INSERT the article
- INSERT tags with UNIQUE but regardless of their relationship
- Finding each tag ID and make a relationship to the article I开发者_Go百科D
Am I right? The reason that I asked is that I saw people catch the tags as an array and make a comparison. To me it is very slow, and kills the performance, particularly for UPDATE.
You can only ever insert in one table at a time.
One solution is to use triggers, the other is to use a transaction.
The first can be used with any engine, the latter requires InnoDB or alike engine.
Make sure you put a UNIQUE
index on the field tag.name
.
1-Using transactions
START TRANSACTION;
INSERT IGNORE INTO tag (name) VALUES ('$example1', '$example2');
INSERT INTO article (title, body) VALUES ('$title','$body');
SET @article_id = LAST_INSERT_ID();
INSERT INTO tag_link (tag_id, article_id)
SELECT t.id, @article_id FROM tag t WHERE t.name IN ('$example1','$example2');
COMMIT;
2-Using a trigger on a blackhole table
Create a table of type blackhole
with the following fields.
title: varchar(255)
body: text
tag1: varchar(50) DEFAULT NULL
tag2: varchar(50) DEFAULT NULL
...
add as many tags as you want.
Add a AFTER INSERT
trigger to the blackhole table to do the actual storage for you.
DELIMITER $$
CREATE TRIGGER ai_bh_newacticle_each AFTER INSERT ON bh_newacticle FOR EACH ROW
BEGIN
INSERT IGNORE INTO tag (name) VALUES (new.tag1, new.tag2,...,new.tag25);
INSERT INTO article (title, body) VALUES (new.title,new.body);
SET @article_id = LAST_INSERT_ID();
INSERT INTO tag_link (tag_id, article_id)
SELECT t.id, @article_id FROM tag t
WHERE t.name IN (new.tag1, new.tag2,...,new.tag25);
END$$
DELIMITER ;
DELIMITER $$
Now you can just insert the article with tags in a single statement:
INSERT INTO bh_newarticle (title, body, tag1, tag2, tag3)
VALUES ('$title','$body','$tag1','$tag2','$tag3');
Back to your question
Am I right? The reason that I asked is that I saw people catch the tags as an array and make a comparison. To me it is very slow, and kills the performance, particularly for UPDATE.
Tags are only useful if there are i a limited number of them. If you put a (unique) index on tag.name
looking for a tag will be very fast, even with 10.000 tags. This is because you are looking for an exact match. And if you are really in a hurry you can always make the tag table a memory
table with a hash index
on the name
field.
I doubt you need to worry about slowness in the tag lookup though.
Just make sure you don't allow too many tags per article. 5 seems a good start. 10 would be too many.
Links
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
http://dev.mysql.com/doc/refman/5.0/en/blackhole-storage-engine.html
You can't insert into 3 tables in one single statement but you can run the 3 insert statements in one transaction.
I don't see an issue with declaring the tag column as unique since you want to avoid duplication. You can always check if the tag exists or not before you insert it into the table or better yet, simply upsert
the tags.
You run each INSERT by issuing a single query, there's no "workaround" nor is it even feasible for one to exists. So, 3 inserts for 3 tables.
If you need unique tags, then yes - it's good to use UNIQUE constraint to avoid duplication.
Simple INSERT IGNORE
MySQL feature should do the trick to help you avoid whether the record exists or not before inserting.
there some possibilities depending on DBMS functionality i.e: stored procedures, instead of triggers, may any other to make it possible to insert with one sql statement, but i think it does not worth it, because it not so critical even to insert in all 3 table in one transaction... it is not bad if article is saved but tags are failed to save... But, if needed, stored procedure is the best for this task, because it allows complicated logic, and you also can write subprogram in your program to exec all sql and call it when needed in 1 line...
you can create unique index on tags table on tag field and on rel table on fields (article_id, tag).
精彩评论