I have an art site. I need to show related art pieces now. I have 3 tables:
art_info : art_id, title
art_tag_relationship : art_id, tag_id
art_tags : tag_id, tag (in text)
I'm trying to get the related art for a given piece of art sorted by most relevant tag matches.
So now lets say I have 5 art pieces with tags
art #1 tags: red, blue, green, yellow
art #2 tags: red, orange, purple, black, yellow, blue art #3 tags: red art #4 tags: blue, green art #5 tags: white, brownSo now I want to get the most related art pieces to art #1 in order of most matching to least matching. I'm looking to get results like this
Related art to #1:
- Best match = art #2 (beca开发者_Go百科use it matched 3 tags)
- 2nd match = art #4 (because it matched 2 tags)
- 3rd match = art #3 (because it matched 1 tag)
- Do not display art #5 because no matches were found.
I was thinking of using a foreach statement once I get the tags for art#1, but that seems inefficient.
Here is the query I used to get the tags for art #1
SELECT art_info.art_id, art_info.title
FROM art_info
INNER JOIN art_tag_relationship ON art_info.art_id = art_tag_relationship.art_id
WHERE art_tag_relationship.art_id = '1'
So now once I have the 4 tags from art #1, how do I get the most art that has the most related tags?
Thank you for your time and use of brain.
EDIT: Concept seems to be, get the tags for art #1 through art_tag_relationship -> art_tags, then get art_id from art_tag_relationships where tag_id's are the same from the found tags for art_id #1 in art_tag_relationships.
You can select the most relevant art piece, given another art_id.
SELECT
ai.art_id,
ai.title
count(DISTINCT r2.tag_id) as relevance
FROM art_tag_relationship r1
INNER JOIN art_tag_relationship r2 ON (r1.tag_id = r2.tag_id
AND r1.art_id <> r2.art_id)
INNER JOIN art_info ai ON (r2.art_id = ai.art_id)
WHERE r1.art_id = '1' -- this is the art_id results should be related to.
GROUP BY ai.art_id
ORDER BY relevance DESC
Given that the piece of art you are looking at is art_id = 1000
SELECT
art_info.art_id,
art_info.title
count(*) as Cnt
FROM
art_tag_relationship A1,
art_tag_relationship A2,
art_info
WHERE
A1.art_id = 1000
AND
A1.tag_id = A2.tag_id
AND
A2.art_id = art_info.art_id
GROUP BY
art_info.art_id
ORDER BY
Cnt DESC
(untested)
In theory, it will start with all of the tags of the art that you are looking at. Then it will expand to match all related tags for other art pieces. Then it will group this by the resulting art_id, and the number of records condensed into each group should equal the number of tags that matched, which is then sorted on.
Play around with the concept and update your post once you have a working query.
精彩评论