I'm working on a pairwise comparison site where a user imports a list of films and grades from another site. My site then picks two random movies and matches them against each other; the user selects the better of the two, and a new pair is loaded. Over time this gives a complete list of movies ordered from best to worst.
The database contains three tables:
fm_film_data - this contains all imported movies
fm_film_data(id int(11),
imdb_id varchar(10),
tmdb_id varchar(10),
title varchar(255),
original_title varchar(255),
year year(4),
director text,
description text,
poster_url varchar(255))
fm_films - this contains all information related to a user, what movies the user has seen, what grades the user has given, as well as information about each film's wins/losses for that user.
fm_films(id int(11),
user_id int(11),
film_id int(11),
grade int(11),
wins int(11),
losses int(11))
fm_log - this contains records of every duel that has occurred.
fm_log(id int(11),
user_id int(11),
winner int(11),
loser int(11))
To pick a pair to show the user, I've created a MySQL query that checks the log and picks, at random, a pair that has not been matched yet.
SELECT pair.id1, pair.id2
FROM
(SELECT part1.id AS id1, part2.id AS id2
FROM fm_films AS part1, fm_films AS part2
WHERE part1.id <> part2.id
AND part1.user_id = [!!USERID!!]
AND part2.user_id = [!!USERID!!])
AS pair
LEFT JOIN
(SELECT winner AS id1, loser AS id2
FROM fm_log
WHERE fm_log.user_id = [!!USERID!!]
UNION
SELECT loser AS id1, winner AS id2
FROM fm_log
WHERE fm_log.user_id = [!!USERID!!])
AS log
ON pair.id1 = log.id1 AND pair.id2 = log.id2
WHERE log.id1 IS NULL
ORDER BY RAND()
LIMIT 1
This query takes some time to load, about 6 seconds in our tests with two users with about 800 grades each.
I'm looking for a way to optimize this but still limit all duels to appear only once.
The server runs MySQL version 5.0.90-community.
I think you are better off creating a stored procedure/function that returns a pair as soon as it finds a valid one.
Make sure there are proper indexes:
fm_films.user_id (try including the film_id also)
fm_log.user_id (try including the winner and loser)
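As a rough sketch, the indexes suggested above could be created like this. The index names are made up for illustration, and whether the extra columns help depends on the actual data, so treat this as a starting point rather than a tuned setup.

```sql
# Hypothetical index names; the columns follow the suggestions above.
ALTER TABLE fm_films ADD INDEX idx_films_user_film (user_id, film_id);
ALTER TABLE fm_log ADD INDEX idx_log_user_winner_loser (user_id, winner, loser);
```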
DELIMITER $$
DROP PROCEDURE IF EXISTS spu_findPair$$
CREATE PROCEDURE spu_findPair
(
IN vUserID INT
)
BEGIN
DECLARE done BOOLEAN DEFAULT FALSE;
DECLARE vLastFilmID INT;
DECLARE vCurFilmID INT;
DECLARE cUserFilms CURSOR FOR
SELECT id
FROM fm_films
WHERE user_id = vUserID
ORDER BY RAND();
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=TRUE;
OPEN cUserFilms;
ufLoop: LOOP
FETCH cUserFilms INTO vCurFilmID;
IF done THEN
CLOSE cUserFilms;
LEAVE ufLoop;
END IF;
IF vLastFilmID IS NOT NULL THEN
IF NOT EXISTS
(
SELECT 1
FROM fm_log
WHERE user_id = vUserID
AND ((winner = vCurFilmID AND loser = vLastFilmID) OR (winner = vLastFilmID AND loser = vCurFilmID))
) THEN
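# Output the pair before leaving; statements placed after LEAVE never run.
SELECT vLastFilmID, vCurFilmID;
CLOSE cUserFilms;
LEAVE ufLoop;
END IF;
END IF;
# Remember this film so the next fetched film is checked against it.
SET vLastFilmID = vCurFilmID;
END LOOP;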
END$$
DELIMITER ;
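Calling the procedure from the application would then look something like the following. The user id 1 is just a placeholder. The intent is that the call yields a single row holding two fm_films.id values for a pair that has no entry in fm_log yet.

```sql
# 1 is a placeholder user id.
CALL spu_findPair(1);
```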
Have you tried applying any indexes to the tables?
The user_id columns would be a good start. The id field that is also used in the WHERE clause would be another index that might be worth adding. Benchmark to make sure the added indexes do result in speedups and do not slow down other code (e.g. insertions).
However, I have found that simple indexes on short tables like these can still result in some huge speed ups when they apply to fields in the WHERE clauses of SELECT and UPDATE statements.
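One way to check whether an index is actually picked up is to run EXPLAIN on the lookups involved. The statements below are only an illustration; the literal ids are placeholders. In the output, the key column names the index MySQL chose, and NULL there means no index was used.

```sql
# Placeholder ids; look at the "key" column of each result.
EXPLAIN
SELECT id
FROM fm_films
WHERE user_id = 1;

EXPLAIN
SELECT 1
FROM fm_log
WHERE user_id = 1
AND winner = 10 AND loser = 20;
```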