I have a challenge which I can't find answers to through any normal means. I want to do as much as possible in a single query because my only other alternative is to script this function and foreach loop it with countless sub-queries.
Ok, so I have a logging table on an MySQL server, it has the following columns: Key, UserID, datetime and position Position can be "start", "login" and "end".
I want to find out the difference between "start", "login" and "end" for each login event. Anyone have an idea of an SQL query I could use? I think it has something to do with a query within a query but I can only find JOINs?
I postulate that I would first SELECT the "start" events and then for each "start" event it should look for the next first login event for the "UserID" (after the time of the "start" but less than the time of the next "start" for the same "UserID") then DATE_SUB() to find out the difference. It would be important to filter out "login" or "end" events that happen after a future "start" event in case开发者_如何学Python either didn't happen.
I would be very grateful if anyone could save me doing this in Excel because then I can avoid feeling too dirty with thousands of vlookups. I'm contemplating doing this in PHP or Perl and just creating a loop with LIMIT 1 SQL statements, but that could create thousands of SQL queries compared to a single decent SQL SELECT statement.
I'm not sure about MySQL. In MS Sql Server I'd try something like the following:
SELECT Key, UserID, DateTime, Position,
(DateTime - (SELECT MAX(DateTime) FROM <table> t1
WHERE t1.UserID=UserID AND t1.DateTime < DateTime)) AS DiffToLastEvent
FROM <table>
I have not actually tried this, but that's what I'd start with. This should select all columns from table <table>
with an additional field that contains the time difference of the current item to the youngest item that was created for the user before the current item.
Here's the basic approach for if you just have start and end dates. I don't quite understand how the "login" position fits in, so you'll need to explain more clearly what output you want, or fiddle the query to your satisfaction. Also, the query is in SQL server as I don't have MySQL installed - sorry!
CREATE TABLE LogTable
(
id int IDENTITY,
userId nvarchar(20),
logDate datetime,
position nvarchar(5)
)
GO
INSERT INTO LogTable (userId, logDate, position) VALUES ('User1', '2001-01-01', 'start')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User1', '2001-01-02', 'end')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User1', '2001-01-05', 'start')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User1', '2001-01-08', 'end')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User2', '2001-01-01', 'start')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User2', '2001-01-03', 'end')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User2', '2001-01-06', 'start')
INSERT INTO LogTable (userId, logDate, position) VALUES ('User2', '2001-01-07', 'end')
GO
SELECT log1.userId, log1.logDate AS [start], log2.logDate AS [end],
DATEDIFF(d, log1.logDate, log2.logDate) AS [Diff. in Days]
FROM LogTable log1 JOIN
LogTable log2 ON log1.userId = log2.userId
WHERE log1.position = 'start'
AND log2.position = 'end'
AND log2.logDate > log1.logDate
AND NOT EXISTS (
SELECT *
FROM LogTable logDateCheck
WHERE logDateCheck.userId = log1.userId
AND logDateCheck.logDate >= log1.logDate
AND logDateCheck.logDate <= log2.logDate
AND logDateCheck.id NOT IN (log1.id, log2.id))
GO
Output is:
精彩评论