I have a table that looks like this:
CREATE TEMPORARY TABLE MainList (
`pTime` int(10) unsigned NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MEMORY;
+------------+-------------+
| pTime | STD |
+------------+-------------+
| 1106080500 | -0.5058072 |
| 1106081100 | -0.82790455 |
| 1106081400 | -0.59226294 |
| 1106081700 | -0.99998194 |
| 1106540100 | -0.86649279 |
| 1107194700 | 1.51340543 |
| 1107305700 | 0.96225296 |
| 1107306300 | 0.53937716 |
+------------+-------------+ .. etc
pTime is my primary key.
I want to make a query that, for every row in my table, will find the first pTime where STD has a flipped sign and is further away from 0 than STD of the above table. (For simplicity's sake, just imagine that I am looking for 0-STD)
Here is an example of the output I want:
+------------+-------------+------------+-------------+
| pTime | STD | pTime_Oppo | STD_Oppo |
+------------+-------------+------------+-------------+
| 1106080500 | -0.5058072 | 1106090400 | 0.57510881 |
| 1106081100 | -0.82790455 | 1106091300 | 0.85599817 |
| 1106081400 | -0.59226294 | 1106091300 | 0.85599817 |
| 1106081700 | -0.99998194 | 1106091600 | 1.0660959 |
+------------+-------------+------------+-------------+
I can't seem to get it right! I tried the following:
SELECT DISTINCT
MainList.pTime,
MainList.STD,
b34d1.pTime,
b34d1.STD
FROM
MainList
JOIN b34d1 ON(
b34d1.pTime > MainList.pTime
AND(
(
MainList.STD > 0
AND b34d1.STD <= 0 - MainList.STD
)
OR(
MainList.STD < 0
AND b34d1.STD >= 0 - MainList.STD
)
)
);
That code just freezes my server up.
P.S Table b34d1 is just like MainList, except it contains much more elements:
mysql> select STD, Slope from b31d1 limit 10;
+-------------+--------------+
| STD | Slope |
+-------------+--------------+
| -0.44922675 | -5.2016129 |
| -0.11892021 | -8.15249267 |
| 0.62574686 | -10.19794721 |
| 1.10469057 | -12.43768328 |
| 1.52917352 | -13.08651026 |
| 1.61803899 | -13.2441349 |
| 1.82686555 | -12.04912023 |
| 2.07480736 | -11.22067449 |
| 2.45529961 | -7.84090909 |
| 1.86468335 | -6.26466276 |
+-------------+--------------+
mysql> select count(*) from b31d1;
+----------+
| count(*) |
+----------+
| 439340 |
+----------+
1 row in set (0.00 sec)
In fact MainList is just a filtered version of b34d1 that uses the MEMORY engine
mysql> show create table b34d1;
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| Table | Create Table
|
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| b34d1 | CREATE TABLE `b34d1` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=339331 MAX_ROWS=539331 PACK_KEYS=1 ROW_FORMAT=FIXED |
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
Edit: I just did a little experiment and I am very confused by the results:
SELECT DISTINCT
b34d1.pTime,
b34d1.STD,
Anti.p开发者_如何学CTime,
Anti.STD
FROM
b34d1
LEFT JOIN b34d1 As Anti ON(
Anti.pTime > b34d1.pTime
AND(
(
b34d1.STD > 0
AND b34d1.STD <= 0 - Anti.STD
)
OR(
b34d1.STD < 0
AND b34d1.STD >= 0 - Anti.STD
)
)
) limit 10;
+------------+-------------+------------+------------+
| pTime | STD | pTime | STD |
+------------+-------------+------------+------------+
| 1104537600 | -0.70381962 | 1104539100 | 0.73473692 |
| 1104537600 | -0.70381962 | 1104714000 | 1.46733274 |
| 1104537600 | -0.70381962 | 1104714300 | 2.02097356 |
| 1104537600 | -0.70381962 | 1104714600 | 2.60642099 |
| 1104537600 | -0.70381962 | 1104714900 | 2.01006557 |
| 1104537600 | -0.70381962 | 1104715200 | 1.97724189 |
| 1104537600 | -0.70381962 | 1104715500 | 1.85683704 |
| 1104537600 | -0.70381962 | 1104715800 | 1.2754127 |
| 1104537600 | -0.70381962 | 1104716100 | 0.87900156 |
| 1104537600 | -0.70381962 | 1104716400 | 0.72957739 |
+------------+-------------+------------+------------+
Why are all the values under the first pTime the same?
Selecting other fields from a row having some aggregate statistic (such as a minimum or maximum value) is a little messy in SQL. Such queries aren't so simple. You typically need an extra join or a subquery. For example:
SELECT m.pTime, m.STD, m2.pTime AS pTime_Oppo, m2.STD AS STD_Oppo
FROM MainList AS m
JOIN
(SELECT m1.pTime, MIN(m2.pTime) AS pTime_Oppo
FROM MainList AS m1
JOIN MainList AS m2
ON m1.pTime < m2.pTime AND SIGN(m1.STD) != SIGN(m2.STD)
WHERE ABS(m1.STD) <= ABS(m2.std)
GROUP BY m1.pTime
) AS oppo ON m.pTime = oppo.pTime
JOIN MainList AS m2 ON oppo.pTime_Oppo = m2.pTime
;
Using the sample data:
INSERT INTO MainList (`pTime`, `STD`)
VALUES
(1106080500, -0.5058072),
(1106081100, -0.82790455),
(1106081400, -0.59226294),
(1106081700, -0.99998194),
(1106090400, 0.57510881),
(1106091300, 0.85599817),
(1106091600, 1.0660959),
(1106540100, -0.86649279),
(1107194700, 1.51340543),
(1107305700, 0.96225296),
(1107306300, 0.53937716),
;
The results are:
+------------+-------------+------------+-------------+ | pTime | STD | pTime_Oppo | STD_Oppo | +------------+-------------+------------+-------------+ | 1106080500 | -0.5058072 | 1106090400 | 0.57510881 | | 1106081100 | -0.82790455 | 1106091300 | 0.85599817 | | 1106081400 | -0.59226294 | 1106091300 | 0.85599817 | | 1106081700 | -0.99998194 | 1106091600 | 1.0660959 | | 1106090400 | 0.57510881 | 1106540100 | -0.86649279 | | 1106091300 | 0.85599817 | 1106540100 | -0.86649279 | | 1106540100 | -0.86649279 | 1107194700 | 1.51340543 | +------------+-------------+------------+-------------+
Any solution based on functions like ABS or SIGN or anything similar required to check sign is doomed to be ineffective on big sets of data, because it makes indexing impossible.
You are creating a temporary table inside a SP so you can alter it schema without losing anything, adding a column that stores sign of STD and storing STD itself unsigned will give you HUGE performance boost, because you can simply find first bigger pTime and bigger STD with a different sign and all conditions can use indices in a query like this (STD_positive keeps STD's sign):
SELECT * from mainlist m
LEFT JOIN mainlist mu
ON mu.pTime = ( SELECT md.pTime FROM mainlist md
WHERE m.pTime < md.pTime
AND m.STD < md.STD
AND m.STD_positive <> md.STD_positive
ORDER BY md.pTime
LIMIT 1 )
LEFT JOIN is needed here to return rows that dont have bigger STD. If you don't need them use simple JOIN. This query should run fine even on lots of records, with proper indices based on careful checking of EXPLAIN output, starting with an index on STD.
SELECT
m.pTime,
m.STD,
mo.pTime AS pTime_Oppo,
-mo.STD AS STD_Oppo
FROM MainList m
INNER JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo ON m.STD > 0 AND mo.STD > m.STD
OR m.STD < 0 AND mo.STD < m.STD
LEFT JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo2 ON mo.STD > 0 AND mo2.STD > m.STD AND mo.STD > mo2.STD
OR mo.STD < 0 AND mo2.STD < m.STD AND mo.STD < mo2.STD
WHERE mo2.pTime IS NULL
精彩评论