I'm looking to check whether any of a list of integers fall in a list of ranges. The ranges are defined in a table defined something like:
# Extra Type Field Default Null Key
0 int(11) rangeid 0 NO PRI
1 int(11) max 0 NO MUL
2 int(11) min 0 NO MUL
Using MySQL 5.1 and Perl 5.10.
I can check whether a single value, say 7, is in any of the ranges with a statement like
SELECT 1
FROM range
WHERE 7 BETWEEN min AND max
If 7 is in any of those ranges, I get a single row back. If it isn't, no rows are returned.
Now I have a list of, say, 50 of these values, not stored in a table at present. I assemble them using map
:
my $value_list = '('
. ( join ', ', map { int $_ } @values )
. ')'
;
I want to see if any of the items in the list fall inside of any of the ranges, but am not particularly concerned with which number nor which range. I'd like to use a syntax such as:
SELECT 1
FROM range
WHERE (1, 2, 3, 4, 5, 6, 7, 42, 309, 10000) BETWEEN min AND max
MySQL kindly chastises me for such syntax:
Operand should contain 1 column(s)
I pinged #mysql who were quite helpful. However, having already written this up by the time they responded and thinking it'd be helpful to fix the answer in a more permanent med开发者_运维问答ium, I figured I'd post the question anyhow. Maybe SO will provide a different solution?
This sounded like an interesting problem. I created a test range table like so:
CREATE TABLE `test_ranges` (
`rangeid` int(11) NOT NULL,
`max` int(11) NOT NULL,
`min` int(11) NOT NULL,
PRIMARY KEY (`rangeid`),
KEY `idx_minmax` (`min`,`max`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I inserted 50,000 rows into that table, each a range with max-min=10, like so:
mysql> select * from test_ranges limit 2;
+---------+-----+-----+
| rangeid | max | min |
+---------+-----+-----+
| 1 | 15 | 5 |
| 2 | 20 | 10 |
+---------+-----+-----+
2 rows in set (0.00 sec)
My perl code to get the ranges that match a list of integers is to create a temporary table to hold the integers, and ask MySQL to do the matching for me:
$DB->do_sql("CREATE TEMPORARY TABLE test_vals ( val int NOT NULL ) ENGINE=InnoDB");
for (12, 345, 394, 1450, 999, 9999, 99999, 999999 ) {
$DB->do_sql("INSERT INTO test_vals VALUES (?)", $_);
}
$answer = $DB->do_sql("SELECT DISTINCT * from test_vals, test_ranges WHERE val BETWEEN min AND max");
That returns me the correct list. In the mysql client that would look like:
mysql> SELECT DISTINCT * from test_vals, test_ranges WHERE val BETWEEN min AND max;
+-------+---------+--------+-------+
| val | rangeid | max | min |
+-------+---------+--------+-------+
| 12 | 1 | 15 | 5 |
| 12 | 2 | 20 | 10 |
| 345 | 67 | 345 | 335 |
| 345 | 68 | 350 | 340 |
| 345 | 69 | 355 | 345 |
| 394 | 77 | 395 | 385 |
| 394 | 78 | 400 | 390 |
| 1450 | 288 | 1450 | 1440 |
| 1450 | 289 | 1455 | 1445 |
| 1450 | 290 | 1460 | 1450 |
| 999 | 198 | 1000 | 990 |
| 999 | 199 | 1005 | 995 |
| 9999 | 1998 | 10000 | 9990 |
| 9999 | 1999 | 10005 | 9995 |
| 99999 | 19998 | 100000 | 99990 |
| 99999 | 19999 | 100005 | 99995 |
+-------+---------+--------+-------+
16 rows in set (0.00 sec)
Or, for just the list of matching values:
mysql> SELECT DISTINCT val from test_vals, test_ranges WHERE val BETWEEN min AND max;
+-------+
| val |
+-------+
| 12 |
| 345 |
| 394 |
| 999 |
| 1450 |
| 9999 |
| 99999 |
+-------+
7 rows in set (0.00 sec)
MySQL (at least 5.0, which I'm on) claims via EXPLAIN that it does not use an index for the comparison in a normal way. However, it reports "Range checked for each record" which essentially means it does what you'd think: treats the values from the test_vals
table as constants and looks them up in the test_ranges
table using the index idx_minmax
.
mysql> explain SELECT DISTINCT * from test_vals, test_ranges WHERE val BETWEEN min AND max \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: test_vals
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 8
Extra: Using temporary
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: test_ranges
type: ALL
possible_keys: idx_minmax
key: NULL
key_len: NULL
ref: NULL
rows: 48519
Extra: Range checked for each record (index map: 0x2)
2 rows in set (0.00 sec)
It's quite fast, but I don't know how many more rows you'll have than the 8 and 50K that I tested with. My guess is that creating a temporary table like this would be the optimal solution if you have more than a small handful of values you're looking up.
You can construct an SQL query in Perl that will work with multiple values as follows:
sub check_range {
'SELECT 1 FROM range WHERE ' .
join ' OR ' =>
map "($_ BETWEEN min AND max)" => @_
}
print check_range( 1, 2, 3, 4, 5, 6, 7, 42, 309, 10000 ), "\n";
> SELECT 1 FROM range WHERE (1 BETWEEN min AND max) OR (2 BETWEEN min AND max)
> OR (3 BETWEEN min AND max) OR (4 BETWEEN min AND max) ...
To be perfectly honest, if the list being checked is in single-digits size, i'd either loop through checking one-by-one in Perl (the check being your query), or if you are worried about connection/query start overhead, populate them into a temp table and loop over it in the SQL loop, pulling out 1 cvalue at a time into a variable, deleting that value from temp table and running - again - your own one-check query on that variable, inside the loop.
Here's Sybase code - hopefully it translates to MySQL easily
-- previously, CREATE TABLE #your_temp_table (num int)
CREATE TABLE #in_range (num int)
DECLARE @seven int -- This is a JOKE! NEVER use a variable name like that!!!
WHILE (exists (select 1 from #your_temp_table))
BEGIN
SELECT @seven = min(num) from #your_temp_table
DELETE #your_temp_table WHERE num = @seven
INSERT #in_range
SELECT @seven
FROM range
WHERE @seven BETWEEN min AND max
END
SELECT num from #in_range
DROP TABLE #in_range
I have a feeling this could be done a lot more elegantly but this at least works in the abscence of a better solution :)
精彩评论