So, I've found several examples for finding a random record in Rails 2 -- the preferred method seems to be:
Thing.find :first, :offset => rand(Thing.count)
Being something of a newbie I'm not sure how this could be constructed using the new find syntax in Rails 3.
So, what's the "Rails 3 Way" to find a random record?
Thing.first(:order => "RANDOM()") # For MySQL use :order => "RAND()" - thanks, @DanSingerman
# Rails 3
Thing.order("RANDOM()").first
or
Thing.first(:offset => rand(Thing.count))
# Rails 3
Thing.offset(rand(Thing.count)).first
Actually, in Rails 3 all of these examples will work. However, ordering by RANDOM() is quite slow for big tables, even though it is more SQL-style.
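If you want the ORDER BY approach but need it to work on both MySQL and PostgreSQL/SQLite, a small helper could pick the right function name from the adapter. This is only a sketch of mine; the random_order name is not part of Rails:

class Thing < ActiveRecord::Base
  # Sketch: RAND() for MySQL, RANDOM() for PostgreSQL/SQLite.
  # random_order is a made-up helper name, not a Rails method.
  def self.random_order
    fn = connection.adapter_name =~ /mysql/i ? "RAND()" : "RANDOM()"
    order(fn)
  end
end

Thing.random_order.first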
UPD. You can use the following trick on an indexed column (PostgreSQL syntax):
select *
from my_table
where id >= trunc(
random() * (select max(id) from my_table) + 1
)
order by id
limit 1;
I am working on a project (Rails 3.0.15, Ruby 1.9.3-p125-perf) where the DB is on localhost and the users table has a bit more than 100K records.
Using ORDER BY RAND() is quite slow.
User.order("RAND(id)").first
becomes
SELECT `users`.* FROM `users` ORDER BY RAND(id) LIMIT 1
and takes from 8 to 12 seconds to respond!!
Rails log:
User Load (11030.8ms)  SELECT `users`.* FROM `users` ORDER BY RAND() LIMIT 1
From MySQL's EXPLAIN:
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 110165 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
You can see that no index is used (possible_keys = NULL), a temporary table is created and an extra pass is required to fetch the desired value (extra = Using temporary; Using filesort).
On the other hand, by splitting the query in two parts and using Ruby, we have a reasonable improvement in response time.
users = User.scoped.select(:id);nil
User.find( users.first( Random.rand( users.length )).last )
(;nil for console use)
Rails log:
User Load (25.2ms)  SELECT id FROM `users`
User Load (0.2ms)  SELECT `users`.* FROM `users` WHERE `users`.`id` = 106854 LIMIT 1
and MySQL's EXPLAIN proves why:
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
| 1 | SIMPLE | users | index | NULL | index_users_on_user_type | 2 | NULL | 110165 | Using index |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | users | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
We can now use only indexes and the primary key, and do the job about 500 times faster!
UPDATE:
As pointed out by icantbecool in the comments, the above solution has a flaw if there are deleted records in the table.
A workaround for that can be
users_count = User.count
User.scoped.limit(1).offset(rand(users_count)).first
which translates to two queries
SELECT COUNT(*) FROM `users`
SELECT `users`.* FROM `users` LIMIT 1 OFFSET 148794
and runs in about 500ms.
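On Rails 3.1+ the same two-step idea can also be written with pluck, which is likewise robust to gaps in the ids. This is my variant, not part of the original answer:

# Sketch (Rails 3.1+): fetch ids in one query, pick one in Ruby, load it.
ids = User.pluck(:id)   # SELECT id FROM users
User.find(ids.sample)   # SELECT users.* FROM users WHERE id = ... LIMIT 1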
If using Postgres
User.limit(5).order("RANDOM()")
If using MySQL
User.limit(5).order("RAND()")
In both instances you're selecting 5 records randomly from the users table. Here is the actual SQL query displayed in the console.
SELECT * FROM users ORDER BY RANDOM() LIMIT 5
I made a rails 3 gem for doing this that performs better on large tables and allows you to chain relations and scopes:
https://github.com/spilliton/randumb
(edit): The default behavior of my gem basically uses the same approach as above now, but you have the option to use the old way if you want :)
Many of the answers posted actually won't perform well on rather large tables (1+ million rows). Random ordering quickly takes a few seconds, and doing a count on the table also takes quite a while.
A solution that works well for me in this situation is to use RANDOM()
with a where condition:
Thing.where('RANDOM() >= 0.9').take
On a table with over a million rows, this query generally takes less than 2ms.
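One caveat worth adding (my note, not the answerer's): on a small table the random filter may match no rows and take returns nil, so a fallback helps:

# Sketch: fall back to a plain take if the random filter misses.
def random_thing
  Thing.where('RANDOM() >= 0.9').take || Thing.take
end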
Here we go, the Rails way:
# in your initializer
module ActiveRecord
  class Base
    def self.random
      if (c = count) != 0
        find(:first, :offset => rand(c))
      end
    end
  end
end
usage
Model.random #returns single random object
Or, as a second thought:
module ActiveRecord
  class Base
    def self.random
      order("RAND()")
    end
  end
end
usage:
Model.random #returns shuffled collection
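Since random here returns a relation, it chains with other scopes and query methods. A usage sketch of mine (the active column is purely illustrative):

Model.where(active: true).random.limit(5)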
This was very useful to me, however I needed a bit more flexibility, so this is what I did:
Case 1: Finding one random record (source: Trevor Turk's site)
Add this to the Thing.rb model
def self.random
  ids = connection.select_all("SELECT id FROM things")
  find(ids[rand(ids.length)]["id"].to_i) unless ids.blank?
end
then in your controller you can call something like this
@thing = Thing.random
Case 2: Finding multiple random records with no repeats (source: can't remember)
I needed to find 10 random records with no repeats, so this is what I found worked:
In your controller:
thing_ids = Thing.find( :all, :select => 'id' ).map( &:id )
@things = Thing.find( (1..10).map { thing_ids.delete_at( thing_ids.size * rand ) } )
This will find 10 random records; however, it is worth mentioning that if the database is particularly large (millions of records), this would not be ideal and performance will be hampered. It will perform well up to a few thousand records, which was sufficient for me.
The Ruby method for randomly picking an item from a list is sample. Wanting to create an efficient sample for ActiveRecord, and based on the previous answers, I used:
module ActiveRecord
  class Base
    def self.sample
      offset(rand(count)).first
    end
  end
end
I put this in lib/ext/sample.rb and then load it with this in config/initializers/monkey_patches.rb:
Dir[Rails.root.join('lib/ext/*.rb')].each { |file| require file }
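With that loaded, any model picks up the method. Usage sketch:

User.sample   # a random User, fetched via a random OFFSET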
Works in Rails 5 and is DB agnostic:
Put this in your controller:
@quotes = Quote.offset(rand(Quote.count - 3)).limit(3)
You can, of course, put this in a concern as shown here.
app/models/concerns/randomable.rb
module Randomable
  extend ActiveSupport::Concern

  class_methods do
    def random(the_count = 1)
      records = offset(rand(count - the_count)).limit(the_count)
      the_count == 1 ? records.first : records
    end
  end
end
then...
app/models/book.rb
class Book < ActiveRecord::Base
  include Randomable
end
Then you can use it simply by doing:
Book.random
or
Book.random(3)
You can use Ruby's sample() on the records returned by ActiveRecord.
E.g.
def get_random_things_for_home_page
  find(:all).sample(5)
end
Source: http://thinkingeek.com/2011/07/04/easily-select-random-records-rails/
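find(:all) is the old Rails 2/3 syntax; in Rails 4 and later the same idea (my adaptation, not from the linked post) would be:

# Sketch: load all records and sample in Ruby. Only suitable for small tables.
def self.get_random_things_for_home_page
  all.to_a.sample(5)
end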
If using Oracle
User.limit(10).order("DBMS_RANDOM.VALUE")
Output:
SELECT * FROM (SELECT * FROM users ORDER BY DBMS_RANDOM.VALUE) WHERE ROWNUM <= 10
Strongly recommend this gem for random records; it is specially designed for tables with lots of data rows:
https://github.com/haopingfan/quick_random_records
All other answers perform badly with a large database, except this gem:
- quick_random_records only costs 4.6ms in total.
- the accepted answer User.order('RAND()').limit(10) costs 733.0ms.
- the offset approach costs 245.4ms in total.
- the User.all.sample(10) approach costs 573.4ms.
Note: My table only has 120,000 users. The more records you have, the larger the performance difference will be.
UPDATE: performance on a table with 550,000 rows
- Model.where(id: Model.pluck(:id).sample(10)) costs 1384.0ms
- the gem quick_random_records only costs 6.4ms in total
Random sorting by the RDBMS can be quite expensive if there are millions of records. To avoid this, you can limit the set of records to be sorted in the following way (PostgreSQL syntax):
module ActiveRecord
  class Base
    def self.sample
      where(
        "id >= TRUNC(RANDOM() * (SELECT MAX(id) FROM #{table_name}) + 1)"
      ).order(:id).first
    end
  end
end
And then call User.sample.
This will only be properly random if the ids are uniformly distributed (i.e. there are no large gaps).
A very easy way to get multiple random records from the table. This makes 2 cheap queries.
Model.where(id: Model.pluck(:id).sample(3))
You can change the "3" to the number of random records you want.
I just ran into this issue developing a small application where I wanted to select a random question from my DB. I used:
@question1 = Question.where(:lesson_id => params[:lesson_id]).shuffle[1]
And it's working well for me. I can't speak to the performance for larger DBs, since this is just a small application.
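As a side note of mine (not the poster's), Array#sample does the same pick a little more directly:

@question1 = Question.where(:lesson_id => params[:lesson_id]).to_a.sample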