I have a rake task that needs to iterate through a large number of records (called Merchants) which each have a large number of associated items. My problem is that due to Rails automatically caching the results of my DB queries, I end up putting my workers into swap space before very long.
In short, I'm wondering how to run a command like:
Merchant.all.each { |m| items = m.items }
without caching the value of 'items' each time through.
I've tried:
Merchant.all.each do |m|
  ActiveRecord::Base.connection.uncached do
    items = m.items
  end
end
and I've also tried adding this to my Merchant model:
def items_uncached
  self.class.uncached { items }
end
and then calling items_uncached instead, but I still end up racking up the memory usage with each new set of items I access.
I'm running Rails 2.3.10, Ruby 1.9.2 and using Mysql for storage.
Thanks in advance for your thoughts!
*** edit:
Here's the actual bit of code I'm working on:
File.open(output, "w") do |f|
  Merchant.all.each do |m|
    items = m.items
    invalid_image_count = 0
    items.each do |i|
      invalid_image_count += 1 unless i.image_valid?
    end
    invalid_categories = items.select { |i| !i.categories_valid? }.count
    f.puts "#{m.name} (#{m.id}): #{invalid_image_count} invalid images, " +
      "#{invalid_categories} invalid categories"
  end
end
Trying to do some error checking and then logging the results.
The query cache is not the main problem here. Rails "caches" your objects anyway.
The query cache is simply a "hash lookup" that prevents Rails from hitting the DB unnecessarily. It does not control how Ruby (or Rails) internally stores the objects returned by associations.
For example try this (even if uncached):
m = Merchant.first # <- m is loaded from DB
m.items # <- items are loaded from DB and STORED(!) in m
m.items # <- items are returned from the association stored in m
m.items.reload # <- hits the DB (or the query cache)
m.instance_variable_get("@items") # <- returns the actual stored items
So now when you do m.items in your each loop, you simply populate all the Merchant instances with all their items, and the garbage collector is unable to free anything, since all the objects are referenced from the array returned by Merchant.all while you are inside the loop.
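The effect can be reproduced without Rails. The following is only a minimal sketch: FakeMerchant, load_items and items_loaded? are hypothetical stand-ins, not ActiveRecord APIs, and the ||= reader merely imitates the way an association stores its result on the owning object.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not the real Rails API.
class FakeMerchant
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Imitates an association reader: the first call "loads" the items and
  # stores them on the instance, later calls return the stored array.
  def items
    @items ||= FakeMerchant.load_items(id)
  end

  # Stands in for a finder that builds a fresh array on every call.
  def self.load_items(merchant_id)
    Array.new(3) { |n| "item-#{merchant_id}-#{n}" }
  end

  # True once the reader has stored an items array on this instance.
  def items_loaded?
    instance_variable_defined?(:@items)
  end
end

merchants = Array.new(4) { |n| FakeMerchant.new(n + 1) }

# Going through the reader leaves every merchant holding its items, and
# the merchants array keeps all of them reachable.
merchants.each { |m| m.items.size }
retained = merchants.select { |m| m.items_loaded? }.size

fresh = Array.new(4) { |n| FakeMerchant.new(n + 1) }

# Calling the finder directly stores nothing on the merchants, so each
# items array is unreferenced once the block returns.
fresh.each { |m| FakeMerchant.load_items(m.id).size }
retained_fresh = fresh.select { |m| m.items_loaded? }.size

puts "via reader: #{retained} of #{merchants.size} merchants hold items"
puts "via finder: #{retained_fresh} of #{fresh.size} merchants hold items"
```

With the reader, every element of the array ends up holding its own items array, so memory grows with each iteration. With the direct finder, nothing accumulates on the merchants.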
So the solution is to do as Victor proposes, which avoids populating the "association storage" in the first place.
If your association is a simple has_many one, you can try this:
Merchant.all.each do |m|
  items = Item.find_all_by_merchant_id(m.id)
  ...
end
Or even:
Merchant.find(:all, :select => "id, name").each do |m|
  items = Item.find_all_by_merchant_id(m.id)
  ...
end
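Applied to the report from the question, the same idea might look roughly like the sketch below. It runs without Rails, so everything in it is an assumption made for illustration: FakeMerchant, FakeItem, ALL_ITEMS, items_for and write_report are hypothetical, and items_for only plays the role that Item.find_all_by_merchant_id plays above. Both counts are gathered in a single pass so the items array is walked once.

```ruby
require "stringio"

# Hypothetical stand-ins for the Merchant and Item models in the question.
FakeMerchant = Struct.new(:id, :name)
FakeItem = Struct.new(:merchant_id, :image_ok, :categories_ok) do
  def image_valid?
    image_ok
  end

  def categories_valid?
    categories_ok
  end
end

# Made-up sample data, only there to make the sketch runnable.
ALL_ITEMS = [
  FakeItem.new(1, true, false),
  FakeItem.new(1, false, false),
  FakeItem.new(2, true, true)
]

# Plays the role of Item.find_all_by_merchant_id: returns a new array on
# each call and stores nothing on the merchant.
def items_for(merchant_id)
  ALL_ITEMS.select { |i| i.merchant_id == merchant_id }
end

def write_report(merchants, out)
  merchants.each do |m|
    invalid_images = 0
    invalid_categories = 0
    # Single pass over the items; the array is unreferenced afterwards.
    items_for(m.id).each do |i|
      invalid_images += 1 unless i.image_valid?
      invalid_categories += 1 unless i.categories_valid?
    end
    out.puts "#{m.name} (#{m.id}): #{invalid_images} invalid images, " \
             "#{invalid_categories} invalid categories"
  end
end

report = StringIO.new
write_report([FakeMerchant.new(1, "Acme"), FakeMerchant.new(2, "Bolt")], report)
puts report.string
```

In the real task, out would be the file handle from File.open and the merchants would come from one of the finders shown above.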