Prevent Rails from caching results of ActiveRecord query

Developer https://www.devze.com 2023-03-26 05:46 Source: Internet
I have a rake task that needs to iterate through a large number of records (called Merchants) which each have a large number of associated items. My problem is that due to Rails automatically caching the results of my DB queries, I end up putting my workers into swap space before very long.

In short, I'm wondering how to run a command like:

Merchant.all.each { |m| items = m.items }

without caching the value of 'items' each time through.

I've tried:

Merchant.all.each do |m|
  ActiveRecord::Base.connection.uncached do
    items = m.items
  end
end

and I've also tried adding this to my Merchant model:

def items_uncached
  self.class.uncached { items }
end

and then calling items_uncached instead, but I still end up racking up the memory usage with each new set of items I access.

I'm running Rails 2.3.10 and Ruby 1.9.2, and using MySQL for storage.

Thanks in advance for your thoughts!

*** edit:

Here's the actual bit of code I'm working on:

File.open(output, "w") do |f|
  Merchant.all.each do |m|
    items = m.items
    invalid_image_count = 0
    items.each do |i|
      invalid_image_count += 1 unless i.image_valid?
    end
    invalid_categories = items.select { |i| !i.categories_valid? }.count
    f.puts "#{m.name} (#{m.id}): #{invalid_image_count} invalid images, " +
            "#{invalid_categories} invalid categories"
  end
end

Trying to do some error checking and then logging the results.


The query cache is not the main problem here. Rails "caches" your objects anyway.

The query cache is simply a "hash lookup" that prevents Rails from hitting the DB unnecessarily. It does not control how Ruby (or Rails) internally stores the objects returned by associations.

For example, try this (even with the query cache disabled):

m = Merchant.first # <- m is loaded from DB
m.items           # <- items are loaded from DB and STORED(!) in m
m.items           # <- items are returned from the association stored in m
m.items.reload    # <- hits the DB (or the query cache)
m.instance_variable_get("@items") # <- returns the actual stored items

So when you call m.items in your each loop, you simply populate every Merchant instance with all of its items, and the garbage collector is unable to free anything, because all of those objects are still referenced from the array returned by Merchant.all while you are inside the loop.
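
The retention effect can be reproduced without Rails. The following is a minimal plain-Ruby sketch, not ActiveRecord itself: FakeMerchant and its memoized items method are hypothetical stand-ins that only mimic how an association keeps its loaded records in an instance variable of the owning object.

```ruby
# Hypothetical stand-in for an ActiveRecord model; this is not Rails code.
class FakeMerchant
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Mimics an association: the first call "loads" the records and stores
  # them on the merchant; later calls return the same stored array.
  def items
    @items ||= Array.new(3) { |n| "item-#{id}-#{n}" }
  end

  # True once the items have been loaded and stored on this object.
  def items_loaded?
    instance_variable_defined?(:@items)
  end
end

# Plays the role of the array returned by Merchant.all.
merchants = Array.new(4) { |n| FakeMerchant.new(n + 1) }

loaded_before = merchants.select { |m| m.items_loaded? }.size

# Same shape as the loop in the question.
merchants.each { |m| items = m.items }

# Every merchant in the array now holds a reference to its items, so none
# of them can be collected while `merchants` itself is still referenced.
loaded_after = merchants.select { |m| m.items_loaded? }.size
items_held = merchants.inject(0) { |sum, m| sum + m.items.size }

puts "loaded before: #{loaded_before}, after: #{loaded_after}, items held: #{items_held}"
```

With real models the stored arrays contain full Item records instead of short strings, which is why memory keeps growing with every merchant visited.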

So the solution is to do as Victor proposes: load the items through Item directly, which prevents the "association storage" from being triggered.


If your association is a simple has_many one you can try this:

Merchant.all.each do |m| 
  items = Item.find_all_by_merchant_id(m.id) 
  ...
end 

Or even:

Merchant.find(:all, :select => "id, name").each do |m| 
  items = Item.find_all_by_merchant_id(m.id) 
  ... 
end
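
Applied to the report from the question, the same idea might look like the sketch below. It is plain Ruby so that it runs without a database: ReportMerchant, ReportItem and ITEMS_BY_MERCHANT are hypothetical stand-ins for the real models and table, and the find_items lambda takes the place of Item.find_all_by_merchant_id(m.id). Because the stand-in items live in a constant hash, the memory behaviour itself is not reproduced here; the point is only that each merchant's items are held in a local variable and never stored on the merchant.

```ruby
require "stringio"

# Hypothetical stand-ins for the Merchant and Item models.
ReportMerchant = Struct.new(:id, :name)

ReportItem = Struct.new(:image_ok, :categories_ok) do
  def image_valid?
    image_ok
  end

  def categories_valid?
    categories_ok
  end
end

# Hypothetical data standing in for the items table.
ITEMS_BY_MERCHANT = {
  1 => [ReportItem.new(true, false), ReportItem.new(false, false)],
  2 => [ReportItem.new(true, true)]
}

# Stand-in for Item.find_all_by_merchant_id(merchant_id).
find_items = lambda { |merchant_id| ITEMS_BY_MERCHANT.fetch(merchant_id, []) }

def write_report(out, merchants, find_items)
  merchants.each do |m|
    # Local variable only: nothing is stored on the merchant object, so
    # the items become unreachable once this iteration finishes.
    items = find_items.call(m.id)
    invalid_images = items.count { |i| !i.image_valid? }
    invalid_categories = items.count { |i| !i.categories_valid? }
    out.puts "#{m.name} (#{m.id}): #{invalid_images} invalid images, " +
             "#{invalid_categories} invalid categories"
  end
end

merchants = [ReportMerchant.new(1, "Acme"), ReportMerchant.new(2, "Globex")]
report = StringIO.new
write_report(report, merchants, find_items)
puts report.string
```

In the real rake task, out would be the File opened in the question and merchants would come from Merchant.find(:all, :select => "id, name").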