Looking for something that can go through the relationships defined in models and can check the DB for orphaned records/broken links between tables.
(for the latest version of the script below, see https://gist.github.com/KieranP/3849777)
The problem with Martin's script is that it uses ActiveRecord to first pull records, then find the associations, then fetch the associations, which generates a ton of SQL calls for each association. That's not bad for a small app, but when you have multiple tables with 100k records, each with 5+ belongs_to associations, it can take well over 10 minutes to complete.
The following script uses SQL instead and looks for orphaned belongs_to associations for all models in app/models within a Rails app. It handles simple belongs_to, belongs_to using :class_name, and polymorphic belongs_to calls. On the production data I was using, it dropped the runtime of a slightly modified version of Martin's script from 9 minutes to just 8 seconds, and it found all the same issues as before.
Enjoy :-)
task :orphaned_check => :environment do
  Dir[Rails.root.join('app/models/*.rb').to_s].each do |filename|
    klass = File.basename(filename, '.rb').camelize.constantize
    next unless klass.ancestors.include?(ActiveRecord::Base)

    orphans = Hash.new
    klass.reflect_on_all_associations(:belongs_to).each do |belongs_to|
      assoc_name, field_name = belongs_to.name.to_s, belongs_to.foreign_key.to_s

      if belongs_to.options[:polymorphic]
        # Ask the reflection for the type column instead of deriving it from the key name.
        foreign_type_field = belongs_to.foreign_type.to_s
        foreign_types = klass.unscoped.select("DISTINCT(#{foreign_type_field})")
        # Drop NULL types so that sort and constantize do not fail on nil.
        foreign_types = foreign_types.collect { |r| r.send(foreign_type_field) }.compact

        foreign_types.sort.each do |foreign_type|
          related_sql = foreign_type.constantize.unscoped.select(:id).to_sql
          finder = klass.unscoped.select(:id).where("#{foreign_type_field} = '#{foreign_type}'")
          finder.where("#{field_name} NOT IN (#{related_sql})").each do |orphan|
            orphans[orphan] ||= Array.new
            orphans[orphan] << [assoc_name, field_name]
          end
        end
      else
        class_name = (belongs_to.options[:class_name] || assoc_name).classify
        related_sql = class_name.constantize.unscoped.select(:id).to_sql
        finder = klass.unscoped.select(:id)
        finder.where("#{field_name} NOT IN (#{related_sql})").each do |orphan|
          orphans[orphan] ||= Array.new
          orphans[orphan] << [assoc_name, field_name]
        end
      end
    end

    # Report every broken association, ordered by record id and association name.
    orphans.sort_by { |record, data| record.id }.each do |record, data|
      data.sort_by(&:first).each do |assoc_name, field_name|
        puts "#{record.class.name}##{record.id} #{field_name} is present, but #{assoc_name} doesn't exist"
      end
    end
  end
end
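The script above detects orphans with NOT IN (subquery). For comparison, here is a minimal sketch of the same check written as a LEFT JOIN anti-join, which is a different technique from the one the script uses. The orphan_sql helper and the table and column names are hypothetical, and the identifiers are interpolated unquoted, so this assumes trusted names only.

```ruby
# Hypothetical helper, not part of the script above: builds SQL that finds
# child rows whose foreign key is set but matches no row in the parent table.
# Identifiers are interpolated as-is, so pass trusted names only.
def orphan_sql(child_table:, foreign_key:, parent_table:, parent_key: "id")
  <<~SQL.strip
    SELECT c.id
    FROM #{child_table} c
    LEFT JOIN #{parent_table} p ON p.#{parent_key} = c.#{foreign_key}
    WHERE c.#{foreign_key} IS NOT NULL
      AND p.#{parent_key} IS NULL
  SQL
end

# Example with assumed table names.
puts orphan_sql(child_table: "products", foreign_key: "category_id", parent_table: "categories")
```

Rows with a NULL foreign key are excluded explicitly here; with NOT IN they are excluded implicitly, because NULL NOT IN (...) never evaluates to true.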
This might depend on what action you want to take with the orphans. Perhaps you just want to delete them? That would be easily solved with a couple of SQL queries.
I had the same task and, using the current finders, ended up with something along the lines of:
Product.where.not(category_id: Category.pluck("id")).delete_all
to get rid of all Products that have since lost their Category.
You can create a Rake task to search for and handle orphaned records, for example:
namespace :db do
  desc "Handle orphans"
  task :handle_orphans => :environment do
    # Load every model file so that ActiveRecord::Base.descendants is complete.
    Dir[Rails.root + "app/models/**/*.rb"].each do |path|
      require path
    end
    ActiveRecord::Base.descendants.each do |model|
      model.reflections.each do |association_name, reflection|
        if reflection.macro == :belongs_to
          model.all.each do |model_instance|
            # foreign_key replaces primary_key_name, which was deprecated in Rails 3.1.
            unless model_instance.send(reflection.foreign_key).blank?
              if model_instance.send(association_name).nil?
                print "#{model.name} with id #{model_instance.id} has an invalid reference, would you like to handle it? [y/n]: "
                case STDIN.gets.strip
                when "y", "Y"
                  # handle it
                end
              end
            end
          end
        end
      end
    end
  end
end
Let’s say you have an application where a User can subscribe to a Magazine. With ActiveRecord associations, it would look something like this:
# app/models/subscription.rb
class Subscription < ActiveRecord::Base
  belongs_to :magazine
  belongs_to :user
end
# app/models/user.rb
class User < ActiveRecord::Base
  has_many :subscriptions
  has_many :magazines, through: :subscriptions
end
# app/models/magazine.rb
class Magazine < ActiveRecord::Base
  has_many :subscriptions
  has_many :users, through: :subscriptions
end
Unfortunately, someone forgot to add dependent: :destroy to the has_many :subscriptions. When a user or magazine was deleted, an orphaned subscription was left behind.
This issue was fixed by adding dependent: :destroy, but a large number of orphaned records were still lingering around. There are two ways to remove them.
Approach 1 — Bad Smell
Subscription.find_each do |subscription|
  if subscription.magazine.nil? || subscription.user.nil?
    subscription.destroy
  end
end
This loads every subscription and runs up to two extra SQL queries per record (one per association) to check whether it is orphaned, then destroys it if it is.
Approach 2 — Good Smell
Subscription.where([
  "user_id NOT IN (?) OR magazine_id NOT IN (?)",
  User.pluck("id"),
  Magazine.pluck("id")
]).destroy_all
This approach first gets the IDs of all Users and Magazines, and then executes one query to find all Subscriptions that reference a missing User or a missing Magazine.
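To make the selection rule of Approach 2 concrete, here is a plain-Ruby model of which rows the condition matches. It is only a sketch: no database is involved, the Subscription struct and the sample IDs are made up for illustration, and it assumes both ID lists are non-empty, since the SQL generated for an empty list can behave differently.

```ruby
require "set"

# Plain-Ruby stand-in for a subscriptions row; no database is involved.
Subscription = Struct.new(:id, :user_id, :magazine_id)

# Returns the subscriptions that "user_id NOT IN (...) OR magazine_id NOT IN (...)"
# would match. A nil (NULL) foreign key never counts as missing, which mirrors
# how SQL evaluates NULL NOT IN (...).
def orphaned_subscriptions(subscriptions, user_ids, magazine_ids)
  users = user_ids.to_set
  magazines = magazine_ids.to_set
  missing = ->(key, ids) { !key.nil? && !ids.include?(key) }
  subscriptions.select do |s|
    missing.call(s.user_id, users) || missing.call(s.magazine_id, magazines)
  end
end

# Made-up sample data: only user 10 and magazine 20 still exist.
subscriptions = [
  Subscription.new(1, 10, 20),  # both parents exist
  Subscription.new(2, 11, 20),  # user 11 is gone
  Subscription.new(3, 10, 99),  # magazine 99 is gone
  Subscription.new(4, nil, 20)  # no user set, magazine exists
]

orphans = orphaned_subscriptions(subscriptions, [10], [20])
p orphans.map(&:id) # prints [2, 3]
```

Subscription 4 is not reported because its user_id is empty rather than dangling; if such rows should also be cleaned up, they need a separate IS NULL check.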
KieranP's answer was a big help for me but his script does not handle namespaced classes. I added a few lines to do so, whilst ignoring the concerns directory. I also added an optional DELETE=true command line arg if you want to nuke all orphaned records.
namespace :db do
  desc "Find orphaned records. Set DELETE=true to delete any discovered orphans."
  task :find_orphans => :environment do
    found = false
    # Only the literal value "true" enables deletion, so DELETE=false stays safe.
    delete_orphans = ENV['DELETE'] == 'true'
    model_base = Rails.root.join('app/models')

    Dir[model_base.join('**/*.rb').to_s].each do |filename|
      # get namespaces based on dir name
      namespaces = (File.dirname(filename)[model_base.to_s.size+1..-1] || '').split('/').map{|d| d.camelize}.join('::')
      # skip concerns folder
      next if namespaces == "Concerns"
      # get class name based on filename and namespaces
      class_name = File.basename(filename, '.rb').camelize
      klass = "#{namespaces}::#{class_name}".constantize
      next unless klass.ancestors.include?(ActiveRecord::Base)

      orphans = Hash.new
      klass.reflect_on_all_associations(:belongs_to).each do |belongs_to|
        assoc_name, field_name = belongs_to.name.to_s, belongs_to.foreign_key.to_s

        if belongs_to.options[:polymorphic]
          # Ask the reflection for the type column instead of deriving it from the key name.
          foreign_type_field = belongs_to.foreign_type.to_s
          foreign_types = klass.unscoped.select("DISTINCT(#{foreign_type_field})")
          # Drop NULL types so that sort and constantize do not fail on nil.
          foreign_types = foreign_types.collect { |r| r.send(foreign_type_field) }.compact

          foreign_types.sort.each do |foreign_type|
            related_sql = foreign_type.constantize.unscoped.select(:id).to_sql
            finder = klass.unscoped.where("#{foreign_type_field} = '#{foreign_type}'")
            finder.where("#{field_name} NOT IN (#{related_sql})").each do |orphan|
              orphans[orphan] ||= Array.new
              orphans[orphan] << [assoc_name, field_name]
            end
          end
        else
          class_name = (belongs_to.options[:class_name] || assoc_name).classify
          related_sql = class_name.constantize.unscoped.select(:id).to_sql
          finder = klass.unscoped
          finder.where("#{field_name} NOT IN (#{related_sql})").each do |orphan|
            orphans[orphan] ||= Array.new
            orphans[orphan] << [assoc_name, field_name]
          end
        end
      end

      orphans.sort_by { |record, data| record.id }.each do |record, data|
        found = true
        data.sort_by(&:first).each do |assoc_name, field_name|
          puts "#{record.class.name}##{record.id} #{field_name} is present, but #{assoc_name} doesn't exist" + (delete_orphans ? ' -- deleting' : '')
        end
        # Delete once per record, even if several of its associations are broken.
        record.delete if delete_orphans
      end
    end
    puts "No orphans found" unless found
  end
end
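The namespace handling in the task above boils down to turning a file path under app/models into a constant name. Here is a minimal, Rails-free sketch of that mapping. Both helpers are hypothetical: simple_camelize is a simplified stand-in for ActiveSupport's camelize, and unlike the task this version also skips concerns directories nested inside a namespace.

```ruby
require "pathname"

# Simplified stand-in for ActiveSupport's String#camelize; it only handles
# plain snake_case names such as "audit_log".
def simple_camelize(name)
  name.split("_").map(&:capitalize).join
end

# Hypothetical helper: maps a model file path to the constant name it is
# expected to define, or nil for files under any "concerns" directory.
def model_constant_name(model_base, filename)
  relative = Pathname.new(filename).relative_path_from(Pathname.new(model_base))
  parts = relative.to_s.sub(/\.rb\z/, "").split("/")
  return nil if parts.include?("concerns")
  parts.map { |part| simple_camelize(part) }.join("::")
end

# Example with an assumed application layout.
base = "/srv/app/app/models"
puts model_constant_name(base, "#{base}/admin/audit_log.rb") # prints Admin::AuditLog
```

In a real task the resulting string would still go through constantize, and files that do not define an ActiveRecord model would be skipped as before.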
I have created a gem called OrphanRecords. It provides rake tasks to show or delete orphaned records. It does not currently support HABTM associations; if you are interested, please feel free to contribute :)
I've written a method to do just this in my gem PolyBelongsTo
You can find all orphaned records by calling the pbt_orphans method on any ActiveRecord model.
Gemfile
gem 'poly_belongs_to'
Code example
User.pbt_orphans
# => #<ActiveRecord::Relation []> # nil for objects without belongs_to
Story.pbt_orphans
# => #<ActiveRecord::Relation []> # nil for objects without belongs_to
All orphaned records are returned.
If you just want to check if a single record is orphaned you can do it with the :orphan? method.
User.first.orphan?
Story.find(5).orphan?
Works for both polymorphic relations and non-polymorphic relations.
As a bonus if you want to find polymorphic records with invalid types you can do the following:
Story.pbt_mistyped
Returns an Array of the Story records whose polymorphic type is not a valid ActiveRecord model name, for example records with types like ["Object", "Class", "Storyable"].