I have an array of hashes like so:
[
  {
    :lname => "Brown",
    :email => "james@intuit.com",
    :fname => "James"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  },
  {
    :lname => "Smith",
    :email => "brad@intuit.com",
    :fname => "Brad"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  },
  {
    :lname => "Smith",
    :email => "brad@intuit.com",
    :fname => "Brad"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  }
]
I would like to learn how to remove a record if it is a duplicate. For example, there are several records with the email "brad@intuit.com". How can I keep one of them and remove all the others with that email, treating email as the key rather than the other fields?
In Ruby 1.9.2 and later, Array#uniq accepts a block parameter, which it uses when comparing your objects:
arrays.uniq { |h| h[:email] }
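As a minimal sketch of the block form, assuming records shaped like the ones in the question (the names `records` and `unique` are only illustrative). Note that uniq keeps the first record it sees for each email, so here the all-nil "brad" record survives, which may or may not be what you want:

```ruby
# Sample records shaped like the ones in the question (illustrative data).
records = [
  { :lname => "Brown", :email => "james@intuit.com", :fname => "James" },
  { :lname => nil,     :email => "brad@intuit.com",  :fname => nil },
  { :lname => "Smith", :email => "brad@intuit.com",  :fname => "Brad" }
]

# uniq compares the block's return value and keeps the first record per email.
unique = records.uniq { |h| h[:email] }

unique.each { |h| puts h.inspect }
```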
I know this is an old thread, but Rails (through ActiveSupport) adds a method to Enumerable called index_by, which can be handy in this case:
list = [
  {
    :lname => "Brown",
    :email => "james@intuit.com",
    :fname => "James"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  },
  {
    :lname => "Smith",
    :email => "brad@intuit.com",
    :fname => "Brad"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  },
  {
    :lname => "Smith",
    :email => "brad@intuit.com",
    :fname => "Brad"
  },
  {
    :lname => nil,
    :email => "brad@intuit.com",
    :fname => nil
  }
]
Now you can get the unique rows as follows:
list.index_by {|r| r[:email]}.values
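If Rails is not available, roughly the same result can be had in plain Ruby. The sketch below only approximates what index_by does and is not ActiveSupport's implementation; the helper name `index_by_key` is hypothetical. As with index_by, a later record with the same key replaces an earlier one, so the last record per email wins:

```ruby
# Hypothetical helper: build a hash keyed by the block's result.
# A later row with the same key overwrites the earlier one.
def index_by_key(rows)
  rows.each_with_object({}) { |row, index| index[yield(row)] = row }
end

# Illustrative data shaped like the records in the question.
rows = [
  { :lname => "Brown", :email => "james@intuit.com", :fname => "James" },
  { :lname => "Smith", :email => "brad@intuit.com",  :fname => "Brad" },
  { :lname => nil,     :email => "brad@intuit.com",  :fname => nil }
]

unique_rows = index_by_key(rows) { |r| r[:email] }.values
unique_rows.each { |r| puts r.inspect }
```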
To merge the rows with the same email:
list.group_by{|r| r[:email]}.map do |k, v|
  v.inject({}) { |r, h| r.merge(h){ |key, o, n| o || n } }
end
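To see what the merge block does, here is a small sketch on illustrative data (the names `rows` and `merged` are my own). When a key exists in both hashes, `old || new` keeps the existing value unless it is nil or false, so nil fields get filled in from later rows:

```ruby
# Illustrative data: the first "brad" row has nil names, the second has them.
rows = [
  { :lname => nil,     :email => "brad@intuit.com",  :fname => nil },
  { :lname => "Smith", :email => "brad@intuit.com",  :fname => "Brad" },
  { :lname => "Brown", :email => "james@intuit.com", :fname => "James" }
]

merged = rows.group_by { |r| r[:email] }.map do |_email, group|
  # Fold each group into one hash; on a key clash keep the first non-nil value.
  group.inject({}) { |acc, h| acc.merge(h) { |_key, old, new| old || new } }
end

merged.each { |r| puts r.inspect }
```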
Custom but efficient method:
list.inject({}) do |r, h|
  (r[h[:email]] ||= {}).merge!(h){ |key, old, new| old || new }
  r # return the accumulator so the next iteration receives it
end.values
If you're putting this directly into the database, just use validates_uniqueness_of :email in your model. See the documentation for this.
If you need to remove them from the actual array before it is used, then do:
emails = [] # This is a temporary array, not your results. The results are still in my_array
my_array.delete_if do |item|
  if emails.include? item[:email]
    true # email already seen, so delete this item
  else
    emails << item[:email]
    false # first occurrence, so keep it
  end
end
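On a long list, `emails.include?` rescans the temporary array for every item. A variant of the same idea, sketched here with Set from the standard library, avoids that: Set#add? returns nil when the value is already present, so it can double as the duplicate test. The name `seen` and the sample data are illustrative:

```ruby
require 'set'

# Illustrative data shaped like the records in the question.
my_array = [
  { :lname => "Brown", :email => "james@intuit.com", :fname => "James" },
  { :lname => "Smith", :email => "brad@intuit.com",  :fname => "Brad" },
  { :lname => nil,     :email => "brad@intuit.com",  :fname => nil }
]

seen = Set.new
# add? returns nil if the email was already in the set, so delete those items.
my_array.delete_if { |item| seen.add?(item[:email]).nil? }

my_array.each { |item| puts item.inspect }
```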
This will merge the contents of duplicate entries:
merged_list = {}
my_array.each do |item|
  if merged_list.has_key? item[:email]
    # Values from later items overwrite earlier ones, including nil values
    merged_list[item[:email]].merge! item
  else
    merged_list[item[:email]] = item
  end
end
my_array = merged_list.collect { |k, v| v }
Ok, this (delete duplicates) is what you asked for:
a.sort_by { |e| e[:email] }.inject([]) { |m,e| m.last.nil? ? [e] : m.last[:email] == e[:email] ? m : m << e }
But I think this (merge values) is what you want:
a.sort_by { |e| e[:email] }.inject([]) { |m,e| m.last.nil? ? [e] : m.last[:email] == e[:email] ? (m.last.merge!(e) { |k,o,n| o || n }; m) : m << e }
Perhaps I'm stretching the one-liner idea a bit unreasonably, so with different formatting and a test case:
Aiko:so ross$ cat mergedups
require 'pp'
a = [{:fname=>"James", :lname=>"Brown", :email=>"james@intuit.com"},
{:fname=>nil, :lname=>nil, :email=>"brad@intuit.com"},
{:fname=>"Brad", :lname=>"Smith", :email=>"brad@intuit.com"},
{:fname=>nil, :lname=>nil, :email=>"brad@intuit.com"},
{:fname=>"Brad", :lname=>"Smith", :email=>"brad@intuit.com"},
{:fname=>"Brad", :lname=>"Smith", :email=>"brad@intuit.com"}]
pp(
  a.sort_by { |e| e[:email] }.inject([]) do |m,e|
    m.last.nil? ? [e] :
      m.last[:email] == e[:email] ? (m.last.merge!(e) { |k,o,n| o || n }; m) :
        m << e
  end
)
Aiko:so ross$ ruby mergedups
[{:email=>"brad@intuit.com", :fname=>"Brad", :lname=>"Smith"},
{:email=>"james@intuit.com", :fname=>"James", :lname=>"Brown"}]