I am trying to check a title string against an array of strings and see which ones from the array match.
My code works fine but I am not sure if it is the most efficient way to do this.
The important thing is that the strings in the array do not have to match a phrase in the title exactly. They can be in any order as long as every word is in the title. Any help would be great.
Example: title = "Apple Iphone 4 Verizon"
array = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
I need it to return ["iphone apple", "verizon iphone", "iphone 4"]. The words in the strings "verizon iphone" and "iphone apple" are all in the title; the order does not matter.
results = []
# Loop through all the pids to see if they are found in the title
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t iphone"]
title = "Apple Iphone 4 White Verizon"
all_pids.each do |pid|
  match = []
  split_id = pid.downcase.split(' ')
  split_id.each do |name|
    in_title = title.downcase.include?(name)
    if in_title == true
      match << name
    end
  end
  final = match.join(" ")
  if final.strip == pid.strip
    results << pid
  end
end
print results
When I run this, it prints what I need: ["iphone white 4", "iphone verizon"]
You could do something like the following:
>> require 'set'
=> true
>> title = "Apple Iphone 4 Verizon"
=> "Apple Iphone 4 Verizon"
>> all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
=> ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
>> title_set = Set.new(title.downcase.split)
=> #<Set: {"apple", "iphone", "4", "verizon"}>
>> all_pids.select { |pid| Set.new(pid.downcase.split).subset? title_set }
=> ["iphone apple", "verizon iphone", "iphone 4"]
You can do something very similar with array differences, but sets might be faster since they are implemented as hashes.
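As a rough sketch of that comparison, both approaches can be wrapped in small methods and timed with Ruby's standard Benchmark module. The method names `match_with_sets` and `match_with_difference` are made up for this illustration, and the iteration count is arbitrary. Timings depend heavily on the size of the data, so treat any result as a hint rather than a rule.

```ruby
require 'set'
require 'benchmark'

# Hypothetical helper: keep pids whose words form a subset of the title's words.
def match_with_sets(title, pids)
  title_set = Set.new(title.downcase.split)
  pids.select { |pid| Set.new(pid.downcase.split).subset?(title_set) }
end

# Hypothetical helper: same idea, using Array#- (array difference).
def match_with_difference(title, pids)
  title_words = title.downcase.split
  pids.select { |pid| (pid.downcase.split - title_words).empty? }
end

title = "Apple Iphone 4 Verizon"
all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]

set_result = match_with_sets(title, all_pids)
diff_result = match_with_difference(title, all_pids)

# Time both versions over the same input; 10_000 is an arbitrary repeat count.
Benchmark.bm(12) do |bm|
  bm.report("sets:")       { 10_000.times { match_with_sets(title, all_pids) } }
  bm.report("difference:") { 10_000.times { match_with_difference(title, all_pids) } }
end
```

With word lists this short, the cost of building a Set per pid may well outweigh any lookup advantage, so it is worth measuring on realistic data before choosing.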
It looks to me like you want to find the strings whose words all appear among the words of the title.
Array#- performs set difference operations: [2] - [1,2,3] = [] and [1,2,3] - [2] = [1,3].
title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t iphone"]
set_of_strings_in_title = title.downcase.split
all_pids.find_all do |pid|
  set_of_strings_not_in_title = pid.downcase.split - set_of_strings_in_title
  set_of_strings_not_in_title.empty?
end
EDIT: Changed #find to #find_all to return all matches, not just the first.
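One difference from the loop in the question is worth noting: String#include? tests for a substring, while splitting the title and comparing word lists tests for whole words. The small sketch below illustrates this with a made-up title (not one from the question); the variable names are only for illustration.

```ruby
# Made-up example data: the title contains "3gs", the pid asks for "3g".
title = "Apple Iphone 3gs White"
pid = "iphone 3g"

title_down = title.downcase

# Substring check, as in the question's loop: "3g" is found inside "3gs".
substring_match = pid.downcase.split.all? { |word| title_down.include?(word) }

# Whole-word check via Array#-: "3g" is not one of the title's words.
word_match = (pid.downcase.split - title_down.split).empty?

puts "substring match: #{substring_match}"
puts "whole-word match: #{word_match}"
```

Here the substring check accepts the pid and the whole-word check rejects it, so which behavior is right depends on whether partial words should count as matches.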