So I have a table that looks like this:
A B
A C
B A
C A
C B
I want to delete the lines whose connection between two values is already represented (so A----B is the same connection as B----A). Basically, I want my table to look like this:
A B
A C
B C
How can I do this in Ruby?
-Bobby
EDIT:
Here is my current code:
require 'rubygems'
f = File.new("uniquename.txt","w")
i = IO.readlines('bioportnetwork.txt').collect{|l| l.split.sort}.uniq
i.each do |z|
  f.write(z + "\n")
end
I tried this code, but I think the IO.readlines did not read my columns correctly. Here is one part of my table.
9722,9754 8755
8755 9722,9754
9722,9754 7970,7971
7970,7971 9722,9754
How can I get it read correctly, then saved out correctly as a TSV file?
-Bobby
So, let's say you have loaded your TSV file into an array of pairs:
arr = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "A"], ["C", "B"]]
Hash[arr.map{|pair| [pair.sort, pair]}].values
#=> [["B", "A"], ["C", "A"], ["C", "B"]]
This is OK if the order of the pairs in the original array is not important.
And if the order of the elements within each pair is not important either:
arr.map(&:sort).uniq
#=> [["A", "B"], ["A", "C"], ["B", "C"]]
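If you would rather keep the first occurrence of each connection in its original orientation, a block passed to Array#uniq is one possible alternative. This is only a sketch; the helper name unique_connections is made up for illustration and is not part of the answer above.

```ruby
# Hypothetical helper (not from the original answer): drop reversed duplicates,
# keeping the first occurrence of each connection as it appeared in the input.
def unique_connections(pairs)
  # The block's return value is the comparison key, so ["B", "A"] and
  # ["A", "B"] count as the same connection.
  pairs.uniq { |pair| pair.sort }
end

arr = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "A"], ["C", "B"]]
p unique_connections(arr)
#=> [["A", "B"], ["A", "C"], ["C", "B"]]
```

Unlike arr.map(&:sort).uniq, this leaves the elements inside each surviving pair untouched.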
I'm assuming by 'table' you mean an array-of-arrays similar to this:
x = [['A', 'B'],
['A', 'C'],
['B', 'A'],
['C', 'A'],
['C', 'B']]
If so, you can de-duplicate the list with x.collect{|a| a.sort}.uniq.
Update: To read the data out of the file and into the array, use something like:
lines = IO.readlines('filename.txt')
x = []
lines.each {|l| x << l.split}
Update 2: Or, you can one-line the whole thing:
IO.readlines('test.txt').collect{|l| l.split.sort}.uniq
Update 3:
When writing out to the file, don't pass the array itself to IO#write. It converts its argument to a string automatically, which might be where you are running into your problem. Instead, use IO#puts and build the tab-separated line yourself, where pair is one element of x:
f.puts "#{pair[0]}\t#{pair[1]}"
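Putting the reading, de-duplicating and writing steps together, a minimal sketch could look like the following. It assumes the columns really are separated by tab characters (so a value such as 9722,9754 stays in one piece), and the helper name dedupe_tsv is hypothetical. The sample rows are the ones from the question, written to a temporary directory so the snippet runs on its own.

```ruby
require 'tmpdir'

# Hypothetical helper: read a tab-separated file, drop reversed duplicates,
# and write the remaining pairs back out as TSV. Returns the kept pairs.
def dedupe_tsv(input_path, output_path)
  pairs = File.readlines(input_path)
              .map { |line| line.chomp.split("\t") } # assumption: tab-separated columns
              .reject(&:empty?)                      # skip blank lines
              .map(&:sort)                           # normalise the order within each pair
              .uniq
  File.open(output_path, "w") do |f|
    pairs.each { |pair| f.puts pair.join("\t") }
  end
  pairs
end

Dir.mktmpdir do |dir|
  input  = File.join(dir, "bioportnetwork.txt")
  output = File.join(dir, "uniquename.txt")
  File.write(input, "9722,9754\t8755\n8755\t9722,9754\n" \
                    "9722,9754\t7970,7971\n7970,7971\t9722,9754\n")
  dedupe_tsv(input, output)
  puts File.read(output) # two lines remain, one per distinct connection
end
```

Using the block form of File.open closes the file automatically, so the output is flushed to disk before anything reads it back.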
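Set equality is defined in Ruby, and a Set uses it to decide whether a new member is already present, so you can use a nested set structure to solve this quickly and easily.
require 'set'
set_of_all_sets = Set.new
# file is assumed to be an already opened File (or any IO)
file.each_line do |line|
  # capture both whitespace-separated columns, however long they are
  next unless line =~ /(\S+)\s+(\S+)/
  set_of_all_sets << Set.new([$1, $2])
end
set_of_all_sets.map{|set| set.to_a}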
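The property this approach relies on can be checked directly. The sketch below uses in-memory rows instead of a file, and unique_pair_sets is a hypothetical helper name chosen for illustration.

```ruby
require 'set'

# Two sets with the same members are equal, whatever the insertion order.
puts Set.new(["A", "B"]) == Set.new(["B", "A"]) # prints true

# Hypothetical helper: collect each pair as a Set inside an outer Set,
# so a reversed pair is recognised as already present and is not added again.
def unique_pair_sets(pairs)
  pairs.each_with_object(Set.new) { |pair, seen| seen << Set.new(pair) }
end

rows = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "A"], ["C", "B"]]
p unique_pair_sets(rows).map { |set| set.to_a.sort }
#=> [["A", "B"], ["A", "C"], ["B", "C"]]
```

One thing to keep in mind: a row whose two values are identical collapses to a one-element set, so it comes back as a single value rather than a pair.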