optimize this ruby code, switch arrays to sets/hash?_问答_开发者

I need to optimize this code. Any suggestions to make it go faster, please tell me. I don't have a specific amount that I want it to go faster, any suggestion would be helpful. 开发者_如何学编程In terms of complexity I want to keep it below O(n^2)

I'm wondering if trying to convert the array that I'm using into like a set or hash because that is quicker right? How much faster in terms of complexity might this allow me to run?

The main problem I think might be my use of the ruby combination function which runs pretty slow, does anyone know exactly the complexity for this ruby function? is there a faster alternative to this?

the point of this code is basically to find the single point that is the shortest combined distance from all the other points ie (the friends house that is most convenient for everyone to go to). there is a little extra code here which has some debugging/printing functions.

class Point
    attr_accessor :x, :y, :distance, :done, :count

    def initialize(x,y)
        @x = x
        @y = y
        @distance = 0
        @closestPoint = []
        @done = false
        @count = 0
    end
end

class Edge
    attr_accessor :edge1, :edge2, :weight

    def initialize(edge1,edge2,weight)
      @edge1 = edge1
      @edge2 = edge2
      @weight = weight
    end
end

class AdjacencyList
    attr_accessor :name, :minSumList, :current

    def initialize(name)
      @name = name
      @minSumList = []
      @current = nil
      @vList = []
      @edgeList = []
    end

    def addVertex(vertex)
      @vList.push(vertex)
    end

    def generateEdges2
      minSumNode = nil
      current = nil
      last = nil

      @vList.combination(2) { |vertex1, vertex2|
        distance = distance2points(vertex1,vertex2)
        edge = Edge.new(vertex1,vertex2,distance)

        if (current == nil)
          current = vertex1
          minSumNode = vertex1
        end

        vertex1.distance += distance
        vertex2.distance += distance

        vertex1.count += 1
        vertex2.count += 1

        if (vertex1.count == @vList.length-1)
          vertex1.done = true
        elsif (vertex2.count == @vList.length-1)
           vertex2.done = true
        end

        if ((vertex1.distance < minSumNode.distance) && (vertex1.done == true))            
          minSumNode = vertex1
        end

        #@edgeList.push(edge)
       }

        return minSumNode.distance
    end

    def generateEdges
      @vList.combination(2) { |vertex1, vertex2| 
        distance = distance2points(vertex1,vertex2)
        @edgeList.push(Edge.new(vertex1,vertex2,distance))
      }
    end

    def printEdges
      @edgeList.each {|edge| puts "(#{edge.edge1.x},#{edge.edge1.y}) <=> (#{edge.edge2.x},#{edge.edge2.y}) weight: #{edge.weight}"}
    end

    def printDistances
      @vList.each {|v| puts "(#{v.x},#{v.y} distance = #{v.distance})"}
    end
end

def distance2points(point1,point2)
     xdistance = (point1.x - point2.x).abs
     ydistance = (point1.y - point2.y).abs

     total_raw = xdistance + ydistance
     return totaldistance = total_raw - [xdistance,ydistance].min
end

#pointtest1 = Point.new(0,1)
#pointtest2 = Point.new(2,5)
#pointtest3 = Point.new(3,1)
#pointtest4 = Point.new(4,0)

graph = AdjacencyList.new("graph1")

gets
while (line = gets)
    graph.addVertex(Point.new(line.split[0].to_i,line.split[1].to_i))
end

#graph.addVertex(pointtest1)
#graph.addVertex(pointtest2)
#graph.addVertex(pointtest3)
#graph.addVertex(pointtest4)

puts graph.generateEdges2
#graph.printEdges
#graph.printDistances

Try to do this, and then post some more code:

ruby -rprofile your_script your_args

This will run the script under the profiler, and generate a nice table with results. If you post that here, it's more likely to get better help. Plus, you will have a more exact idea of what's consuming your CPU cycles.

Sets are basically hashes, and the advantage of hashes over arrays is O(1) find operations. Since you are simply iterating over the entire array, hashes will not offer any speed improvements if you simply replace the arrays with hashes.

Your real problem is that the running time of your algorithm is O(n^2), as in given a set of n points it will have to perform n^2 operations since you're matching every point with every other possible point.

This can be somewhat improved using hashes to cache values. For example, lets say you want the distance between point "a" and point "b". You could have a hash @distances which stores @distances["a,b"] = 52 (of course you'll have to be smart about what to use as the key). Basically just try to remove redundant operations wherever you can.

That said, the largest speed boost would be from a smarter algorithm, but I can't think of something applicable off the top of my head right now.

There's something many people know, and it won't cost you anything.

While you're trying to guess how to make the code faster, or scouring the internet for some kind of profiler, just run the program under the debugger and interrupt it while it's being slow.

Do it several times, and each time take careful note of what it's doing and why.

Here's an example in python.

The slower it is, the more obvious the problem will be.