Can't use a list of methods in a Python class, it breaks deepcopy. Workaround?

Developer https://www.devze.com 2023-02-23 04:36 Source: Internet
I'm trying to learn more about Python by implementing a k-Nearest Neighbor (KNN) classifier. KNN labels a new data point according to the existing data it is most similar to. So for a given table of data, you find the 3 most similar points (if k = 3) and pick whichever label is most frequent among them. There are different ways to determine "similarity", each using some kind of distance function, so you can implement several of them (cosine distance, Manhattan, Euclidean, etc.) and pick whichever you want.
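For concreteness, the idea can be sketched roughly as follows. This is a minimal illustration, not the DataTable class from this question: the names `euclidean`, `manhattan`, and `knn_label`, and the toy data, are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two equal-length points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_label(rows, labels, query, k=3, distance=euclidean):
    # Rank the known rows by distance to the query and keep the k closest.
    nearest = sorted(range(len(rows)),
                     key=lambda i: distance(rows[i], query))[:k]
    # Majority vote among the labels of those k rows.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two "b" points near the origin, three "g" points near (1, 1).
rows = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
labels = ["b", "b", "g", "g", "g"]
predicted = knn_label(rows, labels, (1.0, 0.9), k=3, distance=manhattan)
```

Passing the distance function as an argument is the simplest form of the "swap it in and out" idea; the class I show further down tries to do the same thing with methods.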

I'm trying to make something that lets me swap distance functions in and out easily without writing a chain of if/else cases, and the solution I have so far is to store a list of references to the methods. This works well, but it breaks on deepcopy. I want to figure out how to either fix my implementation or find a compromise between avoiding the cases and getting deepcopy to work.

Here's my pared-down class:

class DataTable:

    def __init__(self, filename,TrueSymbol,FalseSymbol):
        self.t = self.parseCSV(filename)
        self.TrueSymbol = TrueSymbol
        self.FalseSymbol = FalseSymbol
        # This is the problem line of code
        self.distList = [self.euclideanDistance,
                            self.manhattanDistance,
                            self.cosineDistance]

    def nearestNeighbors(self,entry,k=None,distanceMetric=None):
        """
        distanceMetrics you can choose from:
        0 = euclideanDistance
        1 = manhattanDistance
        2 = cosineDistance
        """
        if distanceMetric is None:
            distanceFunction = self.euclideanDistance
        else:
            distanceFunction = self.distList[distanceMetric]
        # etc..

    def euclideanDistance(self,entry):
        pass
    def manhattanDistance(self,entry):
        pass
    def cosineDistance(self,entry):
        pass

    # open up that csv
    def parseCSV(self,filename):
        pass

And here's the code that calls it:

import deepcopytestDS
import copy

data = deepcopytestDS.DataTable("ionosphere.data","g","b")
deepCopy = copy.deepcopy(data) # crash.

Here's the traceback:

>>> deepCopy = copy.deepcopy(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python2.6/copy.py", line 292, in _deepcopy_inst
    state = deepcopy(state, memo)
  File "/usr/lib/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python2.6/copy.py", line 255, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python2.6/copy.py", line 228, in _deepcopy_list
    y.append(deepcopy(a, memo))
  File "/usr/lib/python2.6/copy.py", line 189, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python2.6/copy.py", line 323, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python2.6/copy_reg.py", line 93, in __newobj__
    return cls.__new__(cls, *args)
TypeError: instancemethod expected at least 2 arguments, got 0

What does this crash mean, and is there some way to make deepcopy work without getting rid of my shortcut for swapping distance functions?


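It was a bug in the copy module: deepcopy did not know how to copy bound methods (the instancemethod objects named in the TypeError), which is exactly what distList holds. See http://bugs.python.org/issue1515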

You can put this at the top of your file to make it work:

import copy
import types

def _deepcopy_method(x, memo):
    return type(x)(x.im_func, copy.deepcopy(x.im_self, memo), x.im_class)
copy._deepcopy_dispatch[types.MethodType] = _deepcopy_method
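If you would rather not patch the copy module, another option is to store the method names instead of the bound methods and look them up with getattr when needed. The following is only a sketch under that assumption: `DIST_NAMES` and `distanceFunction` are hypothetical names, the constructor takes rows directly instead of parsing a CSV, and the distance bodies are simple stand-ins for the ones elided in the question.

```python
import copy


class DataTable(object):
    # Method names, in the same order as the indices in the question's docstring.
    # Plain strings on the class hold no reference to any instance.
    DIST_NAMES = ("euclideanDistance", "manhattanDistance", "cosineDistance")

    def __init__(self, rows):
        self.t = rows  # stand-in for self.parseCSV(filename)

    def distanceFunction(self, distanceMetric=None):
        # Resolve the bound method at call time instead of storing it.
        index = 0 if distanceMetric is None else distanceMetric
        return getattr(self, self.DIST_NAMES[index])

    def euclideanDistance(self, entry):
        return [sum((a - b) ** 2 for a, b in zip(row, entry)) ** 0.5
                for row in self.t]

    def manhattanDistance(self, entry):
        return [sum(abs(a - b) for a, b in zip(row, entry))
                for row in self.t]

    def cosineDistance(self, entry):
        def norm(v):
            return sum(x * x for x in v) ** 0.5
        # Assumes no all-zero vectors (their norm would be 0).
        return [1 - sum(a * b for a, b in zip(row, entry)) / (norm(row) * norm(entry))
                for row in self.t]


data = DataTable([[3.0, 4.0], [1.0, 0.0]])
clone = copy.deepcopy(data)  # no bound methods in the instance state
```

Because the instance dictionary now holds only ordinary data, deepcopy never has to copy a method, and each lookup on the copy returns a method bound to the copy rather than to the original.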