Trouble with copying dictionaries and using deepcopy on an SQLAlchemy ORM object_问答_开发者

I'm doing a Simulated Annealing algorithm to optimise a given allocation of students and projects.

This is language-agnostic pseudocode from Wikipedia:

s ← s0; e ← E(s)                                // Initial state, energy.
sbest ← s; ebest ← e                            // Initial "best" solution
k ← 0                                           // Energy evaluation count.
while k < kmax and e > emax                     // While time left & not good enough:
  snew ← neighbour(s)                           // Pick some neighbour.
  enew ← E(snew)                                // Compute its energy.
  if enew < ebest then                          // Is this a new best?
    sbest ← snew; ebest ← enew                  // Save 'new neighbour' to 'best found'.
  if P(e, enew, temp(k/kmax)) > random() then   // Should we move to it?
    s ← snew; e ← enew                          // Yes, change state.
  k ← k + 1                                     // One more evaluation done
return sbest                                    // Return the best solution found.

The following is an adaptation of the technique. My supervisor said the idea is fine in theory.

First I pick up some allocation (i.e. an entire dictionary of students and their allocated projects, including the ranks for the projects) from entire set of randomised allocations, copy it and pass it to my function. Let's call this allocation aOld (it is a dictionary). aOld has a weight related to it called wOld. The weighting is described below.

The function does the following:

Let this allocation, aOld be the best_node
From all the students, pick a random number of students and stick in a list
Strip (DEALLOCATE) them of their projects ++ reflect the changes for projects (allocated parameter is now False) and lecturers (free up slots if one or more of their projects are no longer allocated)
Randomise that list
Try assigning (REALLOCATE) everyone in that list projects again
Calculate the weight (add up ranks, rank 1 = 1, rank 2 = 2... and no project rank = 101)
For this new allocation aNew, if the weight wNew is smaller than the allocation weight wOld I picked up at the beginning, then this is the best_node (as defined by the Simulated Annealing algorithm above). Apply the algorithm to aNew and continue.
If wOld < wNew, then apply the algorithm to aOld again and continue.

The allocations/data-points are expressed as "nodes" such that a node = (weight, allocation_dict, projects_dict, lecturers_dict)

Right now, I can only perform this algorithm once, but I'll need to try for a number N (denoted by kmax in the Wikipedia snippet) and make sure I always have with me, the previous node and the best_node.

So that I don't modify my original dictionaries (which I might want to reset to), I've done a shallow copy of the dictionaries. From what I've read in the docs, it seems that it only copies the references and since my dictionaries contain objects, changing the copied dictionary ends up changing the objects anyway. So I tried to use copy.deepcopy().These dictionaries refer to objects that have been mapped with SQLA.

Questions:

I've been given some solutions to the problems faced but due to my über green-ness with using Python, they all sound rather cryptic to me.

Deepcopy isn't playing nicely with SQLA. I've been told thatdeepcopy on ORM objects probably has issues that prevent it from working as you'd expect. Apparently I'd be better off "building copy constructors, i.e. def copy(self): return FooBar(....)." Can someone please explain what that means?
I checked and found out that deepcopy has issues because SQLAlchemy places extra information on your objects, i.e. an _sa_instance_state attribute, that I wouldn't want in the copy but is necessary for the object to have. I've been told: "There are ways to manually blow away the old _sa_instance_state and put a new one on the object, but the most straightforward is to make a new object with __init__() and set up the attributes that are significant, instead of doing a full deep copy." What exactly does that mean? Do I create a new, unmapped class similar to the old, mapped one?
An alternate solution is that I'd have to "implement __deepcopy__() on your objects and ensure that a new _sa_instance_state is set up, there are functions in sqlalchemy.orm.attributes which can help with that." Once again this is beyond me so could someone kindly explain what it means?
A more general question: given the above information are there any suggestions 开发者_JAVA技巧on how I can maintain the information/state for the best_node (which must always persist through my while loop) and the previous_node, if my actual objects (referenced by the dictionaries, therefore the nodes) are changing due to the deallocation/reallocation taking place? That is, without using copy?

I have another possible solution: use transactions. This probably still isn't the best solution but implementing it should be faster.

Firstly create your session like this:

# transactional session
Session = sessionmaker(transactional=True)
sess = Session()

That way it will be transactional. The way transactions work is that sess.commit() will make your changes permanent while sess.rollback() will revert them.

In the case of simulated annealing you want to commit when you find a new best solution. At any later point, you can invoke rollback() to revert the status back to that position.

You don't want to copy sqlalchemy objects like that. You could implement your own methods which make the copies easily enough, but that is probably not want you want. You don't want copies of students and projects in your database do you? So don't copy that data.

So you have a dictionary which holds your allocations. During the process you should never modify the SQLAlchemy objects. All information that can be modified should be stored in those dictionaries. If you need to modify the objects to take that into account, copy the data back at the end.