Potential use of Python decorator or other refactorization: iterative optimization

Forgive me for yet another question on Python decorators. I did read through many of them, but I wonder what the best solution to the specific following problem is.

I have written several functions that do some form of gradient descent in numpy/scipy. Given a matrix X, I try to iteratively minimize some distance, d(X, AS), as a function of A and S. Each algorithm follows the same basic procedure, but each has a different update rule. For example, here are two of my functions (note that the only difference is in the update rule):

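# Imports assumed by both functions below (not shown in the original post).
from numpy import matrix, multiply, ones
from numpy.linalg import norm
from numpy.random import rand

def algo1(X, A=None, S=None, K=2, maxiter=10, c=0.1):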
    M, N = X.shape
    if A is None:
        A = matrix(rand(M, K))
    if S is None:
        S = matrix(rand(K, N))
    for iter in range(maxiter):
        # Begin update rule.
        A = multiply(A, (X*S.T + c)/(A*S*S.T + c))
        S = multiply(S, (A.T*X + c)/(A.T*A*S + c))
        # End update rule.
        for k in range(K):
            na = norm(A[:,k])
            A[:,k] /= na
            S[k,:] *= na
    return A, S

... and the other:

def algo2(X, A=None, S=None, K=2, maxiter=10, c=0.1):
    M, N = X.shape
    O = matrix(ones([M, N]))
    if A is None:
        A = matrix(rand(M, K))
    if S is None:
        S = matrix(rand(K, N))
    for iter in range(maxiter):
        # Begin update rule.
        A = multiply(A, ((X/(A*S))*S.T + c)/(O*S.T + c))
        S = multiply(S, (A.T*(X/(A*S)) + c)/(A.T*O + c))
        # End update rule.
        for k in range(K):
            na = norm(A[:,k])
            A[:,k] /= na
            S[k,:] *= na
    return A, S

Both functions are successful on their own. Obviously, these functions are asking to be refactored. The unit of code that differs is the update rule. So here is my attempt at refactoring:

@iterate
def algo1(X, A=None, S=None, K=2, maxiter=10, c=0.1):
    A = multiply(A, (X*S.T + c)/(A*S*S.T + c))
    S = multiply(S, (A.T*X + c)/(A.T*A*S + c))

@iterate
def algo2(X, A=None, S=None, K=2, maxiter=10, c=0.1):
    A = multiply(A, ((X/(A*S))*S.T + c)/(O*S.T + c))
    S = multiply(S, (A.T*(X/(A*S)) + c)/(A.T*O + c))

Here are some potential function calls:

A, S = algo1(X)
A, S = algo1(X, A0, S0, maxiter=50, c=0.2)
A, S = algo1(X, K=10, maxiter=40)

Questions:

What technique is best suited for refactoring this code? Function decorators?
If so, how would you write iterate? What confuses me, in particular, are the arguments/parameters, e.g., with vs. without default values, accessing them in the decorator and "wrapper", etc. For example, the update rules themselves do not require K, but the initialization code does, so I wonder if my function signatures are correct.

EDIT: Thank you for the help. More questions:

Is it true that a wrapper (e.g., inner) is only necessary when parameters are being passed? Because I see decorator examples without wrappers, and no parameters are passed, and they work just fine.
From reading the Python docs some more, functools appears useful; is its main purpose to preserve the metadata of the original function (e.g., algo1.__name__ and algo1.__doc__)?
With the signatures def algo1(X, A, S, c) and def inner(X, A=None, S=None, K=2, maxiter=10, c=0.1), the call algo1(X, maxiter=20) still works. Syntactically, I'm not sure why that is. For learning purposes, could you clarify (or cite a reference)? Thanks!

The following should work well as the decorator you want to use:

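import functools

from numpy import matrix, ones
from numpy.linalg import norm
from numpy.random import rand

def iterate(update):
    @functools.wraps(update)
    def inner(X, A=None, S=None, K=2, maxiter=10, c=0.1):
        M, N = X.shape
        O = matrix(ones([M, N]))  # note: not passed to update, so algo2 still needs its own O
        if A is None:
            A = matrix(rand(M, K))
        if S is None:
            S = matrix(rand(K, N))
        for _ in range(maxiter):
            # The decorated update rule must return the new (A, S) pair.
            A, S = update(X, A, S, K, maxiter, c)
            # Rescale each column of A to unit norm and compensate in S.
            for k in range(K):
                na = norm(A[:,k])
                A[:,k] /= na
                S[k,:] *= na
        return A, S
    return inner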
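
Since inner unpacks the result of update(...), each decorated update rule has to return the new A and S. Here is a minimal sketch of how the two rules might look under that assumption. It repeats a compact copy of iterate so that it runs on its own. The explicit numpy imports, the ones matrix built inside algo2, and the random test matrix X are assumptions, since the original post does not show them.

```python
import functools

from numpy import matrix, multiply, ones
from numpy.linalg import norm
from numpy.random import rand


def iterate(update):
    @functools.wraps(update)
    def inner(X, A=None, S=None, K=2, maxiter=10, c=0.1):
        M, N = X.shape
        if A is None:
            A = matrix(rand(M, K))
        if S is None:
            S = matrix(rand(K, N))
        for _ in range(maxiter):
            A, S = update(X, A, S, K, maxiter, c)
            # Rescale each column of A to unit norm and compensate in S.
            for k in range(K):
                na = norm(A[:, k])
                A[:, k] /= na
                S[k, :] *= na
        return A, S
    return inner


@iterate
def algo1(X, A=None, S=None, K=2, maxiter=10, c=0.1):
    A = multiply(A, (X*S.T + c)/(A*S*S.T + c))
    S = multiply(S, (A.T*X + c)/(A.T*A*S + c))
    return A, S  # the wrapper needs the updated pair back


@iterate
def algo2(X, A=None, S=None, K=2, maxiter=10, c=0.1):
    O = matrix(ones(X.shape))  # assumption: built here, since inner does not pass it in
    A = multiply(A, ((X/(A*S))*S.T + c)/(O*S.T + c))
    S = multiply(S, (A.T*(X/(A*S)) + c)/(A.T*O + c))
    return A, S


# Hypothetical strictly positive test data.
X = matrix(rand(6, 5)) + 0.1
A, S = algo1(X, K=3, maxiter=20)
```

The calls from the question, such as algo1(X, K=10, maxiter=40), go through inner, which fills in A and S before the first update.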

As you noticed, you could simplify algo1's and algo2's signatures, but it's not really a crucial part, and maybe keeping the signatures intact can simplify your testing and refactoring. If you do want to simplify, you'll change the def statements for those to, say,

def algo1(X, A, S, c):

and similarly simplify the call in the iterate decorator -- two of the arguments (K and maxiter) no longer need to be passed, and the default values are not needed either. However, skipping this simplification can actually make your life easier -- it's normally simpler if the decorated function and the result of decorating it keep exactly the same signature, unless you have really specific needs to the contrary.

edit: the OP keeps piling on questions onto this question...:

EDIT: Thank you for the help. More questions:

Is it true that a wrapper (e.g., inner) is only necessary when parameters are being passed? Because I see decorator examples without wrappers, and no parameters are passed, and they work just fine.

A decorator used without parameters (in the @decorname use) is called with the function being decorated, and must return a function; a decorator used with parameters (like @decorname(23)) must return a ("higher-order") function which in turn is called with the function being decorated, and must return a function. Whether the function being decorated takes parameters or not does not change this set of rules. It's technically possible to achieve this without inner functions (which I assume is what you mean by "wrappers"?) but it's pretty rare to do so.
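
To make the three cases concrete, here is a small sketch. All of the names (announce, repeat, register, REGISTRY, greet, ping, noop) are made up for illustration and do not come from the question.

```python
import functools

REGISTRY = []


def announce(func):
    """Plain decorator: called with the function, returns a replacement."""
    @functools.wraps(func)
    def inner(*args, **kwargs):
        return "called " + func(*args, **kwargs)
    return inner


def repeat(times):
    """Parameterized decorator: called with the argument, returns a decorator."""
    def decorator(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return inner
    return decorator


def register(func):
    """Decorator with no inner function: records the name, returns func as-is."""
    REGISTRY.append(func.__name__)
    return func


@announce
def greet(name):
    return "hello " + name


@repeat(3)
def ping():
    return "ping"


@register
def noop():
    return None
```

The register case is the kind of wrapper-free example you have probably seen: it works because it never needs to change what happens when the function is called.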

From reading the Python docs some more, functools appears useful; is its main purpose to preserve the metadata of the original function (e.g., algo1.__name__ and algo1.__doc__)?

Yes, functools.wraps is used exactly for this purpose (functools also contains partial which has a completely different purpose).
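
A quick way to see the difference is to decorate two functions, once with and once without functools.wraps. The names bare, wrapped, first and second are hypothetical; the partial line at the end is only there to show how different its purpose is.

```python
import functools


def bare(func):
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner


def wrapped(func):
    @functools.wraps(func)
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner


@bare
def first():
    """Docstring of first."""


@wrapped
def second():
    """Docstring of second."""


print(first.__name__, first.__doc__)    # inner None
print(second.__name__, second.__doc__)  # second Docstring of second.

# partial, by contrast, pre-fills arguments of an existing callable.
two_to_the = functools.partial(pow, 2)
print(two_to_the(5))  # 32
```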

With the signatures def algo1(X, A, S, c) and def inner(X, A=None, S=None, K=2, maxiter=10, c=0.1), the call algo1(X, maxiter=20) still works. Syntactically, I'm not sure why that is. For learning purposes, could you clarify (or cite a reference)? Thanks!

It's because inner is the function that's actually called with those parameters (after algo1 has been decorated), and it only passes down (to the "real" underlying algo1) the parameters X, A, S, c (in the version where the wrapped algo1 is given the simplified signature). The problem, as I mentioned above, is that this makes the metadata (specifically the signature) different between the function getting decorated and the resulting decorated function; that is pretty confusing to read and maintain, so one normally keeps the same signature at both levels, save special circumstances.
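
Here is a toy sketch of that situation, using plain numbers instead of matrices so it runs without numpy. The update rule is a made-up stand-in (it just adds c to A and counts iterations in S), and the default starting values of 0 are assumptions for illustration only.

```python
import functools
import inspect


def iterate(update):
    @functools.wraps(update)
    def inner(X, A=None, S=None, K=2, maxiter=10, c=0.1):
        A = 0.0 if A is None else A
        S = 0 if S is None else S
        for _ in range(maxiter):
            # Only the simplified argument list is passed down.
            A, S = update(X, A, S, c)
        return A, S
    return inner


@iterate
def algo1(X, A, S, c):
    # Toy stand-in for the real update rule.
    return A + c, S + 1


# The name algo1 is now bound to inner, so maxiter is accepted.
A, S = algo1(0.0, maxiter=20)
print(S)  # 20

# Because of functools.wraps, inspect reports the undecorated signature...
print(inspect.signature(algo1))  # (X, A, S, c)
# ...even though the callable really has inner's signature.
print(inspect.signature(algo1, follow_wrapped=False))
```

The two signature lines disagree, which is exactly the kind of mismatch that makes the simplified version harder to read and maintain.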