开发者

Is frozenset adequate for caching of symmetric input data in a python dict?

开发者 https://www.devze.com 2022-12-25 07:59 出处:网络
The title more or less says it all: I have a function which takes symmetric input in two arguments, e.g. something like

The title more or less says it all:

I have a function which takes symmetric input in two arguments, e.g. something like

def f(a1, a2):
    return heavy_stuff(abs(a1 - a2))

Now, I want to introduce some caching method. Would it be correct / pythonic / reasonably efficient to do something like this:

cache = {}
def g(a1, a2):
    fs =frozenset((tuple(a1), tuple(a2)))
    if fs not in cache:
        cache[fs] = f(a1, a2)
    return cache[fs]

Or would there be some better way?

Edit开发者_Go百科: a1 and a2 might be the rows of a numpy array; that’s why I wrap them in a tuple each.


Python always computes all arguments you're passing to a function, and only then does it call the function. In other words, like most other languages, Python is "eager" in its evaluation (the major exception today is probably Haskell, but that doesn't help you;-).

So setdefault is a very unsuitable approach for caching! Whenever you do

cache.setdefault(akey, f(x, y))

you're first calling f(x, y), with all its computational cost, then maybe dropping the results of that computation on the floor; this makes the caching totally ineffective.

Rather, always do it as follows:

akey = whatever(x, y)
if akey not in cache:
    cache[akey] = f(x, y)
return cache[akey]

or the like -- there are a few other possible idioms, especially if e.g. you know that f will never return None:

result = cache.get(akey)
if result is None:
    result = cache[akey] = f(x, y)
return result

As for the secondary issue of what's an appropriate whatever for the key computation, given that you know that f is symmetrical, I think frozenset is probably OK; although (if the components of x and y are comparable, as well as hashable -- i.e. it wouldn't work with complex numbers) you might want to consider

ta1 = tuple(a1)
ta2 = tuple(a2)
if ta1 > ta2: key = ta1, ta2
else: key = ta2, ta1

the relative performance depends on the cost of comparing, vs that of hashing, the items in a1 and a2. Differences are likely to be minor, anyway.

0

精彩评论

暂无评论...
验证码 换一张
取 消