
lazy load or early load for python?

Developer https://www.devze.com 2022-12-12 07:52 Source: Internet
We've got the following code sample:

big_static_data = {
  "key1" : {
     "subkey1" : "subvalue1",
     ...
     },
  "key2" : 
   ...
}
class StaticDataEarlyLoad:
    def __init__(self):
        self.static_data = big_static_data
        # other init
    def handle_use_id(self, id):
        return complex_handle(self.static_data, id)
    ...
class StaticDataLazyLoad:
    def __init__(self):
        # not init static data
        # other init
    def handle_use_id(self, id):
        return complex_handle(big_static_data, id)
    ...

As the code above shows, calling an instance's handle_use_id may perform differently depending on which class is used.

As I understand it, early load loads the data when the instance is created, and the data stays in memory until the instance is garbage-collected. With late load, the static data isn't loaded until we call the handle_use_id method. Am I right? (I'm not familiar with Python's internals, so I'm not sure how long an instance lives before it is garbage-collected.) If I'm right, early load means a large memory requirement, and late load means the data has to be loaded every time the method is invoked (a big overhead?).

Ours is a web-based project, so which approach should we choose? (handle_use_id will be invoked very frequently.)

Thanks.


In your example, StaticDataLazyLoad (once the syntax of __init__ is corrected) won't make a big difference.

"big_static_data" is initialized ("loaded") when the module is imported. It will immediately require some memory, no matter whether an instance of your classes is created or not.

An instance of StaticDataEarlyLoad will just create a new reference to big_static_data, not a new copy.
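The "reference, not copy" point can be checked with an identity test. This is a minimal sketch; the dictionary contents are made up and stand in for the real big_static_data.

```python
# Stand-in for the question's big_static_data; the contents are made up.
big_static_data = {"key1": {"subkey1": "subvalue1"}}


class StaticDataEarlyLoad:
    def __init__(self):
        # Binds another name to the existing dict; nothing is copied.
        self.static_data = big_static_data


first = StaticDataEarlyLoad()
second = StaticDataEarlyLoad()

# Both instances and the module-level name refer to one and the same object.
same_object = first.static_data is big_static_data is second.static_data

# A change made through one reference is visible through all of them.
first.static_data["key2"] = {"subkey2": "subvalue2"}
visible_everywhere = "key2" in big_static_data and "key2" in second.static_data

print(same_object, visible_everywhere)
```

Because every instance shares the one dictionary, creating many instances does not multiply the memory used by the data.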

Thus, a lookup in StaticDataEarlyLoad may be slightly faster, since the data is referenced via self in the local scope (lookup "self", then lookup "self.static_data").

A lookup in StaticDataLazyLoad will not find "big_static_data" in the local scope; Python will then look it up in the global scope and find it there. Global names are resolved through a dictionary, so the size of the global scope matters little, and any difference from the lookup of "self.static_data" is likely to be small.
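Rather than guessing, the two lookup paths can be timed. The following is a rough sketch using the standard timeit module; the lookup methods and the iteration count are illustrative assumptions, and the numbers will vary by machine and interpreter version.

```python
import timeit

# Stand-in for the question's big_static_data; the contents are made up.
big_static_data = {"key1": {"subkey1": "subvalue1"}}


class StaticDataEarlyLoad:
    def __init__(self):
        self.static_data = big_static_data

    def lookup(self):
        # Local name "self", then an attribute lookup on the instance.
        return self.static_data


class StaticDataLazyLoad:
    def lookup(self):
        # Not a local name, so Python resolves it in the module globals.
        return big_static_data


early = StaticDataEarlyLoad()
lazy = StaticDataLazyLoad()

# The iteration count is arbitrary; only the relative size of the results matters.
early_seconds = timeit.timeit(early.lookup, number=200_000)
lazy_seconds = timeit.timeit(lazy.lookup, number=200_000)

print(f"self.static_data: {early_seconds:.4f}s")
print(f"global name:      {lazy_seconds:.4f}s")
```

Whatever the measured gap turns out to be, it should be weighed against the cost of complex_handle itself, which is likely to dominate.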


big_static_data is created once at the beginning of the file (at least in the code that you show).

This consumes memory.

When you create an instance of StaticDataEarlyLoad, StaticDataEarlyLoad().static_data is a reference to big_static_data. It consumes a very small amount of memory. It merely points at the same dictionary that big_static_data points to. No copy of big_static_data is made; there is no real "loading" going on.

When instance StaticDataEarlyLoad() gets garbage-collected, a little memory is freed, but the big_static_data remains.
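This lifetime behaviour can be observed with a weak reference to the instance. The sketch below uses made-up data; the explicit gc.collect() call is there only so the example does not rely on CPython's reference counting.

```python
import gc
import weakref

# Stand-in for the question's big_static_data; the contents are made up.
big_static_data = {"key1": {"subkey1": "subvalue1"}}


class StaticDataEarlyLoad:
    def __init__(self):
        self.static_data = big_static_data


instance = StaticDataEarlyLoad()
tracker = weakref.ref(instance)  # lets us observe when the instance is gone

del instance  # drop the only strong reference to the instance
gc.collect()  # CPython frees it already on del; this covers other cases

instance_is_gone = tracker() is None
data_still_there = big_static_data["key1"]["subkey1"] == "subvalue1"

print(instance_is_gone, data_still_there)
```

The module-level name keeps the dictionary alive for as long as the module stays imported, regardless of how many instances come and go.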

StaticDataLazyLoad does much the same thing, but doesn't create an attribute static_data. It just references big_static_data directly. The difference in memory consumption between StaticDataEarlyLoad and StaticDataLazyLoad is very minor. And there will be essentially no difference in speed.
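For contrast, if the data really had to be loaded on first use (which is not what the posted code does), one possible sketch caches the result so the load runs only once. Here get_static_data, StaticDataTrulyLazy, and the load_calls counter are hypothetical names, and the dictionary lookup stands in for complex_handle, whose body is not shown in the question.

```python
import functools

load_calls = []  # only here to show how often the load actually runs


@functools.lru_cache(maxsize=None)
def get_static_data():
    """Hypothetical loader: builds the data on first call, then reuses it."""
    load_calls.append(1)
    return {"key1": {"subkey1": "subvalue1"}}


class StaticDataTrulyLazy:
    def handle_use_id(self, id):
        # The first call triggers the load; later calls hit the cache.
        # The .get() call is a stand-in for the question's complex_handle.
        return get_static_data().get(id)


handler = StaticDataTrulyLazy()
loads_before_first_use = len(load_calls)  # nothing has been loaded yet

first_result = handler.handle_use_id("key1")
second_result = handler.handle_use_id("key1")
loads_after_two_calls = len(load_calls)

print(loads_before_first_use, loads_after_two_calls)
```

With this shape, the feared "load on every call" overhead does not occur: the cost is paid once, on the first request that needs the data.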

It is always best to make explicit what a class depends upon. StaticDataEarlyLoad depends on big_static_data. Therefore, you should define

class StaticDataEarlyLoad:
    def __init__(self, static_data):
        self.static_data = static_data

And initialize instances with StaticDataEarlyLoad(big_static_data).

There is essentially no difference in speed between this definition and the one you posted. Putting dependencies into the call signature of __init__ is just a good idea for the sake of organization, and after all you are using Python's OOP for good control of organization, right?
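One practical payoff of passing the data in is that the class can be exercised with a small stand-in dictionary. The following is a sketch only; complex_handle is a placeholder here because the question does not show its body, and the dictionary contents are made up.

```python
def complex_handle(static_data, id):
    # Placeholder for the question's complex_handle, whose body is not shown.
    return static_data.get(id)


class StaticDataEarlyLoad:
    def __init__(self, static_data):
        self.static_data = static_data

    def handle_use_id(self, id):
        return complex_handle(self.static_data, id)


# Stand-in for the question's big_static_data; the contents are made up.
big_static_data = {"key1": {"subkey1": "subvalue1"}}

# Production-style use: hand over the shared module-level data.
handler = StaticDataEarlyLoad(big_static_data)

# Test-style use: hand over a tiny dict instead, with no globals involved.
test_handler = StaticDataEarlyLoad({"key1": "stub"})

print(handler.handle_use_id("key1"))
print(test_handler.handle_use_id("key1"))
```

The class no longer needs to know where the data lives, so swapping the source later only touches the place where instances are created.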
