
Understanding Pickling in Python

Developer https://www.devze.com 2023-04-06 23:35 Source: Internet
I have recently got an assignment where I need to put a dictionary (where each key refers to a list) in pickled form. The only problem is I have no idea what pickled form is. Could anyone point me in the right direction of some good resources to help me learn this concept?


The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure.

Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation, whereby a byte stream is converted back into an object hierarchy.

Pickling (and unpickling) is alternatively known as serialization, marshalling, or flattening.

import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

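# Open in binary mode; the with block closes the file even if an error occurs.
with open('data.pkl', 'wb') as output:
    # Pickle the dictionary using the default protocol
    # (protocol 0 on Python 2; a binary protocol on Python 3).
    pickle.dump(data1, output)

    # Pickle the list using the highest protocol available.
    pickle.dump(selfref_list, output, -1)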

To read from a pickled file:

import pprint, pickle

# Objects come back in the order in which they were dumped.
with open('data.pkl', 'rb') as pkl_file:
    data1 = pickle.load(pkl_file)
    pprint.pprint(data1)

    data2 = pickle.load(pkl_file)
    pprint.pprint(data2)

source - https://docs.python.org/2/library/pickle.html
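For the case in the question (a dictionary whose keys each refer to a list), the same dump/load pattern can be sketched on Python 3 as follows. The `grades` data and the temporary file location are made up for illustration; only `pickle.dump`, `pickle.load`, and `pickle.HIGHEST_PROTOCOL` come from the pickle module itself.

```python
import os
import pickle
import tempfile

# Hypothetical data in the shape the question describes: each key maps to a list.
grades = {'alice': [90, 85], 'bob': [72, 88, 95]}

# A list that contains itself, as in the example above.
selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

# Write into a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pkl')

    # Two dump() calls write two independent pickles, one after the other.
    with open(path, 'wb') as output:
        pickle.dump(grades, output)
        pickle.dump(selfref_list, output, pickle.HIGHEST_PROTOCOL)

    # Each load() call returns the next object, in the order written.
    with open(path, 'rb') as pkl_file:
        restored_grades = pickle.load(pkl_file)
        restored_list = pickle.load(pkl_file)

print(restored_grades == grades)          # the contents are equal
print(restored_list[3] is restored_list)  # the self-reference survives
```

Both files are opened in binary mode ('wb' and 'rb'), since a pickle is a byte stream rather than text.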


Pickling is a mini-language that can be used to convert the relevant state from a Python object into a string, where this string uniquely represents the object. Then (un)pickling can be used to convert the string to a live object, by "reconstructing" the object from the saved state found in the string.

>>> import pickle
>>> 
>>> class Foo(object):
...   y = 1
...   def __init__(self, x):
...     self.x = x
...     return
...   def bar(self, y):
...     return self.x + y
...   def baz(self, y):
...     Foo.y = y  
...     return self.bar(y)
... 
>>> f = Foo(2)
>>> f.baz(3)
5
>>> f.y
3
>>> pickle.dumps(f)
"ccopy_reg\n_reconstructor\np0\n(c__main__\nFoo\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nI2\nsb."

What you can see here is that pickle doesn't save the source code for the class, but does store a reference to the class definition. Basically, you can almost read the pickled string… it says (roughly translated) "call copy_reg's reconstructor where the arguments are the class defined by __main__.Foo and then do other stuff". The other stuff is the saved state of the instance. If you look deeper, you can extract that "string x" is set to "the integer 2" (roughly: S'x'\np6\nI2). This is actually a clipped part of the pickled string for a dictionary entry… the dict being f.__dict__, which is {'x': 2}. If you look at the source code for pickle, it very clearly gives a translation for each type of object and operation from Python to pickled byte code.
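Rather than reading the raw string by eye, the standard-library pickletools module can list the opcodes one per line. The sketch below assumes Python 3 and uses types.SimpleNamespace as a stand-in for the Foo class above, purely so that the snippet runs from any context; the idea (a class reference plus the saved state) is the same.

```python
import io
import pickle
import pickletools
import types

# Stand-in for Foo(2): an object whose only state is the attribute x = 2.
obj = types.SimpleNamespace(x=2)

# Protocol 0 is requested explicitly, since Python 3 defaults to a binary protocol.
pickled = pickle.dumps(obj, protocol=0)

# The class is stored by name (module plus class name), not as source code.
print(b'SimpleNamespace' in pickled)

# Disassemble the pickle into a readable opcode listing.
buffer = io.StringIO()
pickletools.dis(pickled, out=buffer)
listing = buffer.getvalue()
print(listing)

# Unpickling looks the class up by name and restores the saved state.
restored = pickle.loads(pickled)
print(restored.x)
```

The listing shows the class lookup first, followed by the opcodes that rebuild the attribute dictionary holding x.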

Note also that there are different variants of the pickling language. In Python 2 the default is protocol 0, which is more human-readable; Python 3 defaults to a binary protocol. There's also protocol 2, shown below (and 1, 3, 4, and 5, depending on the version of Python you are using).

>>> pickle.dumps([1,2,3])
'(lp0\nI1\naI2\naI3\na.'
>>> 
>>> pickle.dumps([1,2,3], -1)
'\x80\x02]q\x00(K\x01K\x02K\x03e.'

Again, it's still a dialect of the pickling language, and you can see that the protocol 0 string says "get a list, include I1, I2, I3", while the protocol 2 is harder to read, but says the same thing. The first bit \x80\x02 indicates that it's protocol 2 -- then you have ] which says it's a list, then again you can see the integers 1,2,3 in there. Again, check the source code for pickle to see the exact mapping for the pickling language.
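On Python 3 the same comparison needs the protocol to be passed explicitly, and the results are bytes rather than str. A minimal sketch, assuming Python 3 (the variable names are illustrative):

```python
import pickle

numbers = [1, 2, 3]

# Ask for each dialect explicitly instead of relying on the default.
text_form = pickle.dumps(numbers, protocol=0)
binary_form = pickle.dumps(numbers, protocol=2)

print(text_form)
print(binary_form)

# Protocol 2 and later begin with the PROTO opcode (0x80) and the version number.
print(binary_form[:2] == b'\x80\x02')

# Both dialects describe the same object.
print(pickle.loads(text_form) == pickle.loads(binary_form) == numbers)
```

loads does not need to be told the protocol; it is detected from the data.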

To reverse the pickling, use load (to read from a file) or loads (to read from a string).

>>> p = pickle.dumps([1,2,3])
>>> pickle.loads(p)
[1, 2, 3]


Pickling is just serialization: putting data into a form that can be stored in a file and retrieved later. Here are the docs on the pickle module:

http://docs.python.org/release/2.7/library/pickle.html


http://docs.python.org/library/pickle.html#example



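Pickling in Python is used to serialize and de-serialize Python objects, like the dictionary in your case. I usually use the cPickle module, as it can be much faster than the pickle module. (cPickle exists only on Python 2; on Python 3 the pickle module uses its C implementation automatically.)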

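try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3: the C accelerator is used automatically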

def serializeObject(pythonObj):
    return pickle.dumps(pythonObj, pickle.HIGHEST_PROTOCOL)

def deSerializeObject(pickledObj):
    return pickle.loads(pickledObj)


Sometimes we want to save objects so that we can retrieve them later (even after the program that generated the data has terminated). Or we want to transmit an object to someone or something else outside our application. The pickle module is used for serializing and deserializing objects.

serializing object (Pickling): Create a representation of an object.
deserializing object (Unpickling): Re-load the object from representation.

dump: pickle to file
load: unpickle from file
dumps: returns a pickled representation. We can store it in a variable.
loads: unpickle from the supplied variable.

Example:

import pickle

print("Using dumps and loads to store it in variable")
list1 = [2, 4]
dict1 = {1: list1, 2: 'hello', 3: list1}
pickle_dict = pickle.dumps(dict1)
print(pickle_dict)

dict2 = pickle.loads(pickle_dict)
print(dict2)

# dict1 == dict2 => True
# dict1 is dict2 => False

print(id(dict1.get(1)), id(dict1.get(3)))
print(id(dict2.get(1)), id(dict2.get(3)))
print("*" * 100)
print("Using dump and load to store it in File ")

cars = ["Audi", "BMW", "Maruti 800", "Maruti Suzuki"]
file_name = "mycar.pkl"
# Write the list to the file in binary mode.
with open(file_name, 'wb') as fileobj:
    pickle.dump(cars, fileobj)

# Read it back; the with block closes the file afterwards.
with open(file_name, 'rb') as fileobj:
    mycar = pickle.load(fileobj)
print(mycar)
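The id() calls in the example are there to show what happens to shared references. The same point can be checked directly with the is operator; this is a small sketch with illustrative variable names:

```python
import pickle

shared = [2, 4]
original = {1: shared, 2: 'hello', 3: shared}

restored = pickle.loads(pickle.dumps(original))

print(restored == original)        # True: same contents
print(restored is original)        # False: a new dict object
print(restored[1] is restored[3])  # True: sharing inside one pickle is preserved
print(restored[1] is shared)       # False: nothing is shared with the original
```

So a pickle round trip behaves like a deep copy: the internal structure is kept, but every object is rebuilt.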


Pickling allows you to serialize and deserialize Python object structures. In short, pickling is a way to convert a Python object into a byte stream that contains all the information necessary to reconstruct the object in another Python script.

import pickle

def pickle_data():
    data = {
        'name': 'sanjay',
        'profession': 'Software Engineer',
        'country': 'India'
    }
    filename = 'PersonalInfo'
    # Pickle files must be opened in binary mode.
    with open(filename, 'wb') as outfile:
        pickle.dump(data, outfile)

pickle_data()
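Reading the data back is the mirror image of the function above. The sketch below is one way to do it; load_pickled_data is a hypothetical helper name, and a temporary directory is used so that the example cleans up after itself.

```python
import os
import pickle
import tempfile

data = {
    'name': 'sanjay',
    'profession': 'Software Engineer',
    'country': 'India'
}

def load_pickled_data(path):
    """Return the object stored in the pickle file at path (hypothetical helper)."""
    with open(path, 'rb') as infile:
        return pickle.load(infile)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'PersonalInfo')

    # Write the dictionary, then read it back from the same file.
    with open(path, 'wb') as outfile:
        pickle.dump(data, outfile)
    loaded = load_pickled_data(path)

print(loaded)
```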