Source: here
Preface :
The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure. "Pickling" is the process whereby a Python object hierarchy is converted into a byte stream, and "unpickling" is the inverse operation, whereby a byte stream is converted back into an object hierarchy.
Relationship to other Python modules :
The pickle module has an optimized cousin called the cPickle module. As its name implies, cPickle is written in C, so it can be up to 1000 times faster than pickle. However, it does not support subclassing of the Pickler() and Unpickler() classes, because in cPickle these are functions, not classes. Most applications have no need for this functionality and can benefit from the improved performance of cPickle. Other than that, the interfaces of the two modules are nearly identical; the common interface is described in this manual and differences are pointed out where necessary. In the following discussions, we use the term "pickle" to collectively describe the pickle and cPickle modules.
Note that serialization is a more primitive notion than persistence; although pickle reads and writes file objects, it does not handle the issue of naming persistent objects, nor the (even more complicated) issue of concurrent access to persistent objects. The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure. Perhaps the most obvious thing to do with these byte streams is to write them onto a file, but it is also conceivable to send them across a network or store them in a database. The module shelve provides a simple interface to pickle and unpickle objects on DBM-style database files.
Data stream format :
The data format used by pickle is Python-specific. This means that non-Python programs may not be able to reconstruct pickled Python objects.
By default, the pickle data format uses a printable ASCII representation. This is slightly more voluminous than a binary representation. The big advantage of using printable ASCII (and of some other characteristics of pickle's representation) is that for debugging or recovery purposes it is possible for a human to read the pickled file with a standard text editor.
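There are currently 3 different protocols which can be used for pickling :
- Protocol version 0 is the original ASCII protocol and is backwards compatible with earlier versions of Python.
- Protocol version 1 is the old binary format, which is also compatible with earlier versions of Python.
- Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes.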
If a protocol is not specified, protocol 0 is used. If protocol is specified as a negative value or HIGHEST_PROTOCOL, the highest protocol version available will be used. (Changed in version 2.3: Introduced the protocol parameter.)
A binary format, which is slightly more efficient, can be chosen by specifying a protocol version >= 1.
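The protocol choice can be tried out directly. The following is only a minimal sketch: the record value is made up for illustration, and the protocol is always passed explicitly so that the result does not depend on which default the running interpreter uses.

```python
import pickle

# Hypothetical sample value, used only for illustration.
record = {"name": "spam", "counts": [1, 2, 3]}

# Serialize the same object with each of the three protocols.
sizes = {}
for protocol in (0, 1, 2):
    payload = pickle.dumps(record, protocol)
    # Whatever the protocol, unpickling must give back an equal object.
    assert pickle.loads(payload) == record
    sizes[protocol] = len(payload)

# A negative protocol value selects the highest protocol available.
newest = pickle.dumps(record, -1)
same_as_highest = newest == pickle.dumps(record, pickle.HIGHEST_PROTOCOL)

print(sizes)
print(same_as_highest)
```

Printing sizes gives a rough feel for how much space each protocol needs for this particular value; the numbers will differ for other data.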
Usage :
To serialize an object hierarchy, you first create a pickler, then you call the pickler’s dump() method. To de-serialize a data stream, you first create an unpickler, then you call the unpickler’s load() method. The pickle module provides the following constant :
- pickle.HIGHEST_PROTOCOL
The pickle module provides the following functions to make the pickling process more convenient :
- pickle.dump(obj, file[, protocol])
- pickle.load(file)
- pickle.dumps(obj[, protocol])
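As a rough illustration of these convenience functions, the sketch below round-trips one value through a file-like object and once more through a string. The settings value is hypothetical, io.BytesIO is assumed as the in-memory file object so that the code should run on Python 3 as well, and pickle.loads() (the counterpart of dumps(), not listed above) is used to read the string back.

```python
import io
import pickle

# Hypothetical sample value, used only for illustration.
settings = {"debug": True, "retries": 3, "hosts": ["a", "b"]}

# dump() writes the pickle to a file-like object; load() reads it back.
buffer = io.BytesIO()
pickle.dump(settings, buffer, 2)
buffer.seek(0)
from_file = pickle.load(buffer)

# dumps() returns the pickle as a string instead of writing it to a file;
# loads() rebuilds the object from that string.
payload = pickle.dumps(settings, 2)
from_string = pickle.loads(payload)

print(from_file == settings)
print(from_string == settings)
```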
The pickle module exports two callables, Pickler and Unpickler :
- class pickle.Pickler(file[, protocol])
Changed in version 2.3: Introduced the protocol parameter.
This takes a file-like object to which it will write a pickle data stream.
file must have a write() method that accepts a single string argument. It can thus be an open file object, a StringIO object, or any other custom object that meets this interface. Pickler objects define one (or two) public methods :
- dump(obj)
- clear_memo()
It is possible to make multiple calls to the dump() method of the same Pickler instance. These must then be matched by the same number of calls to the load() method of the corresponding Unpickler instance. If the same object is pickled by multiple dump() calls, the load() calls will all yield references to the same object.
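The pairing of dump() and load() calls might look like the sketch below. It assumes an in-memory io.BytesIO buffer in place of a real file, and the shared list is a made-up sample object.

```python
import io
import pickle

shared = ["shared", "list"]   # hypothetical sample object
buffer = io.BytesIO()

# One Pickler instance, two dump() calls that both involve `shared`.
pickler = pickle.Pickler(buffer, 2)
pickler.dump(shared)
pickler.dump({"ref": shared})

# One Unpickler instance, the same number of load() calls.
buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()    # matches the first dump()
second = unpickler.load()   # matches the second dump()

# The list inside the second result is the very object returned first.
same_object = second["ref"] is first
print(same_object)
```

Because the second pickle only refers back to an object remembered from the first one, both must be read through the same Unpickler instance, in the order they were written.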
- class pickle.Unpickler(file)
This takes a file-like object from which it will read a pickle data stream. This class automatically determines whether the data stream was written in binary mode or not, so it does not need a flag as in the Pickler factory.
file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return a string. Thus file can be a file object opened for reading, a StringIO object, or any other custom object that meets this interface.
Unpickler objects have one (or two) public methods :
- load()
- noload()
What can be pickled and unpickled?
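The following types can be pickled :
- None, True, and False
- integers, long integers, floating point numbers, complex numbers
- normal and Unicode strings
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable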
Attempts to pickle unpicklable objects will raise the PicklingError exception; when this happens, an unspecified number of bytes may have already been written to the underlying file. Trying to pickle a highly recursive data structure may exceed the maximum recursion depth, in which case a RuntimeError is raised. You can carefully raise this limit with sys.setrecursionlimit().
Note that functions (built-in and user-defined) are pickled by "fully qualified" name reference, not by value. This means that only the function name is pickled, along with the name of the module the function is defined in. Neither the function’s code, nor any of its function attributes are pickled. Thus the defining module must be importable in the unpickling environment, and the module must contain the named object, otherwise an exception will be raised.
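Pickling by name reference can be observed with a built-in function. This is a small sketch, with len chosen arbitrarily as the function and protocol 0 chosen so that the stored name is easy to spot in the stream.

```python
import pickle

# Pickle a built-in function with the printable protocol 0.
payload = pickle.dumps(len, 0)

# The stream carries the function's name, not its code.
name_is_stored = b"len" in payload

# Unpickling looks the name up again and returns the existing function object.
restored = pickle.loads(payload)

print(name_is_stored)
print(restored is len)
```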
Similarly, classes are pickled by named reference, so the same restrictions in the unpickling environment apply. Note that none of the class’s code or data is pickled, so in the following example the class attribute attr is not restored in the unpickling environment :
```python
import pickle

class Foo:
    attr = 'a class attr'

# Only the name "Foo" and its module are stored, not the value of attr.
picklestring = pickle.dumps(Foo)
```
These restrictions are why picklable functions and classes must be defined in the top level of a module.
Similarly, when class instances are pickled, their class’s code and data are not pickled along with them. Only the instance data are pickled. This is done on purpose, so you can fix bugs in a class or add methods to the class and still load objects that were created with an earlier version of the class. If you plan to have long-lived objects that will see many versions of a class, it may be worthwhile to put a version number in the objects so that suitable conversions can be made by the class’s __setstate__() method.
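One way the version-number idea could be sketched is shown below. Everything in it is hypothetical: the Account class, its attribute names, and the rule that version 1 objects stored a whole-unit balance while version 2 stores balance_cents. The old object is simulated by filling in an instance by hand rather than by loading a real legacy file.

```python
import pickle

class Account(object):
    """Hypothetical class whose stored layout changed between versions."""
    VERSION = 2

    def __init__(self, owner, balance_cents):
        self.version = self.VERSION
        self.owner = owner
        self.balance_cents = balance_cents

    def __setstate__(self, state):
        # Called on unpickling with the pickled instance dictionary.
        if state.get("version", 1) < 2:
            # Assumed conversion: version 1 kept a whole-unit "balance".
            state["balance_cents"] = state.pop("balance") * 100
            state["version"] = 2
        self.__dict__.update(state)

# Simulate an instance laid out the way version 1 code would have left it.
old = Account.__new__(Account)
old.__dict__.update({"version": 1, "owner": "ann", "balance": 5})
payload = pickle.dumps(old, 2)

# Loading runs __setstate__, which upgrades the data to the current layout.
new = pickle.loads(payload)
print(new.version)
print(new.balance_cents)
```

Only the instance data travels in the pickle, so the conversion logic lives in the current class definition and can be extended as further versions appear.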
Example :
For the simplest code, use the dump() and load() functions. Note that a self-referencing list is pickled and restored correctly :
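A minimal sketch of such a round trip follows. It assumes an in-memory io.BytesIO buffer instead of a file on disk, and the list contents are arbitrary.

```python
import io
import pickle

# A list that contains itself as its last element.
selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

buffer = io.BytesIO()
pickle.dump(selfref_list, buffer, 2)

buffer.seek(0)
restored = pickle.load(buffer)

# The restored list is a new object whose last element refers back to itself.
print(restored[:3])
print(restored[3] is restored)
print(restored is selfref_list)
```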
The following example reads the resulting pickled data. When reading a pickle-containing file, you should open the file in binary mode because you can’t be sure if the ASCII or binary format was used : (pickle_ex2.py)
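The point about binary mode can be sketched as follows. The file name data.pkl and the sample data are hypothetical, and the file is created in a temporary directory only so that the example cleans up after itself.

```python
import os
import pickle
import tempfile

# Hypothetical sample value, used only for illustration.
data = {"a": [1, 2.0, 3], "b": ("string", None)}

# Hypothetical file name inside a throwaway temporary directory.
directory = tempfile.mkdtemp()
path = os.path.join(directory, "data.pkl")

# Write one pickle in the ASCII protocol and one in a binary protocol.
with open(path, "wb") as output:
    pickle.dump(data, output, 0)
    pickle.dump(data, output, 2)

# Reading in binary mode works no matter which protocol was used.
with open(path, "rb") as pkl_file:
    first = pickle.load(pkl_file)
    second = pickle.load(pkl_file)

os.remove(path)
os.rmdir(directory)

print(first == data)
print(second == data)
```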
Supplement :
* The pickle protocol
* Subclassing Unpicklers
* cPickle — A faster pickle