程式扎記: [Quick Python] 15. Classes and object-oriented programming

Private variables and private methods :
A private variable or private method is one that can’t be seen outside of the methods of the class in which it’s defined. Private variables and methods are useful for a number of reasons. They enhance security and reliability by selectively denying access to important or delicate parts of an object’s implementation. They avoid name clashes that can arise from the use of inheritance. A class may define a private variable and inherit from a class that defines a private variable of the same name, but this doesn’t cause a problem, because the fact that the variables are private ensures that separate copies of them are kept. Finally, private variables make it easier to read code, because they explicitly indicate what is used only internally in a class. Anything else is the class’s interface.

Most languages that define private variables do so through the use of a private or other similar keyword. The convention in Python is simpler, and it also makes it easier to immediately see what is private and what isn’t. Any method or instance variable whose name begins—but doesn’t end—with a double underscore (__) is private; anything else isn’t private.

As an example, consider the following class definition :

class Mine:  
    def __init__(self):  
        self.x = 2  
        self.__y = 3  # Defines __y as private by using leading double underscores  
    def print_y(self):  
        print(self.__y)  

Using this definition, create an instance of the class :

>>> m = Mine()

x isn’t a private variable, so it’s directly accessible :

>>> print(m.x)
2

__y is a private variable. Trying to access it directly raises an error :

>>> print(m.__y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Mine' object has no attribute '__y'

The print_y method isn’t private, and because it’s in the Mine class, it can access __y and print it :

>>> m.print_y()
3

Finally, you should note that the mechanism used to provide privacy is to mangle the names of private variables and private methods when the code is compiled to bytecode. Specifically, _classname is prepended to the variable name :

>>> dir(m)
['_Mine__y', 'x', ...]
>>> m._Mine__y
3

The purpose is to avoid any accidental accesses. If someone wanted to, they could access the value. But by performing the mangling in this easily readable form, debugging is made easy.
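To make the earlier point about name clashes concrete, here is a minimal sketch. The classes Base and Derived and the attribute __token are made up for illustration and don't come from the text above. Both classes use the same private name, yet mangling keeps two separate copies on the instance :

```python
class Base:
    def __init__(self):
        self.__token = "base"        # stored on the instance as _Base__token

    def base_token(self):
        return self.__token          # compiled as self._Base__token


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = "derived"     # stored as _Derived__token, no clash

    def derived_token(self):
        return self.__token          # compiled as self._Derived__token


d = Derived()
print(d.base_token())       # base
print(d.derived_token())    # derived
print(sorted(vars(d)))      # ['_Base__token', '_Derived__token']
```

Each method sees only the copy that belongs to the class it was defined in.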

Using @property for more flexible instance variables :
Python allows you as the programmer to access instance variables directly, without the extra machinery of getter and setter methods often used in Java and other OO languages. This lack of getters and setters makes writing Python classes cleaner and easier; but in some situations, using getter and setter methods can be handy. Suppose you want to check a value before you put it into an instance variable, or it would be handy to figure out an attribute’s value on the fly. In both cases, getter and setter methods would do the job but at the cost of losing Python’s easy instance variable access.

The answer is to use a property. A property combines the ability to route access to an instance variable through getter and setter methods with the straightforward dot-notation access of a plain instance variable :

class Temperature:  
    def __init__(self):  
        self._temp_fahr = 0  
  
    @property  
    def temp(self):  
        return (self._temp_fahr - 32) * 5 / 9  

Without a setter, such a property is read-only. To change the property, you need to add a setter inside the same class body :

@temp.setter  
def temp(self, new_temp):  
    self._temp_fahr = new_temp * 9 / 5 + 32  

Now, you can use standard dot notation to both get and set the property temp. Notice that the name of the method remains the same, but the decorator changes to the property name (temp in this case) plus .setter to indicate that a setter for the temp property is being defined :

>>> t = Temperature()
>>> t._temp_fahr
0
>>> t.temp
-17.77777777777778 # (1)
>>> t.temp = 34
>>> t._temp_fahr
93.2 # (2)
>>> t.temp
34.0

The 0 in _temp_fahr is converted to centigrade before it’s returned (1). The 34 is converted back to Fahrenheit by the setter (2).

One big advantage of Python’s ability to add properties is that you can do initial development with plain-old instance variables and then seamlessly change to properties whenever and wherever you need to without changing any client code—the access is still the same, using dot notation.
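The other use mentioned earlier, checking a value before it is stored, might look like the following sketch. The Account class, its balance attribute, and the non-negative rule are assumptions made up for illustration :

```python
class Account:
    def __init__(self, balance=0):
        self.balance = balance       # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:                # validate before storing
            raise ValueError("balance can't be negative")
        self._balance = value


acct = Account(10)
acct.balance = 25                    # plain dot notation, still validated
try:
    acct.balance = -5
except ValueError as err:
    print("rejected:", err)
print(acct.balance)                  # 25
```

Client code keeps writing acct.balance, so a plain attribute could be swapped for this property later without touching the callers.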

Scoping rules and namespaces for class instances :
Now you have all the pieces to put together a picture of the scoping rules and namespaces for a class instance. When you’re in a method of a class, you have direct access to the local namespace (parameters and variables declared in the method), the global namespace (functions and variables declared at the module level), and the built-in namespace (built-in functions and built-in exceptions). These three namespaces are searched in the following order: local, global, and built in (see figure 15.1) :
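The local, global, built-in search order can be sketched in a few lines. The variable and function names here are hypothetical, chosen only to show which namespace answers each lookup :

```python
name = "global name"          # lives in the module (global) namespace


def show_local():
    name = "local name"       # shadows the global inside this function
    return name


def show_global():
    return name               # no local 'name', so the global is found


def show_builtin():
    return len("abc")         # 'len' is neither local nor global: built-in


print(show_local())           # local name
print(show_global())          # global name
print(show_builtin())         # 3
```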

You also have access through the self variable to our instance’s namespace (instance variables, private instance variables, and superclass instance variables), its class’s namespace (methods, class variables, private methods, and private class variables), and its superclass’s namespace (superclass methods and superclass class variables). These three namespaces are searched in the order instance, class, and then superclass (see figure 15.2) :
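The instance, class, superclass search order can be watched step by step with a small sketch. Parent, Child, and color are hypothetical names used only for this illustration :

```python
class Parent:
    color = "superclass class variable"


class Child(Parent):
    pass


c = Child()
found = [c.color]                    # only Parent defines color

Child.color = "class variable"
found.append(c.color)                # Child now shadows Parent

c.color = "instance variable"        # assigning through the instance
found.append(c.color)                # creates an instance variable

del c.color                          # remove the instance variable
found.append(c.color)                # lookup falls back to Child

for value in found:
    print(value)
```

The third step also shows why assignment to a class variable should go through the class: assigning through the instance creates a new instance variable that hides the class variable instead of changing it.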

Private superclass instance variables, private superclass methods, and private superclass class variables can’t be accessed using self. A class is able to hide these names from its children. The module in the listing below puts these two sets of rules together in one place to concretely demonstrate what can be accessed from within a method :

- Exam08.py :

"""cs module: class scope demonstration module."""

mv = "module variable: mv"

def mf():
    return "module function (can be used like a class method in " \
           "other languages): mf()"

class SC:
    scv = "superclass class variable: self.scv"
    __pscv = "private superclass class variable: no access"

    def __init__(self):
        self.siv = "superclass instance variable: self.siv " \
                   "(but use SC.siv for assignment)"
        self.__psiv = "private superclass instance variable: " \
                      "no access"

    def sm(self):
        return "superclass method: self.sm()"

    def __spm(self):
        return "superclass private method: no access"

class C(SC):
    cv = "class variable: self.cv (but use C.cv for assignment)"
    __pcv = "class private variable: self.__pcv (but use C.__pcv " \
            "for assignment)"

    def __init__(self):
        SC.__init__(self)
        self.__piv = "private instance variable: self.__piv"

    def m2(self):
        return "method: self.m2()"

    def __pm(self):
        return "private method: self.__pm()"

    def m(self, p="parameter: p"):
        lv = "local variable: lv"
        self.iv = "instance variable: self.xi"
        print("Access local, global and built-in "
              "namespaces directly")
        print("local namespace:", list(locals().keys()))
        print(p)  # Parameter
        print(lv)  # Local variable
        print("global namespace:", list(globals().keys()))
        print(mv)  # Module variable
        print(mf())  # Module function
        print("Access instance, class, and superclass namespaces "
              "through 'self'")
        print("Instance namespace:", dir(self))
        print(self.iv)  # Instance variable
        print(self.__piv)  # Private instance variable
        print(self.siv)  # Superclass instance variable
        print("Class namespace:", dir(C))
        print(self.cv)  # Class variable
        print(self.m2())  # Method
        print(self.__pcv)  # Private class variable
        print(self.__pm())  # Private method
        print("Superclass namespace:", dir(SC))
        print(self.sm())  # Superclass method
        print(self.scv)  # Superclass class variable through instance

This output is considerable, so we’ll look at it in pieces. In the first part, the local namespace of class C’s method m contains the parameters self (which refers to our instance) and p along with the local variable lv (all of which can be accessed directly) :

>>> from Exam08 import *
>>> c = C()
>>> c.m()
Access local, global and built-in namespaces directly
local namespace: ['lv', 'p', 'self']
parameter: p
local variable: lv

Next, method m’s global namespace contains the module variable mv and the module function mf (which, as described in a previous section, we can use to provide class method functionality). There are also the classes defined in the module (the class C and the superclass SC). These can all be directly accessed :

global namespace: ['C', 'mf', '__builtins__', '__file__', '__package__', 'SC', 'mv', '__cached__', '__name__', '__doc__']
module variable: mv
module function (can be used like a class method in other languages): mf()

Instance c’s namespace contains instance variable iv and our superclass’s instance variable siv (which, as described in a previous section, is no different from our regular instance variable). It also has the mangled name of private instance variable __piv (which we can access through self) and the mangled name of our superclass’s private instance variable __psiv (which we can’t access) :

Access instance, class, and superclass namespaces through 'self'
Instance namespace: ['_C__pcv', '_C__piv', '_C__pm', '_SC__pscv', '_SC__psiv', '_SC__spm', '__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'cv', 'iv', 'm', 'm2', 'scv', 'siv', 'sm']
instance variable: self.xi
private instance variable: self.__piv
superclass instance variable: self.siv (but use SC.siv for assignment)

Class C’s namespace contains the class variable cv and the mangled name of the private class variable __pcv: both can be accessed through self, but to assign them we need to use class C. It also has the class’s two methods m and m2, along with the mangled name of the private method __pm (which can be accessed through self) :

Class namespace: ['_C__pcv', '_C__pm', '_SC__pscv', '_SC__spm', '__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'cv', 'm', 'm2', 'scv', 'sm']
class variable: self.cv (but use C.cv for assignment)
method: self.m2()
class private variable: self.__pcv (but use C.__pcv for assignment)
private method: self.__pm()

Finally, superclass SC’s namespace contains superclass class variable scv (which can be accessed through self, but to assign to it we need to use the superclass SC) and superclass method sm. It also contains the mangled names of private superclass method __spm and private superclass class variable __pscv, neither of which can be accessed through self :

Superclass namespace: ['_SC__pscv', '_SC__spm', '__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__' , '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'scv', 'sm']
superclass method: self.sm()
superclass class variable: self.scv

This is a rather full example to decipher at first. You can use it as a reference or a base for your own exploration. As with most other concepts in Python, you can build a solid understanding of what is going on by playing around with a few simplified examples.
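One such simplified example is sketched below. It reuses the names SC, C, and __psiv from the listing, but the read_private method is a hypothetical addition. It shows why a private superclass name can’t be reached through self in the subclass: inside C the name is mangled with C’s class name, not SC’s :

```python
class SC:
    def __init__(self):
        self.__psiv = "private superclass instance variable"


class C(SC):
    def read_private(self):
        try:
            return self.__psiv       # compiled as self._C__psiv
        except AttributeError as err:
            return "AttributeError: " + str(err)


c = C()
print(c.read_private())              # reports the missing _C__psiv
print(c._SC__psiv)                   # the fully mangled name still works
```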

Destructors and memory management :
You’ve already seen class constructors (the __init__ methods). A destructor can be defined for a class as well. But unlike in C++, creating and calling a destructor isn’t necessary to ensure that the memory used by your instance is freed. Python provides automatic memory management through a reference-counting mechanism. That is, it keeps track of the number of references to your instance; when this reaches zero, the memory used by your instance is reclaimed, and any Python objects referenced by your instance have their reference counts decremented by one. For the vast majority of your classes, you won’t need to define a destructor.

C++ destructors are sometimes also used to perform cleanup tasks such as releasing or resetting system resources unrelated to memory management. To perform these tasks in Python, the best way to go is either a context manager used with the with keyword or an explicit cleanup or close method defined on your class.
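As a rough sketch of the context manager route, a class can implement __enter__ and __exit__ so that the with statement calls its close method on the way out. ManagedFile and the temporary file path are assumptions made up for this illustration :

```python
import os
import tempfile


class ManagedFile:
    def __init__(self, file_name):
        self._file = open(file_name, "w")

    def write(self, text):
        self._file.write(text)

    def close(self):
        if self._file:               # safe to call more than once
            self._file.close()
            self._file = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()                 # runs even if the body raised
        return False                 # don't suppress the exception


path = os.path.join(tempfile.mkdtemp(), "managed.txt")
with ManagedFile(path) as f:
    f.write("hello\n")

with open(path) as check:
    print(repr(check.read()))        # 'hello\n'
```

Cleanup here doesn’t depend on when, or whether, the object is reclaimed.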

If you need to, you can also define destructor methods for your classes. Python will implicitly call a class’s destructor method __del__ just before an instance is removed upon its reference count reaching zero. You can use this as a backup to ensure that your cleanup method is called. The following simple class illustrates this :

- Exam09.py :

class SpecialFile:
    def __init__(self, file_name):
        self.__file = open(file_name, 'w')
        self.__file.write('*** Start Special File ***\n\n')

    def write(self, str):
        self.__file.write(str)

    def writelines(self, str_list):
        self.__file.writelines(str_list)

    def __del__(self):
        print("entered __del__")
        self.close()

    def close(self):
        if self.__file:
            self.__file.write('\n\n*** End Special File ***')
            self.__file.close()
            self.__file = None

Notice that close is written so that it can be called more than once without complaint. This is what you’ll generally want to do. Also, the __del__ function has a print expression in it, but this is just for demonstration purposes. Take the following test function :

When the function test exits, f’s reference count goes to zero and __del__ is called. Thus, in the normal case close is called twice, which is why we want close to be able to handle this. If we forgot the f.close() at the end of test, the file would still be closed properly because we’re backed up by the call to the destructor. This also happens if we reassign to the same variable without first closing the file :

>>> f = SpecialFile('testfile1')
>>> f = SpecialFile('testfile2')
entered __del__ # The reference count of the original SpecialFile('testfile1') drops to zero, so its destructor is called

As with the __init__ constructor, the __del__ destructor of a class’s parent class needs to be called explicitly within a class’s own destructor. Be careful when writing a destructor. If it’s called when a program is shutting down, members of its global namespace may already have been deleted. Any exception that occurs during its execution will be ignored, other than a message being sent of the occurrence to sys.stderr. Also, there’s no guarantee that destructors will be called for all still-existing instances when the Python interpreter exits. Check the entries for destructors in the Python Language Manual and the Python FAQ for more details. They will also give you hints as to what may be happening in cases where you think all references to your object should be gone but its destructor hasn’t been called.
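The first point, calling the parent’s destructor explicitly, can be sketched as follows. Base, Child, and the events list are hypothetical names, and the immediate call to __del__ after del relies on CPython’s reference counting :

```python
events = []                          # records the order of destructor calls


class Base:
    def __del__(self):
        events.append("Base.__del__")


class Child(Base):
    def __del__(self):
        events.append("Child.__del__")
        Base.__del__(self)           # not called unless we do it here


c = Child()
del c                                # last reference gone
print(events)                        # ['Child.__del__', 'Base.__del__']
```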

Partly because of these issues, some people avoid using Python’s destructors other than possibly to flag an error when they’ve missed putting in an explicit call. They prefer that cleanup always be done explicitly. Sometimes they’re worth using, but only when you know the issues.

If you’re familiar with Java, you’re aware that this is what you have to do in that language. Java uses garbage collection, and its finalize methods aren’t called if this mechanism isn’t invoked (which may be never in some programs). Python’s destructor invocation is more deterministic. When the references to an object go away, it’s individually removed. On the other hand, in Python versions before 3.4, structures with cyclical references whose objects define a __del__ method aren’t removed automatically; you have to go in and break the cycle yourself. Since Python 3.4 the cyclic garbage collector can finalize such objects, but only when it eventually runs, not at the moment the last outside reference goes away. This is the main reason why defining your own __del__ method destructors isn’t recommended.

The following example illustrates the effect of a cyclical reference in Python and how you might break it. The purpose of the __del__ method in this example is only to indicate when an object is removed :

- Exam10.py :

class Circle:
    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.child = None
        if parent:
            parent.child = self

    def cleanup(self):
        self.child = self.parent = None

    def __del__(self):
        print("__del__ called on", self.name)

def test1():
    a = Circle("a", None)
    b = Circle("b", a)

def test2():
    c = Circle("c", None)
    d = Circle("d", c)
    d.cleanup()

Then, in an interactive session :

>>> test1() # a and b refer to each other, so neither is removed when test1() exits
>>> test2()
__del__ called on c
__del__ called on d

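Because they still refer to each other, a and b aren’t removed when test1 exits: their reference counts never reach zero. On Python versions before 3.4 this is a memory leak; that is, each time test1 is called, it leaks two more objects. (Later versions reclaim them only when the cyclic garbage collector runs.) The explicit call to the cleanup method, as in test2, is necessary to avoid this.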

The cycle is broken in the cleanup method, not the destructor, and we only had to break it in one place. Python’s reference-counting mechanism took over from there. This approach is not only more reliable, but also more efficient, because it reduces the amount of work that the garbage collector has to do.
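The difference between the two tests can be observed directly with the sketch below. It assumes CPython 3.4 or later, where the cyclic garbage collector is able to reclaim a cycle of objects that define __del__. The removed list and the gc calls are additions for this illustration; Circle records names instead of printing them :

```python
import gc

removed = []                         # names of objects whose __del__ ran


class Circle:
    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.child = None
        if parent:
            parent.child = self

    def cleanup(self):
        self.child = self.parent = None

    def __del__(self):
        removed.append(self.name)


def test1():
    a = Circle("a", None)
    b = Circle("b", a)               # a and b now refer to each other


def test2():
    c = Circle("c", None)
    d = Circle("d", c)
    d.cleanup()                      # breaks the cycle


gc.disable()                         # keep the collector out of the way
test1()
after_test1 = list(removed)          # reference counting alone frees nothing
gc.collect()                         # the cycle collector finds a and b
after_collect = sorted(removed)
test2()
after_test2 = sorted(removed)        # c and d went as soon as test2 exited
gc.enable()

print(after_test1, after_collect, after_test2)
```

With the cycle broken by cleanup, c and d are reclaimed by reference counting alone, without waiting for a collector pass.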

A more robust method of ensuring that our cleanup method is called is to use the try-finally compound statement. It takes the following form :

try:  
    body  
finally:  
    cleanup_body  

It ensures that cleanup_body is executed regardless of how or from where body is exited. We can easily see this by writing and executing another test function for the Circle class defined earlier :

Here, with the addition of three lines of code, we’re able to ensure that our cleanup method is called when our function is left, which in this case can be via an exception, a return statement, or returning after its last statement.

Multiple inheritance :
Compiled languages place severe restrictions on the use of multiple inheritance, the ability of objects to inherit data and behavior from more than one parent class. For example, the rules for using multiple inheritance in C++ are so complex that many people avoid using it. In Java, multiple inheritance is completely disallowed, although Java does have the interface mechanism.

Python places no such restrictions on multiple inheritance. A class can inherit from any number of parent classes, in the same way it can inherit from a single parent class. In the simplest case, none of the involved classes, including those inherited indirectly through a parent class, contains instance variables or methods of the same name. In such a case, the inheriting class behaves like a synthesis of its own definitions and all of its ancestors’ definitions. For example, if class A inherits from classes B, C, and D, class B inherits from classes E and F, and class D inherits from class G (see figure 15.3), and none of these classes share method names, then an instance of class A can be used as if it were an instance of any of the classes B–G, as well as A; an instance of class B can be used as if it were an instance of class E or F, as well as class B; and an instance of class D can be used as if it were an instance of class G, as well as class D. In terms of code, the class definitions look like this :

class E:
    ...
class F:
    ...
class G:
    ...
class D(G):
    ...
class C:
    ...
class B(E, F):
    ...
class A(B, C, D):
    ...

The situation is more complex when some of the classes share method names, because Python must then decide which of the identical names is the correct one. For example, assume we wish to resolve a method invocation a.f(), on an instance a of class A, where f isn’t defined in A but is defined in all of F, C, and G. Which of the various methods will be invoked?

The answer lies in the order in which Python searches base classes when looking for a method not defined in the original class on which the method was invoked. In the simplest cases, Python looks through the base classes of the original class in left-to-right order but always looks through all of the ancestor classes of a base class before looking in the next base class. In attempting to execute a.f(), the search goes something like this :

1. Python first looks in the class of the invoking object, class A.
2. Because A doesn’t define a method f, Python starts looking in the base classes of A. The first base class of A is B, so Python starts looking in B.
3. Because B doesn’t define a method f, Python continues its search of B by looking in the base classes of B. The first base class of B is class E, so Python looks there next.
4. E doesn’t define a method f and also has no base classes, so there is no more searching to be done in E. Python goes back to class B and looks in the next base class of B, class F.

Class F does contain a method f, and because it was the first method found with the given name, it’s the method used. The methods called f in classes C and G are ignored. Of course, using internal logic like this isn’t likely to lead to the most readable or maintainable of programs. And with more complex hierarchies, other factors come into play to make sure that no class is searched twice and to support cooperative calls to super.
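The search described above can be checked with a runnable sketch of the same hierarchy. The bodies of f, which simply return the defining class’s name, are made up for illustration; the class’s __mro__ attribute shows the full search order Python uses :

```python
class E:
    pass


class F:
    def f(self):
        return "F.f"


class G:
    def f(self):
        return "G.f"


class D(G):
    pass


class C:
    def f(self):
        return "C.f"


class B(E, F):
    pass


class A(B, C, D):
    pass


a = A()
print(a.f())                                  # F.f
print([cls.__name__ for cls in A.__mro__])
# ['A', 'B', 'E', 'F', 'C', 'D', 'G', 'object']
```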

But this is probably a more complex hierarchy than you’d expect to see in practice. If you stick to the more standard uses of multiple inheritance, as in the creation of mixin or addin classes, you can easily keep things readable and avoid name clashes.
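As a rough idea of what a mixin looks like, here is a small sketch. ReprMixin, Point, and PrintablePoint are hypothetical names. The mixin adds one method, keeps no state of its own, and shares no names with the class it is combined with :

```python
class ReprMixin:
    """Adds a generic __repr__; defines no state of its own."""

    def __repr__(self):
        args = ", ".join(f"{k}={v!r}" for k, v in sorted(vars(self).items()))
        return f"{type(self).__name__}({args})"


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PrintablePoint(ReprMixin, Point):
    pass


p = PrintablePoint(1, 2)
print(repr(p))                       # PrintablePoint(x=1, y=2)
```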

Some people have a strong conviction that multiple inheritance is a bad thing. It can certainly be misused, and nothing in Python forces you to use it. After being involved with a number of object-oriented project developments in industry since starting with one of the first versions of C++ in 1987, I’ve concluded that one of the biggest dangers seems to be creating inheritance hierarchies that are too deep. Multiple inheritance can at times be used to help keep this from happening. That issue is beyond the scope of this book. The example we use here only illustrates how multiple inheritance works in Python and doesn’t attempt to explain the use cases—for example, as in mixin or addin classes—for it.

Supplement :
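* [Python Study Notes] Advanced topics : attribute control (the property() function)
* [Python Study Notes] Functions, classes, and modules : classes (construction, initialization, and destruction)
* [Python Study Notes] Functions, classes, and modules : classes (inheritance)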

Sunday, February 26, 2012

[Quick Python] 15. Classes and object-oriented programming - Part 2
