程式扎記: [Quick Python] 21. Testing your code made easy(-er)

標籤

2012年4月1日 星期日

[Quick Python] 21. Testing your code made easy(-er)

Preface : 
The problem with writing code is that you’re never sure you’ve got it right. Every time you turn around, bugs crop up, and what’s worse, fixing those bugs is likely to create more bugs. Fortunately, Python encourages readable code, which helps in debugging; but we still need all the help we can get in maintaining our code. 

This chapter covers 
■ Testing your code
■ Debugging with the assert statement
■ Using Python’s debug variable
■ Testing using docstrings
■ Creating and using unit tests

21.1 Why you need to have tests 
Almost all code needs maintenance. Sometimes it requires minor bug fixes, but other times it needs major changes or additions, or even a complete redesign. The more you change your code, the more likely you are to inadvertently introduce new problems or mistakes. What you need is a way to make sure you don’t create new bugs when you fix the old ones and that everything still works when you redesign and refactor and improve your code. You need ways to verify that what used to work still works. You need tests. 

21.2 The assert statement 
The quickest way to put a test into your code is with the assert statement. An assert statement is a simple way of putting a conditional in your code that will raise an exception if an expression isn’t true. That makes it an ideal watchdog for situations where a particular precondition must always be true for the code to function correctly. For example, see the file below : 
- assert_test.py :
  1. def example(param):  
  2.     """param must be greater than 0!"""  
  3.     assert param > 0  # Check value of param  
  4.     # do stuff here  

Then use it : 
>>> import assert_test
>>> assert_test.example(0)
Traceback (most recent call last):
File "", line 1, in
File "assert_test.py", line 3, in example
assert param > 0 # Check value of param
AssertionError

- Python’s __debug__ variable 
Although assert statements are a bit more streamlined, by themselves they’re conditionals that raise exceptions when their expressions are false. It’s a fair concern that using assert statements generously will leave code littered with extra conditional statements that will impact its performance. The assert statement relies on a built-in variable in Python, __debug__, which is True by default. That means the assert statement we used previously in assert_test.py is equivalent to : 
  1. if __debug__:  
  2.     if not param > 0:  
  3.         raise AssertionError  
But if the __debug__ variable is False, no code will be generated at all for assert statements. The catch is that __debug__ can’t be directly assigned : 
>>> __debug__
True
>>> __debug__ = False
File "", line 1
SyntaxError: assignment to keyword

To turn off the __debug__ variable, either you need to have the PYTHONOPTIMIZE environment variable set, or you need to run Python with the –O option. When the__debug__ variable has been turned off with the –O parameter, the previous test using assert_test.py no longer gives an error. By using assert (and the __debug__variable), you can have several checks in place as you develop and test your code. And then, by giving Python a single option, you can have all of that testing code disappear for production runs but still be available the next time you need to debug. 

21.3 Tests in docstrings: doctests 
Using assert statements is simple and relatively easy but also rather limited. Although assert statements can check specific spots in your code, they give no support for creating more complete suites of tests that can be run to test entire modules. Python has an easy way to test your code that uses the docstrings you should already be including in your interactive sessions, testing some code that’s cut and pasted into the docstring for that code. 

For example, let’s return to the TypedList class we created in the last chapter. As you may recall, the idea was to create a list-like class that allowed only a single type of item. Because we had to add __getitem__ and __setitem__ special methods, it would be good to test them to make sure that they work as expected when we use [] to access items. Using a Python shell, we can do something like this : 
>>> from typedlist import TypedList
>>> a_typed_list = TypedList(1, [1, 2, 3])
>>> a_typed_list[1] == 2
True
>>> a_typed_list[1] = 3
>>> a_typed_list[1]
3

This interactive session verifies that with an index of 1, both __getitem__ and __setitem__ work correctly to access the second item of a TypedList. To make this session into a doctest, we can copy and paste it into the docstring of the typedlist module and add code to run the test if the module is executed directly as a script. See below : 
- typedlist_doctest.py :
  1. """ a list that only allows items of a single type  
  2. any text (like this) that isn't in shell format is ignored by doctest  
  3. >>> from typedlist_doctest import TypedList  
  4. >>> a_typed_list = TypedList(1, [123])  
  5. >>> a_typed_list[1] == 3  
  6. True  
  7. >>> a_typed_list[1] = 3  
  8. >>> a_typed_list[1]  
  9. 3  
  10. >>>  
  11. """  
  12. class TypedList:  
  13.     def __init__(self, example_element, initial_list=[]):  
  14.         self.type = type(example_element)  
  15.         if not isinstance(initial_list, list):  
  16.             raise TypeError("Second argument of TypedList must "  
  17.                             "be a list.")  
  18.         for element in initial_list:  
  19.             self.__check(element)  
  20.             self.elements = initial_list[:]  
  21.     def __check(self, element):  
  22.         if type(element) != self.type:  
  23.             raise TypeError("Attempted to add an element of "  
  24.                             "incorrect type to a typed list.")  
  25.     def __setitem__(self, i, element):  
  26.         self.__check(element)  
  27.         self.elements[i] = element  
  28.     def __getitem__(self, i):  
  29.         return self.elements[i]  
  30.   
  31. if __name__ == "__main__":  
  32.     import doctest  
  33.     doctest.testmod()  

If we run typedlist_doctest.py from a command prompt, by default it prints nothing if all the tests pass. But if a test fails, it’s reported in detail. For example, let’s change the first access of a_typed_list[1] to expect a 3 instead of a 2 : 
>>> a_typed_list[1] == 3
True

Now the test will fail, and doctest will report that clearly : 
 

If you need a full report of all tests, both failing and passing, you can make doctest give a verbose report by adding the –v switch after the filename on the command line. 

- Avoiding doctest traps 
Because doctests expect a character-by-character match for a successful test, you’ll sometimes find tests failing unexpectedly. In particular, dictionaries aren’t guaranteed to print in a particular order. If you had a test like this : 
>>> my_dict = {'one': 1, 'two': 2}
>>> my_dict
{'one': 1, 'two': 2}

the two items could conceivably print in either order, causing the test to fail unpredictably. In cases like this, a direct comparison is more reliable : 
>>> my_dict == {'one': 1, 'two': 2}
True
>>> my_dict == {'two': 2, 'one': 1}
True

Similarly, printing object addresses will also cause failures, because it’s unlikely that an object will have the same address for two different runs of the test. It’s also important to note that if you want blank lines to be considered part of the output, you need to indicate them with a line containing just , because a blank line is normally a signal to doctest of the end of output. 

Finally, if you use the \ character, either to escape a character or to continue a line in a doctest, you need to make sure that the docstring is a raw string. Prepending anr to the docstring will prevent the \ from being interpreted as part of the string. 

- Tweaking doctests with directives 
Several directives can also tweak the way lines are handled. The most commonly used of these directives are NORMALIZE_WHITESPACE and ELLIPSIS. The former treats all sequences of whitespace as equal, so that differently spaced sequences of items, or even sequences with line breaks, still pass the test. Similarly, ELLIPSISsignals that a sequence of ... will match any substring in the output. Using ELLIPSIS can alleviate problems like those mentioned earlier in printing object addresses, if you need to include data that changes from run to run in your doctest. You employ directives by adding them to a # doctest: comment following the test, with a + to activate and a  to deactivate them. Directives apply only to a single example, and you can combine multiple directives, either on the same line or on multiple lines : 
>>> print([1, 2, 3, 4]) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE

This line matches all of the following : 
 

Additional directives control the style of output, the treatment of exception tracebacks, and so on, and it’s also possible to add your own directives and subclass doctest internals to change the response to them. 

- Pros and cons of doctests 
Doctests have two big advantages going for them: simplicity and ease of use. These virtues make doctests a good thing to use in your code, because simple, easy tests are more likely to be created, used, and maintained. In addition, having the test readily visible and accessible as part of the code is a help. If you can see it and edit it easily, it’s more likely that you’ll keep it current and use it often. I might add that doctests are a “Pythonic” method of testing. 

On the other hand, doctests aren’t for everyone or for every situation. If you need to do many tests, docstrings will add considerable bulk to your code files unless you move them to separate files, in which case you’ve lost one of the advantages of using doctests. In addition, some aspects of your code may be more awkward to test in the interpreter. There’s some debate about how far you can take doctests as a testing mechanism. The Zope 3 project, for example, uses doctests extensively, whereas most projects of similar size use unit tests. It finally comes down to what works best for you and your project. 

21.4 Using unit tests to test everything, every time 
In addition to doctests, Python also has a full-blown unit test library as part of the standard library. The unittest module (originally named PyUnit) is modeled somewhat on the JUnit library widely used in Java. You can use unittest to create a comprehensive suite of tests for any project. Python itself uses unit tests to test its more than 110 modules, including unittest. It’s not the intent of this chapter to tell you why you should use unittest or exactly what you should test but to give you a quick example of how to create and run a unittest suite. 

The two basic classes you use to create unit tests are TestCase and TestSuite. The former contains the individual tests, and the latter is used to aggregate tests that should be run together. 

- Setting up and running a single test case 
You create a test, or group of tests, by subclassing TestCase and adding each test as a method. To make the tests, you can either use assert or use one of the many variations on assert that are part of the TestCase class. In addition to adding the tests themselves as methods to the subclass, you can override both the setUp() andtearDown() methods to handle creating and disposing any objects or conditions needed for the test. 

It will be easier to see how test creation works by following a simple example. In the last chapter, we created a TypedList class that ensures that all of its items are of the same type. Let’s create a simple test case to make sure the __getitem__ method returns the correct value : 
- testtypedlist.py :
  1. import unittest  
  2. from typedlist import TypedList  
  3.   
  4. class TestTypedList(unittest.TestCase):  
  5.     def setUp(self):  
  6.         self.a_typedlist = TypedList(1, [123])  
  7.     def testGetItem(self):  
  8.         self.assertEqual(self.a_typedlist[1], 2)  
  9.     def testSetItem(self):  
  10.         self.a_typedlist[1] = 3  
  11.         self.assertEqual(self.a_typedlist[1], 3)  
  12. if __name__ == '__main__':  
  13.     unittest.main()  

In these tests, we use the assertEqual() method of TestCase. As the name implies, assertEqual tests for the equality of two values. There are number of test asserts inTestCase, but you don’t absolutely have to use them. You can, for example, also use the regular assert statement, if you want. The problem is that if the __debug__variable is set to false, the assertion won’t be tested, and your testing won’t occur. Therefore, it’s probably wisest always to use the methods in TestCase. Table 21.1 lists the main forms of those methods : 
 

This is a summary of the most common methods. There are variations with the opposite names: failUnlessfailUnlessEqual, and so on; see the standard documentation for a complete list. 

- Running the test 
To run the test from the command line, we can add a call to the main method of the test case : 
  1. if __name__ == '__main__':  
  2.     unittest.main()  
The output from running this is something like the following : 
 

On the other hand, if a test fails, we get a fuller report. Let’s assume that we changed the value of the __getitem__ test so that it would fail : 
 
(Each failure is reported clearly, and the total number of failures is also reported.

- Running multiple tests 
It’s also fairly easy to aggregate various test cases into a unified test suite that can be run with a single command. This is done most easily by using the TestSuite,TestLoader, and TextTestRunner classes. Although the TestSuite class can be subclassed and customized, and test cases can be added to a test suite instance manually, for most applications it’s easier to use the module’s default instances of TestLoader and TextTestRunner to create and run a test suite as follows : 
 

The previous code uses the module’s instance of defaultTestLoader to add all tests of the type TestTypedList to the test suite called suite. Then, the module’s defaultTextTestRunner instance runs the suite of tests with a verbosity level of 2. 

There are several variations on how test cases can be detected and loaded. For example, to load all tests from the module testtypedlist.py, you can useloadTestsFromName('testtypedlist'), which loads all tests in classes derived from TestCase in that module. 

- Unit tests vs. doctests 
Unit tests have a different approach than doctests. Although doctests are by nature intended (if not required) to be interleaved with the code they test, unit tests are meant to be separate from the tested code. This makes unit tests a bit less transparent and convenient in smaller projects, but they also have benefits. Having the tests separate from the code means you can develop an extensive suite of tests without increasing the size of the code files and burying the working code under a mass of test code. It also means that the tests don’t need to be distributed with the code, although in many projects they are.

沒有留言:

張貼留言

網誌存檔

關於我自己

我的相片
Where there is a will, there is a way!