Modules make reusing small chunks of code easy. The problem comes when the project grows and the code you want to reload outgrows, either physically or logically, what would fit into a single file. If having one giant module file is an unsatisfactory solution, having a host of little unconnected modules isn’t much better. The answer to this problem is to combine related modules into a package. This chapter covers :
What is a package?
A module is a file containing code. A module defines a group of usually related Python functions or other objects. The name of the module is derived from the name of the file. When you understand modules, packages are easy, because a package is a directory containing code and possibly further subdirectories. A package contains a group of usually related code files (modules). The name of the package is derived from the name of the main package directory.
Packages are a natural extension of the module concept and are designed to handle very large projects. Just as modules group related functions, classes, and variables, packages group related modules.
A first example :
To see how this might work in practice, let’s sketch a design layout for a type of project that by nature is very large—a generalized mathematics package, along the lines of Mathematica, Maple, or MATLAB. Maple, for example, consists of thousands of files, and some sort of hierarchical structure is vital to keeping such a project ordered. We’ll call our project as a whole mathproj.
We can organize such a project in many ways, but a reasonable design splits the project into two parts: ui, consisting of the user interface elements, and comp, the computational elements. Within comp, it may make sense to further segment the computational aspect into symbolic (real and complex symbolic computation, such as high school algebra) and numeric (real and complex numerical computation, such as numerical integration). It may then make sense to have a constants.py file in both the symbolic and numeric parts of the project.
The constants.py file in the numeric part of the project defines pi as :
whereas the constants.py file in the symbolic part of the project defines pi as :
- class PiClass:
- def __str__(self):
- return "PI"
- pi = PiClass()
The symbolic constants.py file defines pi as an abstract Python object, the sole instance of the PiClass class. As the system is developed, various operations can be implemented in this class, which return symbolic rather than numeric results.
There is a natural mapping from this design structure to a directory structure. The top-level directory of the project, called mathproj, contains subdirectories ui and comp; comp in turn contains subdirectories symbolic and numeric; and each of symbolic and numeric contains its own constants.py file.
Given this directory structure, and assuming that the root mathproj directory is installed somewhere in the Python search path, Python code both inside and outside themathproj package can access the two variants of pi as mathproj.symbolic.constants.pi and mathproj.numeric.constants.pi. In other words, the Python name for an item in the package is a reflection of the directory pathname to the file containing that item.
That’s what packages are all about. They’re ways of organizing very large collections of Python code into coherent wholes, by allowing the code to be split among different files and directories and imposing a module/submodule naming scheme based on the directory structure of the package files. Unfortunately, all isn’t this simple in practice because details intrude to make their use more complex than their theory. The practical aspects of packages are the basis for the remainder of this chapter.
A concrete example :
The rest of this chapter will use a running example to illustrate the inner workings of the package mechanism (see figure 18.2). Filenames and paths are shown in plain text, to avoid confusion as to whether we’re talking about a file/directory or the module/package defined by that file/directory. The files we’ll be using in our example package are shown in listings 18.1 through 18.6 :
For the purposes of the examples in this chapter, we’ll assume that you’ve created these files in a mathproj directory that’s on the Python search path. (It’s sufficient to ensure that the current working directory for Python is the directory containing mathproj when executing these examples.)
- Basic use of the mathproj package
Before getting into the details of packages, let’s look at accessing items contained in the mathproj package. Start up a new Python shell, and do the following :
If all goes well, you should get another input prompt and no error messages. As well, the message "Hello from mathproj init" should be printed to the screen, by code in the mathproj/__init__.py file. We’ll talk more about __init__.py files in a bit; for now, all you need to know is that they’re run automatically whenever a package is first loaded.
The mathproj/__init__.py file assigns 1.03 to the variable version. version is in the scope of the mathproj package namespace, and after it’s created, you can see it viamathproj, even from outside the mathproj/__init__.py file :
In use, packages can look a lot like modules; they can provide access to objects defined within them via attributes. This isn’t surprising, because packages are a generalization of modules.
- Loading subpackages and submodules
Now, let’s start looking at how the various files defined in the mathproj package interact with one another. We’ll do this by invoking the function g defined in the filemathproj/comp/numeric/n1.py. The first obvious question is, has this module been loaded? We have already loaded mathproj, but what about its subpackage? Let’s see if it’s known to Python :
In other words, loading the top-level module of a package isn’t enough to load all the submodules. This is in keeping with Python’s philosophy that it shouldn’t do things behind your back. Clarity is more important than conciseness. This is simple enough to overcome. We import the module of interest and then execute the function g in that module :
Notice, however, that the lines beginning with Hello are printed out as a side effect of loading mathproj.comp.numeric.n1. These two lines are printed out by printstatements in the __init__.py files in mathproj/comp and mathproj/comp/numeric. In other words, before Python can import mathproj.comp.numeric.n1, it first has to import mathproj.comp and then mathproj.comp.numeric. Whenever a package is first imported, its associated __init__.py file is executed, resulting in the Hello lines. To confirm that both mathproj.comp and mathproj.comp.numeric are imported as part of the process of importing mathproj.comp.numeric.n1, we can check to see that mathproj.comp and mathproj.comp.numeric are now known to the Python session :
- import statements within packages
Files within a package don’t automatically have access to objects defined in other files in the same package. As in outside modules, you must use import statements to explicitly access objects from other package files. To see how this works in practice, look back at the n1 subpackage. The code contained in n1.py is :
- from mathproj import version
- from mathproj.comp import c1
- from mathproj.comp.numeric.n2 import h
- def g():
- print "version is", version
- print h()
as the third line in n1.py, and it works fine.
You can add more dots to move up more levels in the package hierarchy, and you can add module names. Instead of :
we could also have written the imports of n1.py as :
Relative imports can be handy and quick to type, but be aware that they’re relative to the module’s __name__ property. Therefore any module being executed as the main module, and thus having an __name__ of __main__, can’t use relative imports.
- __init__.py files in packages
You’ll have noticed that all the directories in our package—mathproj, mathproj/ comp, and mathproj/numeric—contain a file called __init__.py. An __init__.py file serves two purposes :
The __all__ attribute :
If you look back at the various __init__.py files defined in mathproj, you’ll notice that some of them define an attribute called __all__. This has to do with execution of statements of the form from ... import *, and it requires explanation.
Generally speaking, we would hope that if outside code executed the statement from mathproj import *, it would import all nonprivate names from mathproj. In practice, life is more difficult. The primary problem is that some operating systems have an ambiguous definition of case when it comes to filenames. Microsoft Windows 95/98 is particularly bad in this regard, but it isn’t the only villain. Because objects in packages can be defined by files or directories, this leads to ambiguity as to exactly under what name a subpackage might be imported. If we say from mathproj import *, will comp be imported as comp, Comp, or COMP? If we were to rely only on the name as reported by the operating system, the results might be unpredictable.
There’s no good solution to this. It’s an inherent problem caused by poor OS design. As the best possible fix, the __all__ attribute was introduced. If present in an __init__.py file, __all__ should give a list of strings, defining those names that are to be imported when a from ... import * is executed on that particular package. If__all__ isn’t present, then from ... import * on the given package does nothing. Because case in a text file is always meaningful, the names under which objects are imported isn’t ambiguous, and if the OS thinks that comp is the same as COMP, that’s its problem.
To see this in action, fire up Python again, and try the following :
The __all__ attribute in mathproj/__init__.py contains a single entry, comp, and the import statement imports only comp. It’s easy enough to check that comp is now known to the Python session :
But note that there’s no recursive importing of names with a from ... import * statement. The __all__ attribute for the comp package contains c1, but c1 isn’t magically loaded by our from mathproj import * statement :
To insert names from mathproj.comp we must, again, do an explicit import :
Proper use of packages :
Most of your packages shouldn’t be as structurally complex as these examples imply. The package mechanism allows wide latitude in the complexity and nesting of your package design. It’s obvious that very complex packages can be built, but it isn’t obvious that they should be built.
Here are a couple of suggestions that are appropriate in most circumstances :
Supplement :
* [Python 學習筆記] 函式、類別與模組 : 模組 (匯入套件)
沒有留言:
張貼留言