程式扎記: [Quick Python] 5. Lists, tuples, and sets

Preface :
In this chapter, we’ll discuss the two major Python sequence types: lists and tuples. At first, lists may remind you of arrays in many other languages, but don’t be fooled: lists are a good deal more flexible and powerful than plain arrays. This chapter also discusses a newer Python collection type: sets. Sets are useful when an object’s membership in the collection, as opposed to its position, is important.

Tuples are like lists that can’t be modified—you can think of them as a restricted type of list or as a basic record type. We’ll discuss why we need such a restricted data type later in the chapter. Most of the chapter is devoted to lists, because if you understand lists, you pretty much understand tuples. The last part of the chapter discusses the differences between lists and tuples, in both functional and design terms. So this chapter covers :

* Manipulating lists and list indices
* Modifying lists
* Sorting
* Using common list operations
* Handling nested lists and deep copies
* Using tuples
* Creating and using sets

Lists are like arrays :
A list in Python is much the same thing as an array in Java or C or any other language. It’s an ordered collection of objects. You create a list by enclosing a comma-separated list of elements in square brackets, like so :

# This assigns a three-element list to x
x = [1, 2, 3]

Note that you don’t have to worry about declaring the list or fixing its size ahead of time. This line creates the list as well as assigns it, and a list automatically grows or shrinks in size as needed. If you do need C-style typed arrays, the standard library’s array module provides them.

Unlike lists in many other languages, Python lists can contain different types of elements; a list element can be any Python object. Probably the most basic built-in list function is the len function, which returns the number of elements in a list :
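As a minimal sketch of len on a list with mixed element types (the variable names and sample values are only illustrative), including one nested list:

```python
# A list may hold objects of different types, including another list.
x = [2, "two", [1, 2, 3]]

# len counts top-level elements only; the nested list counts as one element.
length = len(x)
print(length)  # 3
```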

(Note that the len function doesn’t count the items in the inner, nested list.)

List indices :
Elements can be extracted from a Python list using a notation like C’s array indexing. Like C and many other languages, Python starts counting from 0; asking for element 0 returns the first element of the list, asking for element 1 returns the second element, and so forth. Here are a few examples :

>>> x = ["first", "second", "third", "fourth"]
>>> x[0]
'first'
>>> x[2]
'third'

But Python indexing is more flexible than C indexing; if indices are negative numbers, they indicate positions counting from the end of the list, with -1 being the last position in the list, -2 being the second-to-last position, and so forth. Continuing with the same list x, we can do the following :

>>> a = x[-1]
>>> a
'fourth'
>>> x[-2]
'third'

Python can extract or assign to an entire sublist at once, an operation known as slicing. Instead of entering list[index] to extract a single item, enter list[index1:index2] to extract all items from index1 up to (but not including) index2 into a new list. Here are some examples :
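A few slices of the four-element list used earlier, written as a small script rather than an interactive session. The names a, b, and c are only illustrative:

```python
x = ["first", "second", "third", "fourth"]

a = x[1:-1]   # indices 1 and 2: ['second', 'third']
b = x[0:3]    # indices 0, 1, and 2: ['first', 'second', 'third']
c = x[-2:-1]  # index 2 only: ['third']

print(a)
print(b)
print(c)
```

Each slice is a new list, so changing a, b, or c leaves x untouched.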

It may seem reasonable that if the second index indicates a position in the list before the first index, this would return the elements between those indices in reverse order, but this isn’t what happens. Instead, this returns an empty list :

>>> x[-1:2] # -1 refers to index 3 here, so this equals x[3:2]: the first index comes after the second
[]

When slicing a list, it’s also possible to leave out index1 or index2. Leaving out index1 means “go from the beginning of the list,” and leaving out index2 means “go to the end of the list” :

>>> x[:3] # From 0~2
['first', 'second', 'third']
>>> x[2:] # From 2~3
['third', 'fourth']

Omitting both indices makes a new list that goes from the beginning to the end of the original list; that is, it copies the list. This is useful when you wish to make a copy that you can modify, without affecting the original list :
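A minimal sketch of copying with a full slice; the replacement value "1st" is just an example:

```python
x = ["first", "second", "third", "fourth"]

# A full slice builds a new list containing the same elements.
y = x[:]

# Modifying the copy does not affect the original.
y[0] = "1st"

print(y)  # ['1st', 'second', 'third', 'fourth']
print(x)  # ['first', 'second', 'third', 'fourth']
```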

Modifying lists :
You can use list index notation to modify a list as well as to extract an element from it. Put the index on the left side of the assignment operator :

>>> x = [1, 2, 3, 4]
>>> x[1] = "two"
>>> x
[1, 'two', 3, 4]

Slice notation can be used here too. Saying something like lista[index1:index2] = listb causes all elements of lista between index1 and index2 to be replaced with the elements in listb. listb can have more or fewer elements than are removed from lista, in which case the length of lista will be altered. You can use slice assignment to do a number of different things, as shown here :
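The following sketch shows three common uses of slice assignment: appending, prepending, and removing elements. The values are arbitrary, and the names appended and prepended exist only to keep snapshots of the intermediate states:

```python
x = [1, 2, 3, 4]

# Assigning to an empty slice at the end appends elements.
x[len(x):] = [5, 6, 7]
appended = x[:]        # [1, 2, 3, 4, 5, 6, 7]

# Assigning to an empty slice at the front prepends elements.
x[:0] = [-1, 0]
prepended = x[:]       # [-1, 0, 1, 2, 3, 4, 5, 6, 7]

# Assigning an empty list removes the elements covered by the slice.
x[1:-1] = []
print(x)               # [-1, 7]
```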

Appending a single element to a list is such a common operation that there’s a special append method to do it :

>>> x = [1, 2, 3]
>>> x.append("four")
>>> x
[1, 2, 3, 'four']

One problem can occur if you try to append one list to another. The list gets appended as a single element of the main list :
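A short illustration of the pitfall, with arbitrary sample values:

```python
x = [1, 2, 3, 4]
y = [5, 6, 7]

# append adds y itself as one new element, not its contents.
x.append(y)

print(x)       # [1, 2, 3, 4, [5, 6, 7]]
print(len(x))  # 5
```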

The extend method is like the append method, except that it allows you to add one list to another :
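The same two sample lists, this time combined with extend:

```python
x = [1, 2, 3, 4]
y = [5, 6, 7]

# extend adds each element of y to the end of x.
x.extend(y)

print(x)  # [1, 2, 3, 4, 5, 6, 7]
print(y)  # [5, 6, 7] (unchanged)
```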

There is also a special insert method to insert new list elements between two existing elements or at the front of the list. insert is used as a method of lists and takes two additional arguments; the first is the index position in the list where the new element should be inserted, and the second is the new element itself :
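A minimal sketch of insert with nonnegative indices; the inserted strings are placeholders:

```python
x = [1, 2, 3]

# Insert "hello" so that it ends up at index 2, just before the old x[2].
x.insert(2, "hello")
after_first = x[:]     # [1, 2, 'hello', 3]

# Index 0 inserts at the front of the list.
x.insert(0, "start")
print(x)               # ['start', 1, 2, 'hello', 3]
```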

insert understands list indices as discussed in the section on slice notation, but for most uses it’s easiest to think of list.insert(n, elem) as meaning insert elem just before the nth element of list. insert is just a convenience method. Anything that can be done with insert can also be done using slice assignment; that is, list.insert(n, elem) is the same thing as list[n:n] = [elem] when n is nonnegative. Using insert makes for somewhat more readable code, and insert even handles negative indices :

>>> x = [1, 2, 3]
>>> x.insert(-1, "hello") # insert value before last value (-1)
>>> print(x)
[1, 2, 'hello', 3]

The del statement is the preferred method of deleting list items or slices :

>>> x = ['a', 2, 'c', 7, 9, 11]
>>> del x[1]
>>> x
['a', 'c', 7, 9, 11]
>>> del x[:2]
>>> x
[7, 9, 11]

In general, del list[n] does the same thing as list[n:n+1] = [], whereas del list[m:n] does the same thing as list[m:n] = [].
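To check this equivalence, the sketch below applies del to one list and the matching slice assignment to a copy, then compares the results. The list y is introduced only for the comparison:

```python
x = ['a', 2, 'c', 7, 9, 11]
y = x[:]               # independent copy used for comparison

del x[1]               # delete a single item
y[1:2] = []            # equivalent slice assignment (n = 1)
same_after_item = (x == y)

del x[:2]              # delete a slice
y[:2] = []             # equivalent slice assignment
same_after_slice = (x == y)

print(x)               # [7, 9, 11]
```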

The remove method isn’t the converse of insert. Whereas insert inserts an element at a specified location, remove looks for the first instance of a given value in a list and removes that value from the list :
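A small example with a repeated value shows that only the first match is removed on each call:

```python
x = [1, 2, 3, 4, 3, 5]

# Removes the first 3 (at index 2); the later 3 stays.
x.remove(3)
after_first = x[:]     # [1, 2, 4, 3, 5]

# A second call removes the remaining 3.
x.remove(3)
print(x)               # [1, 2, 4, 5]
```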

If remove can’t find anything to remove, it raises an error. You can catch this error using the exception-handling abilities of Python, or you can avoid the problem by using in to check for the presence of something in a list before attempting to remove it.
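Both approaches can be sketched as follows. list.remove raises ValueError when the value is absent; the flag name removed is just for illustration:

```python
x = [1, 2, 4, 5]

# Option 1: attempt the removal and catch the ValueError for a missing value.
try:
    x.remove(3)
    removed = True
except ValueError:
    removed = False

# Option 2: test membership with in before removing.
if 4 in x:
    x.remove(4)

print(removed)  # False
print(x)        # [1, 2, 5]
```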

The reverse method is a more specialized list modification method. It efficiently reverses a list in place :

>>> x = [1, 3, 5, 6, 7]
>>> x.reverse()
>>> x
[7, 6, 5, 3, 1]

Sorting lists :
Lists can be sorted using the built-in Python sort method :

>>> x = [3, 8, 4, 0, 2, 1]
>>> x.sort()
>>> x
[0, 1, 2, 3, 4, 8]

This does an in-place sort—that is, it changes the list being sorted. To sort a list without changing the original list, make a copy of it first :

>>> x = [2, 4, 1, 3]
>>> y = x[:]
>>> y.sort()
>>> y
[1, 2, 3, 4]
>>> x
[2, 4, 1, 3]

The sort method can sort just about anything, because Python can compare just about anything. But there is one caveat in sorting. The default key method used by sort requires that all items in the list be of comparable types. That means that using the sort method on a list containing both numbers and strings will raise an exception :

>>> x = [1, 2, 'hello', 3]
>>> x.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()

On the other hand, we can sort a list of lists :

>>> x = [[3, 5], [2, 9], [2, 3], [4, 1], [3, 2]]
>>> x.sort()
>>> x
[[2, 3], [2, 9], [3, 2], [3, 5], [4, 1]]

According to the built-in Python rules for comparing complex objects, the sublists are sorted first by ascending first element and then by ascending second element. sort is even more flexible than this—it’s possible to use your own key function to determine how elements of a list are sorted.

- Custom sorting
To use custom sorting, you need to be able to define functions, something we haven’t talked about. In this section we’ll also use the fact that len(string) returns the number of characters in a string. String operations are discussed more fully in chapter 6.

By default, sort uses built-in Python comparison functions to determine ordering, which is satisfactory for most purposes. There will be times, though, when you want to sort a list in a way that doesn’t correspond to this default ordering. For example, let’s say we wish to sort a list of words by the number of characters in each word, in contrast to the lexicographic sort that would normally be carried out by Python.

To do this, write a function that returns the value, or key, that we want to sort on, and use it with the sort method. In the context of sort, a key function takes one argument and returns the value that sort should use for ordering. For our number-of-characters ordering, a suitable key function could be :

def compare_num_of_chars(string1):  
    return len(string1)  

This key function is trivial. It passes the length of each string back to the sort method, rather than the strings themselves. After you define the key function, using it is a matter of passing it to the sort method using the key keyword. Because functions are Python objects, they can be passed around like any other Python object. Here’s a small program that illustrates the difference between a default sort and our custom sort :
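A small program along these lines might look like the sketch below. It repeats the key function so that it runs on its own; the word list is an arbitrary example, and default_order is only a snapshot of the first result:

```python
def compare_num_of_chars(string1):
    # Key function: sort uses the length of each string for ordering.
    return len(string1)

word_list = ['Python', 'is', 'better', 'than', 'C']

# Default sort: lexicographic, with uppercase letters before lowercase.
word_list.sort()
default_order = word_list[:]
print(default_order)   # ['C', 'Python', 'better', 'is', 'than']

# Custom sort: ascending number of characters.
word_list.sort(key=compare_num_of_chars)
print(word_list)       # ['C', 'is', 'than', 'Python', 'better']
```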

The first list is in lexicographic order (with uppercase coming before lowercase), and the second list is ordered by ascending number of characters.

Custom sorting is very useful, but if performance is critical, it may be slower than the default. Usually this impact is minimal, but if the key function is particularly complex, the effect may be more than desired, especially for sorts involving hundreds of thousands or millions of elements.

One particular place to avoid custom sorts is where you want to sort a list in descending, rather than ascending, order. In this case, use the sort method’s reverse parameter set to True. If for some reason you don’t want to do that, it’s still better to sort the list normally and then use the reverse method to invert the order of the resulting list. These two operations together (the standard sort and the reverse) will still be much faster than a custom sort.
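Both ways of getting a descending order can be sketched like this, using arbitrary sample numbers:

```python
x = [3, 8, 4, 0, 2, 1]
y = x[:]

# Preferred: ask sort for descending order directly.
x.sort(reverse=True)

# Alternative: sort ascending, then reverse in place.
y.sort()
y.reverse()

print(x)  # [8, 4, 3, 2, 1, 0]
print(y)  # [8, 4, 3, 2, 1, 0]
```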

- The sorted() function
Lists have a built-in method to sort themselves, but other iterables in Python, like the keys of a dictionary, for example, don’t have a sort method. Python also has the built-in function sorted(), which returns a sorted list from any iterable. sorted() uses the same key and reverse parameters as the sort method :

>>> x = (4, 3, 1, 2)
>>> y = sorted(x)
>>> y
[1, 2, 3, 4]
>>> x # The original iterable object isn't changed
(4, 3, 1, 2)

Other common list operations :
A number of other list methods are frequently useful, but they don’t fall into any specific category.

- List membership with the in operator
It’s easy to test if a value is in a list using the in operator, which returns a Boolean value. You can also use the converse, the not in operator :
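A few membership tests, with arbitrary sample lists; the result names are only illustrative:

```python
numbers = [1, 3, 4, 5]
words = ["one", "two", "three"]

found = 3 in numbers          # True
not_found = 3 not in numbers  # False

# Membership compares values, so the integer 3 is not in a list of strings.
in_words = 3 in words         # False
absent = 3 not in words       # True

print(found, not_found, in_words, absent)
```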

- List concatenation with the + operator
To create a list by concatenating two existing lists, use the + (list concatenation) operator. This will leave the argument lists unchanged :

>>> z = [1, 2, 3] + [3, 4, 5]
>>> z
[1, 2, 3, 3, 4, 5]

- List initialization with the * operator
Use the * operator to produce a list of a given size, which is initialized to a given value. This is a common operation for working with large lists whose size is known ahead of time. Although you can use append to add elements and automatically expand the list as needed, you obtain greater efficiency by using * to correctly size the list at the start of the program :

>>> z = [None] * 4
>>> z
[None, None, None, None]

When used with lists in this manner, * (which in this context is called the list multiplication operator) replicates the given list the indicated number of times and joins all the copies to form a new list. This is the standard Python method for defining a list of a given size ahead of time. A list containing a single instance of None is commonly used in list multiplication, but the list can be anything :

>>> z = [3, 1] * 2
>>> z
[3, 1, 3, 1]

- List minimum or maximum with min and max
You can use min() and max() to find the smallest and largest elements in a list. You’ll probably use these mostly with numerical lists, but they can be used with lists containing any type of element. Trying to find the maximum or minimum object in a set of objects of different types causes an error if it doesn’t make sense to compare those types :

>>> min([3, 7, 0, -2, 11])
-2
>>> max([4, "Hello", [1, 2]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() > int()

- List search with index
If you wish to find where in a list a value can be found (rather than wanting to know only if the value is in the list), use the index method. It searches through a list looking for a list element equivalent to a given value and returns the position of that list element :

>>> x = [1, 3, "five", 7, -2]
>>> x.index("five")
2
>>> x.index(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 5 is not in list

Attempting to find the position of an element that doesn’t exist in the list at all raises an error, as shown here. This can be handled in the same manner as the analogous error that can occur with the remove method (that is, by testing the list with in before using index).
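One way to guard index with in is sketched below. Using None as the "not found" marker is an assumption made for this example, not something the index method does itself:

```python
x = [1, 3, "five", 7, -2]

# Look up the position only when the value is present.
position = x.index("five") if "five" in x else None
missing = x.index(5) if 5 in x else None

print(position)  # 2
print(missing)   # None
```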

- List matches with count
count also searches through a list, looking for a given value, but it returns the number of times that value is found in the list rather than positional information :

>>> x = [1, 2, 2, 3, 5, 2, 5]
>>> x.count(2)
3
>>> x.count(5)
2
>>> x.count(4)
0

- Summary of list operations
You can see that lists are very powerful data structures, with possibilities that go far beyond plain old arrays. List operations are so important in Python programming that it’s worth laying them out for easy reference, as shown in table 5.1 :

Nested lists and deep copies :
This is another advanced topic that you may want to skip if you’re just learning the language. Lists can be nested. One application of this is to represent two-dimensional matrices. The members of these can be referred to using two-dimensional indices. Indices for these work as follows :

>>> m = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
>>> m[0]
[0, 1, 2]
>>> m[0][1]
1
>>> m[2][2]
22

This mechanism scales to higher dimensions in the manner you would expect. Most of the time, this is all you need to concern yourself with. But there is an issue with nested lists that you may run into. This is the result of the combination of the way variables refer to objects and the fact that some objects (such as lists) can be modified (they’re mutable). An example is the best way to illustrate :

>>> nested = [0]
>>> original = [nested, 1]
>>> original
[[0], 1]

Figure 5.1 shows what this looks like :

The value in the nested list can now be changed using either the nested or the original variables :

>>> nested[0] = 'zero'
>>> original
[['zero'], 1]
>>> original[0][0] = 0
>>> nested
[0]
>>> original
[[0], 1]

But if nested is set to another list, the connection between them is broken :

>>> nested = [2] # Assign variable nested to another list
>>> original
[[0], 1] # The variable original still refers to [0], not [2]

Figure 5.2 illustrates this :

Figure 5.2 The first item of the original list is still a nested list, but the nested variable refers to a different list.

You’ve seen that you can obtain a copy of a list by taking a full slice (that is, x[:]). You can also obtain a copy of a list using the + or * operator (for example, x + [] or x * 1). These are slightly less efficient than the slice method. All three create what is called a shallow copy of the list. This is probably what you want most of the time. But if your list has other lists nested in it, you may want to make a deep copy. You can do this with the deepcopy function of the copy module :

>>> original = [[0], 1]
>>> shallow = original[:]
>>> import copy
>>> deep = copy.deepcopy(original)
>>> original[0][0]=1
>>> shallow
[[1], 1]
>>> deep
[[0], 1]

See figure 5.3 for an illustration :

Figure 5.3 A shallow copy doesn’t copy nested lists.

The deep copy is independent of the original, and no change to it has any effect on the original list:
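A self-contained sketch of that independence; the value 5 is arbitrary:

```python
import copy

original = [[0], 1]
deep = copy.deepcopy(original)

# Changing the nested list inside the deep copy...
deep[0][0] = 5

print(deep)      # [[5], 1]
# ...leaves the original's nested list untouched.
print(original)  # [[0], 1]
```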

Supplement :
* [Quick Python] 5. Lists, tuples, and sets - Part 2
* [Python Study Notes] Getting Started: Built-in Types and Operations (Lists)

Monday, January 30, 2012

[Quick Python] 5. Lists, tuples, and sets - Part 1
