程式扎記: [Python Std Library] Built-in Types : Sequence Types

Translated from the Python Standard Library documentation (Built-in Types).
Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange
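Sequence types support the following operations. The in and not in operations test whether an object is a member of a sequence, while the + and * operators concatenate and repeat sequences. The operations supported by sequence objects are listed below; for further operations, see Mutable Sequence Types: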

Besides the operations above, sequence objects also support comparisons:

Sequence types also support comparisons. In particular, tuples and lists are compared lexicographically by comparing corresponding elements. This means that to compare equal, every element must compare equal and the two sequences must be of the same type and have the same length. (For full details see Comparisons in the language reference.)
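The comparison rules quoted above can be tried out with a short sketch. The variable names are made up for illustration; only the built-in comparison operators are used:

```python
# Lexicographic comparison: the first pair of unequal elements decides.
first_difference = (1, 2, 3) < (1, 2, 4)   # 3 < 4 decides the result

# When one sequence is a prefix of the other, the shorter one sorts first.
prefix_is_smaller = [1, 2] < [1, 2, 0]

# Equality requires the same type, the same length, and equal elements.
same_type_equal = [1, 2, 3] == [1, 2, 3]
cross_type_equal = (1, 2, 3) == [1, 2, 3]  # tuple vs. list: never equal

print(first_difference, prefix_is_smaller, same_type_equal, cross_type_equal)
```

The last comparison is the one that tends to surprise: a tuple and a list holding the same values are still not equal.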

Notes:
1. When s is a string (including a Unicode string), in and not in test whether s contains a given substring:

>>> str = "Hi, John"
>>> subStr = "John"
>>> if subStr in str:
...     print("'{0}' has sub-string '{1}'!".format(str, subStr))
...
'Hi, John' has sub-string 'John'!

2. Note that s * n makes shallow copies: the items are referenced, not copied. Check whether the following behaves as you expect:

>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
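If independent inner lists are what you want, one common approach is a list comprehension. The following is a minimal sketch; the names shared and independent are illustrative only:

```python
# [[]] * 3 stores three references to the SAME inner list.
shared = [[]] * 3
shared[0].append(3)

# A comprehension evaluates [] once per iteration, giving three distinct lists.
independent = [[] for _ in range(3)]
independent[0].append(3)

print(shared)       # every slot shows the appended item
print(independent)  # only the first slot changed
```

Checking `shared[0] is shared[1]` returns True, which confirms that the repeated slots refer to one object.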

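3. If i or j is negative, the index is counted from the end of the sequence: len(s) + i or len(s) + j is substituted. Note that -0 is still 0.
4. The slice of s from i to j is the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
5. The description here is rather long, so the original text is quoted directly: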

The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). If i or j is greater than len(s), use len(s). If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.
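The slicing rules in notes 3 to 5 are easier to follow with concrete values. This is a small sketch on an arbitrary sample string; the variable names are illustrative:

```python
s = "abcdefgh"            # indices 0..7

tail = s[-3:]             # negative i counts from the end: same as s[5:]
clipped = s[2:100]        # j greater than len(s) is treated as len(s)
empty = s[5:2]            # i >= j gives an empty sequence
every_other = s[1:7:2]    # step k=2: indices 1, 3, 5
backwards = s[::-1]       # negative step: omitted i and j become the "end" values

print(tail, clipped, repr(empty), every_other, backwards)
```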

6. The original text reads:

CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.
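To make the recommendation in note 6 concrete, here is a rough sketch that builds the same string both ways. The list parts is sample data, and no timing is measured here; the point is only that the two forms produce the same result:

```python
parts = ["alpha", "beta", "gamma", "delta"]

# Repeated concatenation: each += may copy the whole accumulated string,
# which can become quadratic when the in-place optimization does not apply.
concatenated = ""
for part in parts:
    concatenated += part

# str.join() builds the result in one pass over the pieces.
joined = "".join(parts)

print(concatenated == joined)
```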

String Methods :
The methods below are available on Python string objects. For the % operator (string templates), see String Formatting Operations :
- str.capitalize()

Return a copy of the string with its first character capitalized and the rest lowercased.
For 8-bit strings, this method is locale-dependent.
>>> str = "abc"
>>> str.capitalize()
'Abc'

- str.center(width[, fillchar])

Return centered in a string of length width. Padding is done using the specified fillchar (default is a space).
Changed in version 2.4: Support for the fillchar argument.
>>> str = "John"
>>> str.center(10, '*') # Total width is 10; 'John' is centered and padded with '*'.
'***John***'

- str.count(sub[, start[, end]])

Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
>>> str = "aabc ab aba baab"
>>> str.count('ab', 1, 14) # The range searched is a[abc ab aba ba]ab (end index 14 is excluded)
3

- str.endswith(suffix[, start[, end]])

Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.
Changed in version 2.5: Accept tuples as suffix.
>>> str = "Hi,John."
>>> str.endswith('John', 0, 7) # Check [Hi,John].
True
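The tuple form of suffix mentioned above is handy for checking several endings at once. A small sketch, where the filename and the suffix list are made-up sample values:

```python
filename = "report.tar.gz"   # hypothetical sample value

# A tuple of suffixes matches if ANY of them is a suffix.
is_archive = filename.endswith((".zip", ".gz"))

# start/end restrict the test to the slice filename[0:10], i.e. "report.tar".
ends_with_tar = filename.endswith(".tar", 0, 10)

print(is_archive, ends_with_tar)
```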

- str.find(sub[, start[, end]])

Return the lowest index in the string where substring sub is found, such that sub is contained in the slice s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found.
>>> str = "Hi,Johnson."
>>> str.find('John', 1) # Seek 'John' from index=1
3
>>> str.find('John', 0) # The returned index is relative to the whole string, not to start!
3
>>> str.find('John', 4) # Returns -1 when the substring is not found
-1

- str.format(*args, **kwargs)

Perform a string formatting operation, similar to printf() in C/Java. For more details, see Format String Syntax.
>>> "The sum of 1 + 2 is {0}".format(1+2)
'The sum of 1 + 2 is 3'
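Besides positional fields such as {0}, str.format() also accepts keyword arguments and format specifications. A brief sketch with invented sample values:

```python
# Named replacement fields are filled from keyword arguments.
by_name = "{name} is {age} years old".format(name="John", age=30)

# A format spec after ':' controls alignment and width:
# '>5' right-aligns in 5 columns, '<5' left-aligns in 5 columns.
aligned = "{0:>5}|{1:<5}|".format("ab", "cd")

print(by_name)
print(aligned)
```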

- str.index(sub[, start[, end]])

Like find(), but raises a ValueError exception when sub is not found.
>>> str = "Hi,Johnson."
>>> str.index('John', 3)
3
>>> str.index('John', 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found

- str.isalnum()

Return True if the string is non-empty and consists only of alphanumeric characters (0-9, a-z, A-Z); otherwise return False. Similar methods are str.isalpha(), str.isdigit(), str.islower(), str.isspace(), str.istitle() and str.isupper().
>>> str = '123abcABC'
>>> str.isalnum()
True
>>> str = '123abcABC-'
>>> str.isalnum()
False

- str.join(iterable)

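Return a string that is the concatenation of the strings in iterable, with the string providing this method used as the separator between elements: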
>>> str = ""
>>> str.join(["1", " 2", "3 ", "4"])
'1 23 4'
>>> str = "-"
>>> str.join(["1", " 2", "3 ", "4"]) # Use '-' as the separator.
'1- 2-3 -4'

- str.lower() / str.upper()

Return a new string with all characters converted to lowercase / uppercase.
>>> str = "AbCdEf"
>>> str.lower()
'abcdef'
>>> str.upper()
'ABCDEF'

- str.lstrip([chars]) / str.rstrip([chars]) / str.strip([chars])

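Remove leading and/or trailing characters. chars is treated as a set of characters to strip, not as a prefix or suffix string. If chars is omitted, whitespace is removed:
>>> str=" abc.def.com "
>>> str.lstrip() # Remove leading whitespace
'abc.def.com '
>>> str.lstrip(' abc.') # Remove leading characters in the set 'a', 'b', 'c', '.' and space
'def.com '
>>> str.rstrip() # Remove trailing whitespace
' abc.def.com'
>>> str.rstrip(' com.') # Remove trailing characters in the set 'c', 'o', 'm', '.' and space
' abc.def'
>>> str.strip() # Remove whitespace on both sides
'abc.def.com'
>>> str.strip(' abcom.') # Remove characters in the set 'a', 'b', 'c', 'o', 'm', '.' and space on both sides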
'def'

- str.partition(sep) / str.rpartition(sep)

Split the string at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing the string itself, followed by two empty strings.
>>> str="abc.com"
>>> str.partition('.')
('abc', '.', 'com')
>>> str.partition('-') # '-' does not occur in the string
('abc.com', '', '')
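The examples above only exercise partition(). As a short sketch of how rpartition() differs, using a made-up host name: it splits at the last occurrence of sep, and when sep is missing it returns two empty strings followed by the string itself.

```python
host = "www.abc.com"   # sample value for illustration

first_split = host.partition(".")     # split at the FIRST '.'
last_split = host.rpartition(".")     # split at the LAST '.'
not_found = host.rpartition("-")      # sep missing: string goes in the last slot

print(first_split)
print(last_split)
print(not_found)
```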

- str.replace(old, new[, count])

Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
>>> str="Hi, John. How about Johnson?. See u."
>>> str.replace('John', 'Peter')
'Hi, Peter. How about Peterson?. See u.'
>>> str.replace('John', 'Peter', 1) # Replace only the first occurrence of 'John'
'Hi, Peter. How about Johnson?. See u.'

- str.rfind(sub[, start[, end]])

Return the highest index in the string where substring sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.
>>> str="Hi, John. How about Johnson?. See u."
>>> str.rfind('John')
20
>>> str.rfind('John', 0, 10)
4

- str.rindex(sub[, start[, end]])

Like rfind(), except that it raises a ValueError exception when sub is not found.
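A minimal sketch of the difference between rfind() and rindex(), using a sample sentence; the variable names are illustrative:

```python
text = "Hi, John. How about Johnson?"

last_pos = text.rindex("John")    # highest index where 'John' starts
missing_rfind = text.rfind("Peter")   # rfind() signals failure with -1

try:
    text.rindex("Peter")          # rindex() signals failure with an exception
    raised = False
except ValueError:
    raised = True

print(last_pos, missing_rfind, raised)
```

Which one to use is mostly a question of whether a missing substring is an expected case (rfind) or an error (rindex).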

- str.split([sep[, maxsplit]]) / str.rsplit([sep[, maxsplit]])

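Split the string using sep as the delimiter. If the optional maxsplit is given, at most maxsplit splits are done (so the result has at most maxsplit+1 elements). Note that if sep is omitted or None, runs of consecutive whitespace are treated as a single separator, and the result contains no empty strings at the start or end.
>>> '1 2 3 '.split() # sep is omitted, so consecutive whitespace counts as one separator and no element is empty.
['1', '2', '3']
>>> ' 1 2 3 '.split(None, 1) # At most one split, so at most two elements.
['1', '2 3 ']
>>> ' 1 2 3 '.rsplit(None, 1) # Split once, starting from the right
[' 1 2', '3']
>>> ' 1 2 3 '.rsplit() # Without maxsplit, str.rsplit() behaves like str.split().
['1', '2', '3']
>>> ' 1 2  3 '.split(' ') # With an explicit sep, consecutive separators produce empty strings.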
['', '1', '2', '', '3', '']

- str.splitlines([keepends])

Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
>>> str = "line1\nline2\nline3\n\n"
>>> str.splitlines()
['line1', 'line2', 'line3', '']
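The keepends argument is not exercised in the example above. A short sketch with a made-up string that mixes \n and \r\n line endings:

```python
text = "line1\nline2\r\nline3"

# Default: line breaks are dropped from the result.
without_ends = text.splitlines()

# A true keepends value keeps each line's original terminator.
with_ends = text.splitlines(True)

print(without_ends)
print(with_ends)
```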

- str.startswith(prefix[, start[, end]])

Return True if the string starts with the substring prefix. The optional arguments start/end restrict the test to the slice str[start:end].
>>> "Hi, John".startswith('Hi')
True
>>> "Hi, John".startswith('John', 3) # The range tested is Hi,[ John]
False
>>> "Hi, John".startswith('John', 4) # The range tested is Hi, [John]
True

- str.swapcase()

Return a new string with uppercase characters converted to lowercase and vice versa.
>>> "[AbC-dEf]".swapcase()
'[aBc-DeF]'

For more string methods, see the String Methods section of the Python documentation.

Mutable Sequence Types :
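list and bytearray objects are mutable sequence types and support the following additional operations, which let you modify the sequence in place. Strings and tuples are immutable sequence types, so they cannot be modified once created: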

Notes:
1. t must have the same length as the slice it is replacing.

>>> list = [1, 2, 3, 4, 5]
>>> list[2:5:1] = [33, 44, 55]
>>> list
[1, 2, 33, 44, 55]
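The length requirement in note 1 matters for extended slices whose step is not 1. The sketch below uses sample lists to show a matching assignment, a mismatched one, and a plain slice that is allowed to change the list length:

```python
numbers = [1, 2, 3, 4, 5]

# Extended slice with step 2 selects indices 0, 2, 4: exactly three items.
numbers[::2] = [10, 30, 50]
after_step_assign = list(numbers)

# Assigning two items to a three-item extended slice is rejected.
try:
    numbers[::2] = [0, 0]
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

# A plain slice (no step) may grow or shrink the list.
numbers[1:3] = [7]

print(after_step_assign, mismatch_rejected, numbers)
```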

2. It is quicker to read the original text:

The C implementation of Python has historically accepted multiple parameters and implicitly joined them into a tuple; this no longer works in Python 2.0. Use of this misfeature has been deprecated since Python 1.4.

3. x can be any iterable object:

>>> list = [1, 2, 3]
>>> list.extend((4, 5, 6)) # The tuple (4, 5, 6) is an iterable object
>>> list
[1, 2, 3, 4, 5, 6]
>>> list[len(list):len(list)] = "abc" # A string is also an iterable object
>>> list
[1, 2, 3, 4, 5, 6, 'a', 'b', 'c']

4. The original text reads:

Raises ValueError when x is not found in s. When a negative index is passed as the second or third parameter to the index() method, the list length is added, as for slice indices. If it is still negative, it is truncated to zero, as for slice indices.
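A small sketch of the index() behaviour described in note 4, on a sample list. The second and third parameters are the optional start and end positions:

```python
letters = ["a", "b", "c", "a", "b", "c"]

first_b = letters.index("b")         # search the whole list
later_b = letters.index("b", 2)      # start searching at index 2
from_end = letters.index("c", -3)    # -3 + len(letters) = 3, so search from index 3

try:
    letters.index("z")
    raised = False
except ValueError:                   # raised when the item is not present
    raised = True

print(first_b, later_b, from_end, raised)
```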

5. The original text reads:

When a negative index is passed as the first parameter to the insert() method, the list length is added, as for slice indices. If it is still negative, it is truncated to zero, as for slice indices.
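The negative-index rule for insert() in note 5 can be sketched as follows, using a throwaway sample list:

```python
items = ["a", "b", "c"]

# -1 + len(items) = 2, so 'x' is inserted BEFORE the last element.
items.insert(-1, "x")
after_minus_one = list(items)

# -100 is still negative after adding the length, so it is truncated to 0.
items.insert(-100, "y")

print(after_minus_one, items)
```

Note that insert(-1, x) does not append; appending corresponds to insert(len(items), x).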

6. The pop() method is only supported by the list and array types. The optional argument i defaults to -1, so by default the last item is removed and returned.
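As a brief illustration of note 6, with an arbitrary sample list:

```python
stack = [1, 2, 3]

last = stack.pop()      # default i = -1: removes and returns the last item
first = stack.pop(0)    # an explicit index removes the item at that position

print(last, first, stack)
```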

Supplementary notes :
* The Python Language Reference > 5. Expressions
* Python 3.1 Quick Tour - Expressions: the in operator

Wednesday, March 7, 2012