Python Regular Expressions

match() vs search()

match(): checks for a match only at the start of the string.

search(): checks for a match at any position in the string.

import re

# example 1: 'me' is not at the start of the string
pattern = re.compile('me')
pattern.match('Bob is me.')  # no match (returns None)
pattern.search('Bob is me.')  # match
# example 2: 'Bob' is at the start of the string
pattern = re.compile('Bob')
pattern.match('Bob is me.')  # match
pattern.search('Bob is me.')  # match
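
To make the difference concrete, here is a minimal sketch that prints where each call matched. The helper name `describe` is hypothetical and exists only for this illustration; both `match()` and `search()` return a match object on success and `None` otherwise.

```python
import re

def describe(result):
    """Return the matched span, or None if there was no match."""
    return result.span() if result is not None else None

pattern = re.compile('me')
text = 'Bob is me.'

match_span = describe(pattern.match(text))    # 'me' is not at position 0
search_span = describe(pattern.search(text))  # 'me' is found further along

print(match_span)   # None
print(search_span)  # (7, 9)
```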

Special Characters

.: matches any single character except the newline \n.

re.search('me.$', 'Bob is me~')  # match

^: matches at the start of the target string.

re.search('^Bob', 'Bob is me.')  # match
re.search('^is', 'Bob is me.')  # no match

$: matches at the end of the target string.

re.search('me.$', 'Bob is me.')  # match
re.search('is$', 'Bob is me.')  # no match
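
Using both anchors in one pattern requires the whole string to fit it. A small sketch, with sample strings made up for illustration:

```python
import re

# '^' and '$' together force the pattern to cover the whole string.
anchored = re.compile('^Bob is me.$')

full = anchored.search('Bob is me.')        # the whole string fits
partial = anchored.search('So Bob is me.')  # extra text before 'Bob'

print(full is not None)     # True
print(partial is not None)  # False
```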

*: the pattern preceding * occurs 0 or more times.

re.search('ab*', 'a')  # match
re.search('ab*', 'abbb')  # match

+: the pattern preceding + occurs 1 or more times.

re.search('ab+', 'a')  # no match
re.search('ab+', 'abbb')  # match

?: the pattern preceding ? occurs 0 or 1 times.

re.match('ab?', 'a')  # match
re.match('ab?', 'ab')  # match

*?, +?, ??: adding ? after *, +, or ? makes the quantifier non-greedy, so the match consumes as few characters as possible.

re.match('<.*>', '<a> b <c>')
# result = <span=(0, 9), match='<a> b <c>'>
re.match('<.*?>', '<a> b <c>')
# result = <span=(0, 3), match='<a>'>
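
The same contrast can be seen across the whole string with `re.findall`, which is not used elsewhere on this page and is brought in here only as an illustration:

```python
import re

text = '<a> b <c>'

greedy = re.findall('<.*>', text)  # one long match spanning both tags
lazy = re.findall('<.*?>', text)   # each tag is matched separately

print(greedy)  # ['<a> b <c>']
print(lazy)    # ['<a>', '<c>']
```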

{m}: the pattern preceding {m} must occur exactly m times.

re.match('a{3}', 'aaa')  # match
re.match('a{3}', 'aa')  # no match

{m,n}: the pattern preceding {m,n} may occur from m to n times.

re.match('a{2,3}b', 'aab')  # match
re.match('a{2,3}b', 'aaaab')  # no match

{m,}: the pattern preceding {m,} must occur at least m times.

re.match('a{2,}b', 'aaab')  # match
re.match('a{2,}b', 'ab')  # no match
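
As a rough sketch of how many characters each counted quantifier consumes, the hypothetical helper `count_run` below reports the length of the match at the start of the string. It also shows that the non-greedy ? suffix applies to {m,n} as well.

```python
import re

def count_run(pattern, text):
    """Return the length of the match at the start of text, or None."""
    m = re.match(pattern, text)
    return len(m.group()) if m else None

greedy_len = count_run('a{2,3}', 'aaaaa')   # takes as many as allowed
lazy_len = count_run('a{2,3}?', 'aaaaa')    # takes as few as allowed
open_len = count_run('a{2,}', 'aaaaa')      # no upper bound
too_short = count_run('a{3}', 'aa')         # fewer than 3 'a' characters

print(greedy_len)  # 3
print(lazy_len)    # 2
print(open_len)    # 5
print(too_short)   # None
```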

\: the escape character; it lets special characters such as the $ and * described above be treated as ordinary characters.

re.match(r'\*+', '**')  # match (raw string keeps the backslash intact)
re.match('*+', '**')  # raises re.error: nothing to repeat
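
When the text to match is not known in advance, escaping by hand is error-prone. One option, sketched here with made-up sample strings, is the standard library's `re.escape`, which adds the backslashes for you:

```python
import re

literal = '$*'
escaped = re.escape(literal)  # backslash-escapes the special characters
print(escaped)  # \$\*

# The escaped pattern now matches the literal text '$*'.
found = re.search(escaped, 'price: $*')
print(found.group())  # $*
```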

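[]: denotes a set of characters.

[abc]: matches a, b, or c.
[a-z]: matches any lowercase letter.
[0-5][0-9]: matches 00 through 59.
[0-9A-Za-z]: matches any digit or any uppercase or lowercase letter.
[+?]: special characters inside a set are treated as ordinary characters; this example matches + or ?.
[^A-Z]: when the first character of the set is ^, the set is negated; this example matches any character that is not an uppercase letter.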
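
The sets listed above can be tried out as follows. This is only a sketch: the sample inputs are invented, and the ^ and $ anchors are added so that the two-digit check covers the whole string.

```python
import re

# Two-digit value from 00 to 59, e.g. the minutes of a clock time.
minute = re.compile('^[0-5][0-9]$')
valid = bool(minute.match('07'))
invalid = bool(minute.match('60'))
print(valid, invalid)  # True False

# Negated set: every character that is not an uppercase letter.
non_upper = re.findall('[^A-Z]', 'AbC+d')
print(non_upper)  # ['b', '+', 'd']

# Inside a set, + and ? are ordinary characters.
specials = re.findall('[+?]', '1+1=?')
print(specials)  # ['+', '?']
```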

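|: when several patterns are separated by |, it means "or". The alternatives are tried from left to right, and checking stops at the first one that matches.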

# both examples below produce a match
re.match('[^A]|[^B]', 'B')  # [^A] matches
re.match('[^B]|[^A]', 'B')  # [^B] does not match, but [^A] does
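
Because the alternatives are tried from left to right, their order can change what is matched. A short sketch with made-up alternatives:

```python
import re

# 'Bob' is tried first and succeeds, so 'Bobby' is never attempted.
short_first = re.match('Bob|Bobby', 'Bobby').group()

# With the longer alternative first, the whole name is matched.
long_first = re.match('Bobby|Bob', 'Bobby').group()

print(short_first)  # Bob
print(long_first)   # Bobby
```

Putting the longer alternative first is a common way to avoid an unintended partial match.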
