Chapter 11: Regex Flashcards

1
Q

Code that works when the input data is in a particular format but is prone to breakage if there is some deviation from the correct format. Aka easily broken.

A

brittle code

2
Q

Behavior where the regex + and * quantifiers expand outward to match the LARGEST possible string

A

greedy matching
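
A minimal sketch of greedy matching, assuming a made-up sample line (the addresses are hypothetical). Because .+ is greedy, the match runs to the last colon on the line rather than stopping at the first one.

```python
import re

# Hypothetical sample line, used only for illustration.
line = "From: alice@example.com, bob@example.com: hello"

# .+ is greedy: it grabs as much as it can, then backs off only
# far enough to let the final ':' in the pattern match.
greedy = re.findall("^F.+:", line)
print(greedy)
```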

3
Q

A command available in most Unix systems that searches through text files looking for lines that match regular expressions.

A

grep
Global Regular Expression Print (from the ed command g/re/p); often glossed as “Generalized Regular Expression Parser”

4
Q

A language for expressing more complex search strings. May contain special characters that indicate that a search only matches at the beginning or end of a line or many other similar capabilities.

A

regular expression
(regex)

5
Q

A special character that matches any character. In regular expressions it’s the period.

A

wild card

6
Q

regular expression module

A

re
import re

7
Q

re function that finds the first match of a regular expression in text; returns a match object, or None if there is no match

A

re.search(regex, search string)
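
A short sketch of how re.search() behaves, assuming a few made-up lines. It returns a match object when the pattern occurs anywhere in the string and None otherwise, so it can be used directly as a condition.

```python
import re

# Hypothetical input lines, for illustration only.
lines = ["From: stephen@example.org", "Subject: hello", "X-From: other"]

# Keep the lines where the pattern occurs anywhere in the line.
matching = [ln for ln in lines if re.search("From:", ln)]

# The match object records where the match was found.
match = re.search("From:", lines[2])
start = match.start()

# No match gives None rather than raising an error.
no_match = re.search("From:", lines[1])
```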

8
Q

regex that matches beginning of line

A

'^'
re.search('^From:', line)
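
As a rough illustration, assuming two made-up lines: with the ^ anchor, only the line that starts with From: is kept.

```python
import re

# Hypothetical input lines, for illustration only.
lines = ["From: a@example.com", "X-From: b@example.com"]

# ^ anchors the match to the start of the string.
anchored = [ln for ln in lines if re.search("^From:", ln)]
```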

9
Q

regex that matches any character (a wildcard)

A

. (period/full stop)

re.search('F..m', line) matches From, Flam, F#om, etc.
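
A small sketch of the wildcard, assuming a hypothetical word list. Each period must consume exactly one character, so "Fm" fails while "Form" matches.

```python
import re

# Hypothetical candidates, for illustration only.
candidates = ["From", "Flam", "F#om", "Fm", "Form"]

# Each '.' stands for exactly one character of any kind.
hits = [w for w in candidates if re.search("F..m", w)]
```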

10
Q

regex that applies to the immediately preceding character(s) and indicates to match zero or more times.

A

*

11
Q

regex that applies to the immediately preceding character(s) and indicates to match one or more times.

A

+

12
Q

regex method that returns a list of substring(s) that matches a regular expression

A

re.findall(regex, search string)
Called inside a for loop over lines, each call returns its own list: ['substring1'], then ['substring2']
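
A minimal sketch of re.findall(), assuming a made-up sentence. It returns every non-overlapping match as a list of strings, and an empty list when nothing matches.

```python
import re

# Hypothetical text, for illustration only.
text = "2 apples, 19 pears and 42 plums"

# One or more digits in a row, found wherever they occur.
numbers = re.findall("[0-9]+", text)

# No matches yields an empty list, not None.
nothing = re.findall("[0-9]+", "no digits here")
```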

13
Q

regex that matches non-whitespace character

A

\S

14
Q

regex format to accept specific characters

A

'[]'
Set notation

re.findall('[a-zA-Z0-9]', string)
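
As a quick sketch, assuming a made-up input string: the set matches one character at a time, so punctuation outside the set is skipped.

```python
import re

# Hypothetical input, for illustration only.
text = "a-B_9!"

# Each match is a single character that is a letter or a digit.
chars = re.findall("[a-zA-Z0-9]", text)
```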

15
Q

regex format to match an actual period

A

[.] or \.
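
A short sketch of why the period needs escaping, assuming a made-up sentence. The bare . also accepts the "x" in "3x14", while [.] accepts only a real period.

```python
import re

# Hypothetical text, for illustration only.
text = "pi is 3.14, not 3x14"

# [.] matches only a literal period.
literal = re.findall("[0-9]+[.][0-9]+", text)

# A bare . is the wildcard, so it also matches the 'x'.
wildcard = re.findall("[0-9]+.[0-9]+", text)
```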

16
Q

When added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall().

A

()

re.findall('substring(part I want)', string)
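
A minimal sketch of extraction with parentheses, assuming a hypothetical mail header line. Both patterns match the same text, but with a group findall() returns only the parenthesized part.

```python
import re

# Hypothetical header line, for illustration only.
line = "From alice@example.com Sat Jan  5 09:14:16 2008"

# Without parentheses, findall returns the whole match.
whole = re.findall(r"^From \S+@\S+", line)

# With parentheses, findall returns only the grouped part.
domain = re.findall(r"^From \S+@(\S+)", line)
```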

17
Q

technique to match a regex special character as a literal character

A

backslash
\$ matches a literal '$'

18
Q

regex that anchors to end of line

A

$
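
As a rough illustration, assuming a made-up list of file names: with the $ anchor, the pattern must sit at the very end of the string.

```python
import re

# Hypothetical file names, for illustration only.
names = ["notes.txt", "notes.txt.bak", "txt"]

# $ anchors the match to the end of the string.
text_files = [n for n in names if re.search("[.]txt$", n)]
```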

19
Q

regex that matches a whitespace character

A

\s

20
Q

regex that applies to the immediately preceding character(s) and indicates to match zero or more times in “non-greedy mode”.

A

*?

21
Q

regex that applies to the immediately preceding character(s) and indicates to match one or more times in “non-greedy mode”.

A

+?

22
Q

regex that applies to the immediately preceding character(s) and indicates to match zero or one time.

A

?
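
A small sketch of the optional quantifier, assuming a made-up string. The "u" may appear once or not at all, so two consecutive "u" characters fail.

```python
import re

# Hypothetical text, for illustration only.
text = "color colour colouur"

# u? allows zero or one 'u' between 'colo' and 'r'.
spellings = re.findall("colou?r", text)
```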

23
Q

regex that applies to the immediately preceding regular expression and indicates to match zero or one time in “non-greedy mode”.

A

??

24
Q

regex that matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, “u”, or “-” but no other characters.

A

[aeiou-] or [-aeiou]

25
Q

You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.

A

[a-z0-9]

26
Q

When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.

A

[^A-Za-z]

27
Q

regex that asserts where a word begins or ends. This means that r'\bat\b' matches 'at', 'at.', '(at)', and 'as at ay' but not 'attempt' or 'atlas'.

A

\b
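
The word-boundary examples from this card can be checked with a short sketch. The list name samples is just an illustrative choice.

```python
import re

# The sample strings quoted on the card.
samples = ["at", "at.", "(at)", "as at ay", "attempt", "atlas"]

# \b requires a word boundary on both sides of 'at'.
whole_word = [s for s in samples if re.search(r"\bat\b", s)]
```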

28
Q

regex that asserts where a word does NOT begin or end. This means that r'at\B' matches 'athens', 'atom', 'attorney', but not 'at', 'at.', or 'at!'.

A

\B
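
A matching sketch for the non-boundary assertion, using the strings quoted on this card. The list name samples is just an illustrative choice.

```python
import re

# The sample strings quoted on the card.
samples = ["athens", "atom", "attorney", "at", "at.", "at!"]

# \B requires that 'at' is NOT followed by a word boundary,
# i.e. the word must continue after 'at'.
inside_word = [s for s in samples if re.search(r"at\B", s)]
```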

29
Q

regex that matches any decimal digit; for ASCII text, equivalent to the set [0-9].

A

\d

30
Q

regex that matches any non-digit character; for ASCII text, equivalent to the set [^0-9].

A

\D

31
Q

In Unix/Linux, the command-line program that works much like Python's re.search() applied to each line of a file

A

grep
Global Regular Expression Print (from the ed command g/re/p); often glossed as “Generalized Regular Expression Parser”

$ grep '^From:' mbox.short.txt

32
Q

Unix/Linux (grep) regex for a non-blank character

A

[^ ]

33
Q

Behavior where the regex +? and *? quantifiers match the SMALLEST possible string

A

non-greedy matching
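
A minimal sketch contrasting the two modes, assuming a made-up tag-like string. The greedy form swallows everything up to the last ">", while the non-greedy form stops at the first one.

```python
import re

# Hypothetical text, for illustration only.
text = "<b>bold</b>"

# Greedy: .+ extends to the last '>' it can reach.
greedy = re.findall("<.+>", text)

# Non-greedy: .+? stops at the first '>' that completes a match.
lazy = re.findall("<.+?>", text)
```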

34
Q

Specifies that exactly m copies of the previous regular expression should be matched; fewer matches cause the entire regular expression not to match.

A

{m}

35
Q

Causes the resulting regular expression to greedily match from m to n repetitions of the preceding regular expression.

A

{m,n}

36
Q

Causes the resulting regular expression to non-greedily match from m to n repetitions of the preceding regular expression

A

{m,n}?
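
The three counted-repetition forms ({m}, {m,n}, and {m,n}?) can be compared in one short sketch, assuming a made-up string of five "a" characters.

```python
import re

# Hypothetical text, for illustration only.
text = "aaaaa"

# {2}: exactly two copies per match; the fifth 'a' is left over.
exact = re.findall("a{2}", text)

# {2,4}: greedy, takes as many as allowed (up to four).
greedy = re.search("a{2,4}", text).group()

# {2,4}?: non-greedy, takes as few as allowed (two).
lazy = re.search("a{2,4}?", text).group()

# Fewer than m copies means no match at all.
too_few = re.search("a{3}", "aa")
```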

37
Q

Creates a regular expression that will match either A or B.
This operation is never greedy

A

|

cat|dog = cat or dog
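
A short sketch of alternation, assuming made-up input strings. The second part illustrates "never greedy": alternatives are tried left to right, and the first one that matches wins even when a later one would match more.

```python
import re

# Hypothetical text, for illustration only.
text = "cat chases dog, catalog"

# Either alternative may match at each position.
animals = re.findall("cat|dog", text)

# 'a' is tried first and succeeds, so 'ab' is never attempted.
first_wins = re.search("a|ab", "ab").group()
```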

38
Q

Matches the contents of the group of the same number. Groups are numbered starting from 1.
For example, (.+) \1 matches ‘the the’ or ‘55 55’, but not ‘thethe’ (note the space after the group).

(abc)\1 matches abcabc. Here (abc) is a capturing group and \1 is a backreference that matches the same text the group captured, so \1 matches abc too.

A

\number
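
The backreference examples from this card can be checked with a short sketch. The variable names are illustrative choices.

```python
import re

# (.+) captures some text; \1 must then repeat that same text
# after a single space.
repeated = re.search(r"(.+) \1", "the the").group()
digits = re.search(r"(.+) \1", "55 55").group()

# Without the space between the copies, the pattern does not match.
no_space = re.search(r"(.+) \1", "thethe")

# (abc)\1 requires 'abc' immediately followed by 'abc' again.
doubled = re.search(r"(abc)\1", "abcabc").group()
```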