PythonCheatsheet.Org Flashcards

Studying https://www.pythoncheatsheet.org/cheatsheet/basics

1
Q

Math Operators
From highest to lowest precedence:

A
2
Q

Augmented Assignment Operators

A
3
Q

Walrus Operator

A

The Walrus Operator allows assignment of variables within an expression while returning the value of the variable.

The Walrus Operator, or Assignment Expression Operator, was first introduced in 2018 via PEP 572, and then officially released with Python 3.8 in October 2019.
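
A minimal sketch of the walrus operator in use, assuming Python 3.8 or later. The list `data` and the message text are illustrative values, not part of the original card.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # illustrative values

# `n := len(data)` assigns n and yields its value inside the condition,
# so len() is called once and n can be reused in the message.
if (n := len(data)) > 10:
    message = f'List is too long ({n} elements, expected <= 10)'
else:
    message = 'List length is fine'
```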

4
Q

Data Types

A
5
Q

Concatenation and Replication

A
6
Q

Variable naming rules

A
  1. It can only be one word
  2. It can only use letters, numbers, and the underscore (_) character
  3. It can’t begin with a number
  4. Variables starting with an underscore (_) are, by convention, considered “unuseful”
7
Q

Comments

A
8
Q

The print() Function

A

The print() function writes the value of the argument(s) it is given. […] it handles multiple arguments, floating-point quantities, and strings.

Strings are printed without quotes, and a space is inserted between items, so you can format things nicely:
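
A short sketch of the behavior described above. The names `name` and `average` are hypothetical values chosen for illustration.

```python
name = 'Ada'      # hypothetical values
average = 3.5

# Multiple arguments are separated by a single space; strings lose their quotes.
print('Hello', name)               # Hello Ada
print('The average is', average)   # The average is 3.5
```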

9
Q

The end keyword

A

The keyword argument end can be used to avoid the newline after the output, or end the output with a different string:
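
As a small sketch, the loop below replaces the default newline with a dash. The word list is made up for the example.

```python
phrase = ['printed', 'with', 'a', 'dash', 'in', 'between']  # illustrative words

for word in phrase:
    print(word, end='-')   # '-' is written instead of '\n'
print()                    # finish the line with a normal newline
# printed-with-a-dash-in-between-
```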

10
Q

The sep keyword

A

The keyword argument sep specifies how to separate the objects when there is more than one.
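
A minimal illustration of `sep`; the tuple `animals` is a hypothetical value.

```python
animals = ('cats', 'dogs', 'mice')  # illustrative values

print(*animals)             # default separator is a space: cats dogs mice
print(*animals, sep=', ')   # cats, dogs, mice
```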

11
Q

The input() Function

A
  • This function takes the input from the user and converts it into a string:
  • input() can also set a default message without using print():
  • It is also possible to use formatted strings to avoid using .format:
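
The three points above can be sketched together as follows. The `reader` parameter is a hypothetical hook added only so the example can be tried without a keyboard; by default it is the built-in `input()`.

```python
def greet(reader=input):
    """Ask for a name and return a greeting (illustrative helper)."""
    # The prompt is passed straight to input(), so no separate print() is needed.
    my_name = reader('What is your name? ')
    # input() always returns a str; an f-string builds the reply without .format().
    return f'Hi, {my_name}'
```

At an interactive prompt, `print(greet())` would wait for typed input and then print the greeting.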
12
Q

The len() Function

A

Evaluates to the integer number of characters in a string, or the number of items in a list, dictionary, etc.

*don’t use it to test emptiness

13
Q

Should you use len() to test emptiness?

A

No. Emptiness tests for strings, lists, dictionaries, etc. should not use len(); prefer direct boolean evaluation.
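
A small sketch of direct boolean evaluation; `describe` is a hypothetical helper name.

```python
def describe(collection):
    # Empty strings, lists, dicts, tuples, and sets are all falsy,
    # so no len(collection) > 0 comparison is needed.
    if collection:
        return 'not empty'
    return 'empty'
```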

14
Q

The str(), int(), and float() Functions

A

These functions allow you to change the type of a value. For example, you can convert an integer or float to a string, or a string to an integer or float.
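
A few illustrative conversions; the variable names are made up for the example.

```python
as_text = str(29)          # '29'
as_int = int('11')         # 11
as_float = float('3.14')   # 3.14
truncated = int(7.9)       # 7, because int() drops the fractional part
```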

15
Q

abs()

A

Return the absolute value of a number.

16
Q

aiter()

A

Return an asynchronous iterator for an asynchronous iterable.

17
Q

all()

A

Return True if all elements of the iterable are true.

18
Q

any()

A

Return True if any element of the iterable is true.

19
Q

ascii()

A

Return a string with a printable representation of an object.

20
Q

bin()

A

Convert an integer number to a binary string.

21
Q

bool()

A

Return a Boolean value.

22
Q

breakpoint()

A

Drops you into the debugger at the call site.

23
Q

bytearray()

A

Return a new array of bytes.

24
Q

bytes()

A

Return a new “bytes” object.

25
callable()
Return `True` if the object argument is callable, `False` if not.
26
chr()
Return the string representing a character.
27
classmethod()
Transform a method into a class method.
28
compile()
Compile the source into a code or AST object.
29
complex()
Return a complex number with the value `real + imag*1j`.
30
delattr()
Deletes the named attribute, provided the object allows it.
31
dict()
Create a new dictionary
32
dir()
Return the list of names in the current local scope
33
divmod()
Return a pair of numbers consisting of their quotient and remainder.
34
enumerate()
Return an enumerate object.
35
eval()
Evaluates and executes an expression.
36
exec()
This function supports dynamic execution of Python code.
37
filter()
Construct an iterator from those elements of an iterable for which the function returns true.
38
float()
Return a floating point number from a number or string
39
format()
Convert a value to a “formatted” representation
40
frozenset()
Return a new frozenset object
41
getattr()
Return the value of the named attribute of the object
42
globals()
Return the dictionary implementing the current module namespace.
43
hasattr()
Return `True` if the string is the name of one of the object's attributes.
44
hash()
Return the hash value of the object
45
help()
Invoke the built-in help system
46
hex()
Convert an integer number to a lowercase hexadecimal string
47
id()
Return the "identity" of an object
48
input()
This function takes an input and converts it into a string
49
int()
Return an integer object constructed from a number or string
50
isinstance()
Return `True` if the object argument is an instance of the classinfo argument
51
issubclass()
Return `True` if the class is a subclass of classinfo
52
iter()
Return an iterator object
53
len()
Return the length (the number of items) of an object.
54
list()
Rather than being a function, list is a mutable sequence type
55
locals()
Update and return a dictionary with the current local symbol table
56
map()
Return an iterator that applies function to every item of iterable
57
max()
Return the largest item in an iterable
58
min()
Return the smallest item in an iterable
59
next()
Retrieve the next item from the iterator.
60
object()
Return a new featureless object
61
oct()
Convert an integer to an octal string
62
open()
Open a file and return a corresponding file object
63
ord()
Return an integer representing the Unicode code point of a character
64
pow()
Return base to the power exp.
65
print()
Print objects to the text stream file
66
property()
Return a property attribute
67
repr()
Return a string containing a printable representation of an object
68
reversed()
Return a reverse iterator
69
round()
Return number rounded to ndigits precision after the decimal point.
70
set()
Return a new `set` object
71
setattr()
This is the counterpart of `getattr()`
72
slice()
Return a slice object representing a set of indices
73
sorted()
Return a new sorted list from the items in iterable
74
staticmethod()
Transform a method into a static method
75
str()
Return a str version of object
76
sum()
Sums start and the items of an iterable
77
super()
Return a proxy object that delegates method calls to a parent or sibling
78
tuple()
Rather than being a function, tuple is actually an immutable sequence type
79
vars()
Return the `__dict__` attribute for any other object with a `__dict__` attribute
80
zip()
Iterate over several iterables in parallel
81
__import__()
This function is invoked by the import statement
82
Comparison Operators
evaluate to `True` or `False` depending on the values you give them.
83
Boolean operators
There are 3: `and`, `or`, and `not`. The order of precedence, from highest to lowest, is `not`, `and`, then `or`.
84
The `and` Operator’s Truth Table
85
The `or` Operator’s Truth Table
86
The `not` Operator’s Truth Table
87
Can you mix boolean and comparison operators?
Yes.

```
>>> 2 + 2 == 4 and not 2 + 2 == 5 and 2 * 2 == 2 + 2
True
>>> # `and` binds tighter than `or`, so the statement below is grouped as
>>> # 5 > 4 or (3 < 4 and 5 > 5). 3 < 4 and 5 > 5 evaluates to False and
>>> # 5 > 4 evaluates to True, so the result of True or False is True.
>>> 5 > 4 or 3 < 4 and 5 > 5
True
>>> # Now the expression within parentheses is grouped first,
>>> # so True and False evaluates to False.
>>> (5 > 4 or 3 < 4) and 5 > 5
False
```
88
`if`, `elif`, `else`
The `if` statement evaluates an expression and, if that expression is `True`, executes the following indented code. An `elif` expression is evaluated only after the `if` expression (and any earlier `elif` expression) is `False`. The `else` block executes only if the `if` and all the `elif` expressions are `False`. The `elif` and `else` parts are optional.
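
A minimal sketch of the three branches; `classify` and the names it checks are illustrative.

```python
def classify(name):
    if name == 'Debora':
        return 'Hi Debora!'
    elif name == 'George':      # checked only when the `if` test was False
        return 'Hi George!'
    else:                       # runs only when every test above was False
        return 'Who are you?'
```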
89
Ternary Conditional Operator
Many programming languages have a ternary operator, which defines a conditional expression. The most common usage is to make a terse, simple conditional assignment statement. In other words, it offers one-line code to evaluate the first expression if the condition is true, and otherwise it evaluates the second expression. *Ternary operators can be chained.
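
In Python the conditional expression is written `<value_if_true> if <condition> else <value_if_false>`. A short sketch with an illustrative `age` value:

```python
age = 15  # illustrative value

status = 'kid' if age < 18 else 'adult'                          # 'kid'
# Chained: evaluated left to right, the first true condition wins.
label = 'kid' if age < 13 else 'teen' if age < 18 else 'adult'   # 'teen'
```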
90
Switch-Case Statement
In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map. Switch-case statements, or Structural Pattern Matching, were first introduced in 2020 via PEP 622, and then officially released with Python 3.10 in October 2021.
91
Matching single values
92
Matching with the or Pattern
93
Matching by the length of an Iterable
94
Matching default value:
95
Matching Builtin Classes
96
Guarding Match-Case Statements
97
`while` Loop Statements
The `while` statement is used for repeated execution as long as an expression is `True`:
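
A minimal sketch: the loop body repeats until the condition becomes `False`. The names `spam` and `lines` are illustrative.

```python
spam = 0
lines = []
while spam < 3:                 # re-checked before every iteration
    lines.append(f'Hello, world. ({spam})')
    spam += 1
```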
98
`break` Statements
If the execution reaches a `break` statement, it immediately exits the `while` loop’s clause:
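
A sketch of `break`; the iterator `names` stands in for user input so the example runs without a keyboard.

```python
names = iter(['Al', 'Bob', 'your name', 'Zed'])  # stands in for typed input
tried = []
while True:
    name = next(names)
    if name == 'your name':
        break                   # leaves the loop immediately
    tried.append(name)
```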
99
`continue` Statements
When the program execution reaches a `continue` statement, the program execution immediately jumps back to the start of the loop.
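
A sketch of `continue` that collects odd numbers; the variable names are illustrative.

```python
odds = []
number = 0
while number < 10:
    number += 1
    if number % 2 == 0:
        continue            # skip the append and re-test the loop condition
    odds.append(number)
```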
100
For loop
The `for` loop iterates over a `list`, `tuple`, `dictionary`, `set` or `string`:
101
The `range()` function
The `range()` function returns a sequence of numbers. By default it starts from `0`, increments by `1`, and stops before a specified number. `range()` also accepts up to three arguments: the first two are the `start` and `stop` values, and the third is the `step` argument. The step is the amount by which the variable is increased after each iteration. You can even use a negative number for the `step` argument to make the `for` loop count down instead of up.
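
The three call forms can be sketched as follows; `list()` is used only to make the generated numbers visible.

```python
first_five = list(range(5))          # stop only: [0, 1, 2, 3, 4]
evens = list(range(0, 10, 2))        # start, stop, step: [0, 2, 4, 6, 8]
countdown = list(range(5, -1, -1))   # negative step: [5, 4, 3, 2, 1, 0]
```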
102
`for else` statement
This lets you specify a statement to execute when the loop has run to completion without hitting a `break`. It is only useful when a `break` condition can occur in the loop:
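
A small sketch of `for ... else`; `find_three` is a hypothetical helper.

```python
def find_three(numbers):
    for n in numbers:
        if n == 3:
            result = 'found 3'
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        result = 'no 3 here'
    return result
```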
103
Ending a Program with `sys.exit()`
The `sys.exit()` function allows exiting Python.
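
A hedged sketch of `sys.exit()`; the function `main` and its usage message are hypothetical. `sys.exit()` works by raising `SystemExit`, which ends the program unless something catches it.

```python
import sys

def main(argv):
    if not argv:
        # A string argument is printed to stderr and the exit status is 1.
        sys.exit('usage: provide at least one argument')
    return argv[0]
```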
104
Function Arguments
A function can take arguments and return values. In the following example, the function `say_hello` receives the argument `name` and prints a greeting:
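
A minimal version of such a function might look like this; the greeting text is an assumption.

```python
def say_hello(name):
    print(f'Hello {name}')

say_hello('Carlos')   # Hello Carlos
```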
105
Keyword Arguments
To improve code readability, we should be as explicit as possible. We can achieve this in our functions by using Keyword Arguments:
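
A short sketch of keyword arguments; `say_hi` and its parameters are illustrative.

```python
def say_hi(name, greeting):
    return f'{greeting} {name}'

positional = say_hi('John', 'Hello')              # relies on argument order
explicit = say_hi(name='Anna', greeting='Hi')     # names make the call self-describing
reordered = say_hi(greeting='Hi', name='Anna')    # order no longer matters
```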
106
Return Values
When creating a function using the `def` statement, you can specify what the return value should be with a `return` statement. A return statement consists of the following: The `return` keyword. The value or expression that the function should return.
107
Local and Global Scope
Code in the `global scope` cannot use any `local variables`. However, a `local scope` can access `global variables`. Code in a function’s `local scope` cannot use variables in any other `local scope`. You can use the same name for different variables if they are in different scopes. That is, there can be a `local variable` named `spam` and a `global variable` also named `spam`.
108
The `global` Statement
If you need to modify a `global variable` from within a function, use the `global` statement:
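
A minimal sketch; the names `eggs` and `spam` are illustrative.

```python
eggs = 'global'

def spam():
    global eggs
    eggs = 'spam'   # rebinds the module-level name instead of creating a local

spam()
```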
109
What are the four rules to tell whether a variable is in a local scope or global scope?
  1. If a variable is being used in the `global scope` (that is, outside all functions), then it is always a `global variable`.
  2. If there is a `global` statement for that variable in a function, it is a `global variable`.
  3. Otherwise, if the variable is used in an `assignment statement` in the function, it is a `local variable`.
  4. But if the variable is **not** used in an assignment statement, it is a `global variable`.
110
Lambda Functions
In Python, a `lambda` function is a single-line, anonymous function, which can have any number of arguments, but it can only have one expression. Lambda functions can only evaluate an expression, like a single line of code. `lambda` is a minimal function definition that can be used inside an expression. Unlike FunctionDef, body holds a single node.
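
Two illustrative uses of `lambda`; the names `add`, `pairs`, and `by_number` are made up for the example.

```python
add = lambda x, y: x + y          # equivalent to a one-line def returning x + y

pairs = [(1, 'one'), (3, 'three'), (2, 'two')]
# A lambda is handy where a small function is needed inside an expression.
by_number = sorted(pairs, key=lambda pair: pair[0])
```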
111
Python Lists
Lists are one of the 4 data types in Python used to store collections of data. `['John', 'Peter', 'Debora', 'Charles']`
112
Getting list values with indexes | *lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> furniture[0]
# 'table'
>>> furniture[1]
# 'chair'
>>> furniture[2]
# 'rack'
>>> furniture[3]
# 'shelf'
```
113
Negative indexes | *lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> furniture[-1]
# 'shelf'
>>> furniture[-3]
# 'chair'
>>> f'The {furniture[-1]} is bigger than the {furniture[-3]}'
# 'The shelf is bigger than the chair'
```
114
Getting sublists with Slices | *lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> furniture[0:4]
# ['table', 'chair', 'rack', 'shelf']
>>> furniture[1:3]
# ['chair', 'rack']
>>> furniture[0:-1]
# ['table', 'chair', 'rack']
>>> furniture[:2]
# ['table', 'chair']
>>> furniture[1:]
# ['chair', 'rack', 'shelf']
>>> furniture[:]
# ['table', 'chair', 'rack', 'shelf']
```
115
`Slicing` the complete list will perform a copy: | *lists
```
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam2 = spam[:]
>>> spam2
# ['cat', 'bat', 'rat', 'elephant']
>>> spam.append('dog')
>>> spam
# ['cat', 'bat', 'rat', 'elephant', 'dog']
>>> spam2
# ['cat', 'bat', 'rat', 'elephant']
```
116
Getting a list length with `len()`
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> len(furniture)
# 4
```
117
Changing list values with `indexes`
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> furniture[0] = 'desk'
>>> furniture
# ['desk', 'chair', 'rack', 'shelf']
>>> furniture[2] = furniture[1]
>>> furniture
# ['desk', 'chair', 'chair', 'shelf']
>>> furniture[-1] = 'bed'
>>> furniture
# ['desk', 'chair', 'chair', 'bed']
```
118
Concatenation and Replication | *lists
```
>>> [1, 2, 3] + ['A', 'B', 'C']
# [1, 2, 3, 'A', 'B', 'C']
>>> ['X', 'Y', 'Z'] * 3
# ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z']
>>> my_list = [1, 2, 3]
>>> my_list = my_list + ['A', 'B', 'C']
>>> my_list
# [1, 2, 3, 'A', 'B', 'C']
```
119
Using `for` loops with Lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> for item in furniture:
...     print(item)
# table
# chair
# rack
# shelf
```
120
Getting the index in a loop with `enumerate()` | *lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> for index, item in enumerate(furniture):
...     print(f'index: {index} - item: {item}')
# index: 0 - item: table
# index: 1 - item: chair
# index: 2 - item: rack
# index: 3 - item: shelf
```
121
Loop in Multiple Lists with `zip()` | *lists
```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> price = [100, 50, 80, 40]
>>> for item, amount in zip(furniture, price):
...     print(f'The {item} costs ${amount}')
# The table costs $100
# The chair costs $50
# The rack costs $80
# The shelf costs $40
```
122
The `in` and `not in` operators | *lists
```
>>> 'rack' in ['table', 'chair', 'rack', 'shelf']
# True
>>> 'bed' in ['table', 'chair', 'rack', 'shelf']
# False
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> 'bed' not in furniture
# True
>>> 'rack' not in furniture
# False
```
123
The Multiple Assignment Trick | *lists
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:

```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> table = furniture[0]
>>> chair = furniture[1]
>>> rack = furniture[2]
>>> shelf = furniture[3]
```

You could type this line of code:

```
>>> furniture = ['table', 'chair', 'rack', 'shelf']
>>> table, chair, rack, shelf = furniture
>>> table
# 'table'
>>> chair
# 'chair'
>>> rack
# 'rack'
>>> shelf
# 'shelf'
```

The multiple assignment trick can also be used to swap the values in two variables:

```
>>> a, b = 'table', 'chair'
>>> a, b = b, a
>>> print(a)
# chair
>>> print(b)
# table
```
124
The `index` Method | *lists
The `index` method allows you to find the index of a value by passing the value itself:
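
A short sketch using the `furniture` list from the neighboring cards. Note that `index` raises `ValueError` when the value is not in the list.

```python
furniture = ['table', 'chair', 'rack', 'shelf']
position = furniture.index('chair')   # 1

try:
    furniture.index('bed')
    missing = False
except ValueError:                    # raised when the value is absent
    missing = True
```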
125
`append()` | *list
`append` adds an element to the **end** of a list
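
A minimal sketch with the `furniture` list used elsewhere in the deck:

```python
furniture = ['table', 'chair', 'rack', 'shelf']
furniture.append('bed')   # modifies the list in place and returns None
# furniture is now ['table', 'chair', 'rack', 'shelf', 'bed']
```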
126
`insert()` | *list
`insert` adds an element to a list at a **given position**:
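
A minimal sketch; the first argument is the index the new element will occupy.

```python
furniture = ['table', 'chair', 'rack', 'shelf']
furniture.insert(1, 'bed')   # existing items from index 1 onward shift right
# furniture is now ['table', 'bed', 'chair', 'rack', 'shelf']
```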
127
`del` | *list
`del` removes an item using the `index`:
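
A minimal sketch. `del` is a statement rather than a function, so it is written without call parentheses.

```python
furniture = ['table', 'chair', 'rack', 'shelf']
del furniture[2]    # removes 'rack'
# furniture is now ['table', 'chair', 'shelf']
```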
128
`remove()` | *list
`remove` removes an item by *its actual value* rather than its index. *If the value appears multiple times in the list, only the first instance of the value will be removed.
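
A minimal sketch with a duplicated value, chosen to show that only the first match is removed:

```python
furniture = ['table', 'chair', 'rack', 'chair']
furniture.remove('chair')   # removes the first 'chair' only
# furniture is now ['table', 'rack', 'chair']
```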
129
`pop()` | *list
By default, `pop` will *remove* and *return* the **last** item of the list. You can also pass the `index` of the element as an *optional* parameter:
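
A minimal sketch of both forms; the variable names are illustrative.

```python
furniture = ['table', 'chair', 'rack', 'shelf']
last = furniture.pop()      # 'shelf' (no index: last item)
first = furniture.pop(0)    # 'table' (explicit index)
# furniture is now ['chair', 'rack']
```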
130
Sorting values with `sort()` | *lists
- Sorts the list in place.
- You can also pass `True` for the `reverse` keyword argument to have `sort()` sort the values in reverse order.
- If you need to sort the values in regular alphabetical order, pass `str.lower` for the `key` keyword argument in the `sort()` method call.
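
The three points can be sketched as follows; the lists are illustrative.

```python
numbers = [2, 5, 3.14, 1, -7]
numbers.sort()                    # in place: [-7, 1, 2, 3.14, 5]

furniture = ['table', 'chair', 'rack', 'shelf']
furniture.sort(reverse=True)      # ['table', 'shelf', 'rack', 'chair']

letters = ['a', 'z', 'A', 'Z']
letters.sort(key=str.lower)       # case-insensitive: ['a', 'A', 'z', 'Z']
```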
131
Sorting values with `sorted()` | *lists
You can use the built-in function `sorted` to *return* a **new list**
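
A minimal sketch showing that the original list is left untouched:

```python
furniture = ['table', 'chair', 'rack', 'shelf']
ordered = sorted(furniture)   # ['chair', 'rack', 'shelf', 'table']
# furniture itself is unchanged
```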
132
Tuples vs Lists
The key **difference** between `tuples` and `lists` is that, while `tuples`, like strings, are **immutable** objects, `lists` are **mutable**. This means that `tuples` **cannot be changed** while `lists` **can be modified.** `Tuples` are also *more memory efficient* than lists.
133
Converting between `list()` and `tuple()`
```
>>> tuple(['cat', 'dog', 5])
# ('cat', 'dog', 5)
>>> list(('cat', 'dog', 5))
# ['cat', 'dog', 5]
>>> list('hello')
# ['h', 'e', 'l', 'l', 'o']
```
134
Python Dictionaries
In Python, a dictionary is an ordered (from Python 3.7 onward) collection of `key: value` pairs. The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with `del`.
135
Set `key, value` using subscript operator `[]` | *dict
```
>>> my_cat = {
...     'size': 'fat',
...     'color': 'gray',
...     'disposition': 'loud',
... }
>>> my_cat['age_years'] = 2
>>> print(my_cat)
# {'size': 'fat', 'color': 'gray', 'disposition': 'loud', 'age_years': 2}
```
136
Get `value` using subscript operator `[]` | *dict
```
>>> my_cat = {
...     'size': 'fat',
...     'color': 'gray',
...     'disposition': 'loud',
... }
>>> print(my_cat['size'])
# fat
>>> print(my_cat['eye_color'])
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# KeyError: 'eye_color'
```

*If the key is not present in the dictionary, `KeyError` is raised.
137
`values()` | *dict
The `values()` method gets the values of the dictionary:

```
>>> pet = {'color': 'red', 'age': 42}
>>> for value in pet.values():
...     print(value)
...
# red
# 42
```
138
`keys()` | *dict
The `keys()` method gets the **keys** of the dictionary.

```
>>> pet = {'color': 'red', 'age': 42}
>>> for key in pet.keys():
...     print(key)
...
# color
# age
```

There is no need to use `.keys()` since by default you will loop through keys.

```
>>> pet = {'color': 'red', 'age': 42}
>>> for key in pet:
...     print(key)
...
# color
# age
```
139
`items()` | *dict
The `items()` method gets the items of a dictionary and returns each one as a `tuple`:

```
>>> pet = {'color': 'red', 'age': 42}
>>> for item in pet.items():
...     print(item)
...
# ('color', 'red')
# ('age', 42)
```

Using the `keys()`, `values()`, and `items()` methods, a `for` loop can iterate over the **keys, values, or key-value pairs** in a dictionary, respectively.

```
>>> pet = {'color': 'red', 'age': 42}
>>> for key, value in pet.items():
...     print(f'Key: {key} Value: {value}')
...
# Key: color Value: red
# Key: age Value: 42
```
140
`get()` | *dict
The `get()` method returns the value of an item with the given key. If the key doesn’t exist, it returns `None`.

```
>>> wife = {'name': 'Rose', 'age': 33}
>>> f'My wife name is {wife.get("name")}'
# 'My wife name is Rose'
>>> f'She is {wife.get("age")} years old.'
# 'She is 33 years old.'
>>> f'She is deeply in love with {wife.get("husband")}'
# 'She is deeply in love with None'
```

You can also change the default `None` value to one of your choice.

```
>>> wife = {'name': 'Rose', 'age': 33}
>>> f'She is deeply in love with {wife.get("husband", "lover")}'
# 'She is deeply in love with lover'
```
141
Adding items with `setdefault()` | *dict
It’s possible to add an item to a dictionary in this way:

```
>>> wife = {'name': 'Rose', 'age': 33}
>>> if 'has_hair' not in wife:
...     wife['has_hair'] = True
```

Using the `setdefault` method, we can make the same code shorter:

```
>>> wife = {'name': 'Rose', 'age': 33}
>>> wife.setdefault('has_hair', True)
# True
>>> wife
# {'name': 'Rose', 'age': 33, 'has_hair': True}
```
142
`pop()` | *dict
The `pop()` method *removes* and *returns* an item based on **a given key**.

```
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'}
>>> wife.pop('age')
# 33
>>> wife
# {'name': 'Rose', 'hair': 'brown'}
```
143
`popitem()` | *dict
The `popitem()` method *removes* the **last item** in a dictionary and *returns* it.

```
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'}
>>> wife.popitem()
# ('hair', 'brown')
>>> wife
# {'name': 'Rose', 'age': 33}
```
144
`del` | *dict
The `del` statement *removes* an item based on **a given key**.

```
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'}
>>> del wife['age']
>>> wife
# {'name': 'Rose', 'hair': 'brown'}
```
145
`clear()` | *dict
The `clear()` method *removes* **all** the items in a dictionary.

```
>>> wife = {'name': 'Rose', 'age': 33, 'hair': 'brown'}
>>> wife.clear()
>>> wife
# {}
```
146
Checking `keys` in a Dictionary
```
>>> person = {'name': 'Rose', 'age': 33}
>>> 'name' in person.keys()
# True
>>> 'height' in person.keys()
# False
>>> 'skin' in person  # You can omit keys()
# False
```
147
Checking `values` in a Dictionary
```
>>> person = {'name': 'Rose', 'age': 33}
>>> 'Rose' in person.values()
# True
>>> 33 in person.values()
# True
```
148
Pretty Printing | *dict
```
>>> import pprint
>>> wife = {'name': 'Rose', 'age': 33, 'has_hair': True, 'hair_color': 'brown', 'height': 1.6, 'eye_color': 'brown'}
>>> pprint.pprint(wife)
# {'age': 33,
#  'eye_color': 'brown',
#  'hair_color': 'brown',
#  'has_hair': True,
#  'height': 1.6,
#  'name': 'Rose'}
```
149
Merge two dictionaries
```
>>> dict_a = {'a': 1, 'b': 2}
>>> dict_b = {'b': 3, 'c': 4}
>>> dict_c = {**dict_a, **dict_b}
>>> dict_c
# {'a': 1, 'b': 3, 'c': 4}
```
150
Python Sets
A `set` is an **unordered collection** with **no duplicate** elements. **Basic uses include** *membership testing* and *eliminating duplicate* entries.

```
>>> s = {1, 2, 3}
>>> s = set([1, 2, 3])
>>> s = {}  # this will create a dictionary instead of a set
>>> type(s)
# <class 'dict'>
```

A `set` automatically **removes** all the duplicate values.

```
>>> s = {1, 2, 3, 2, 3, 4}
>>> s
# {1, 2, 3, 4}
```

And as an **unordered** data type, sets *can’t be indexed.*

```
>>> s = {1, 2, 3}
>>> s[0]
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: 'set' object is not subscriptable
```
151
Initializing a `set`
There are **two** ways to create sets: using curly braces `{}` and the built-in function `set()`. NOTE: When creating a `set`, be sure **not to use empty curly braces** `{}` or you will get **an empty dictionary instead.**
152
set: `add()` and `update()` | *set
Using the `add()` method we can add a **single element** to the set.

```
>>> s = {1, 2, 3}
>>> s.add(4)
>>> s
# {1, 2, 3, 4}
```

And with `update()`, **multiple ones:**

```
>>> s = {1, 2, 3}
>>> s.update([2, 3, 4, 5, 6])
>>> s
# {1, 2, 3, 4, 5, 6}
```
153
set: `remove()` and `discard()`
Both methods will remove an element from the set, but `remove()` will raise a `KeyError` if the value doesn’t exist.

```
>>> s = {1, 2, 3}
>>> s.remove(3)
>>> s
# {1, 2}
>>> s.remove(3)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# KeyError: 3
```

`discard()` won’t raise any errors.

```
>>> s = {1, 2, 3}
>>> s.discard(3)
>>> s
# {1, 2}
>>> s.discard(3)
```
154
set: **union**
`union()` or `|` will create **a new set** with *all* the elements from the sets provided.

```
>>> s1 = {1, 2, 3}
>>> s2 = {3, 4, 5}
>>> s1.union(s2)  # or 's1 | s2'
# {1, 2, 3, 4, 5}
```
155
set: **intersection**
`intersection()` or `&` will return a set with *only* the elements that are **common to all of them**.

```
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s3 = {3, 4, 5}
>>> s1.intersection(s2, s3)  # or 's1 & s2 & s3'
# {3}
```
156
set: **difference**
`difference()` or `-` will return *only* the elements that are **unique to the first set** (invoked set).

```
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.difference(s2)  # or 's1 - s2'
# {1}
>>> s2.difference(s1)  # or 's2 - s1'
# {4}
```
157
set: **symmetric_difference**
`symmetric_difference()` or `^` will return *all the elements* that are **not common** between them.
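
A minimal sketch using the same style of sets as the neighboring cards; the result names are illustrative.

```python
s1 = {1, 2, 3}
s2 = {2, 3, 4}

via_method = s1.symmetric_difference(s2)   # {1, 4}
via_operator = s1 ^ s2                     # same result
```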
158
**List** Comprehensions
List Comprehensions are a special kind of syntax that lets us create lists out of other lists, and are incredibly useful when dealing with numbers and with one or two levels of nested for loops.

List comprehensions provide a concise way to create lists. [...] or to create a subsequence of those elements that satisfy a certain condition.

This is how we create a new list from an existing collection with a `for` loop:

```
>>> names = ['Charles', 'Susan', 'Patrick', 'George']
>>> new_list = []
>>> for n in names:
...     new_list.append(n)
...
>>> new_list
# ['Charles', 'Susan', 'Patrick', 'George']
```

And this is how we do the same with a List Comprehension:

```
>>> names = ['Charles', 'Susan', 'Patrick', 'George']
>>> new_list = [n for n in names]
>>> new_list
# ['Charles', 'Susan', 'Patrick', 'George']
```

We can do the same with numbers:

```
>>> n = [(a, b) for a in range(1, 3) for b in range(1, 3)]
>>> n
# [(1, 1), (1, 2), (2, 1), (2, 2)]
```

*The basics of `list` comprehensions also apply to sets and dictionaries.
159
Adding conditionals to list comprehensions
If we want `new_list` to have only the names that start with C, with a for loop, we would do it like this:

```
>>> names = ['Charles', 'Susan', 'Patrick', 'George', 'Carol']
>>> new_list = []
>>> for n in names:
...     if n.startswith('C'):
...         new_list.append(n)
...
>>> print(new_list)
# ['Charles', 'Carol']
```

In a List Comprehension, we add the `if` statement at the end:

```
>>> new_list = [n for n in names if n.startswith('C')]
>>> print(new_list)
# ['Charles', 'Carol']
```

To use an `if-else` statement in a List Comprehension:

```
>>> nums = [1, 2, 3, 4, 5, 6]
>>> new_list = [num*2 if num % 2 == 0 else num for num in nums]
>>> print(new_list)
# [1, 4, 3, 8, 5, 12]
```
160
**Set** comprehension
```
>>> b = {"abc", "def"}
>>> {s.upper() for s in b}
{"ABC", "DEF"}
```
161
**Dict** comprehension
```
>>> c = {'name': 'Pooka', 'age': 5}
>>> {v: k for k, v in c.items()}
{'Pooka': 'name', 5: 'age'}
```

A List comprehension can be generated from a dictionary:

```
>>> c = {'name': 'Pooka', 'age': 5}
>>> ["{}:{}".format(k.upper(), v) for k, v in c.items()]
['NAME:Pooka', 'AGE:5']
```
162
Escape characters
An escape character is created by typing a backslash `\` followed by the character you want to insert.
163
`\'`
Single quote
164
`\"`
Double quote
165
\t
Tab
166
\n
Newline (line break)
167
\\
Backslash
168
\b
backspace
169
\ooo
Octal value
170
\r
carriage return
171
Raw strings
**A raw string** entirely *ignores all escape characters* and **prints** any backslash that appears in the string. *Mostly used for regex.
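
A short sketch comparing a raw string with a normal one, plus a typical regex use; the sample text is made up.

```python
import re

normal = 'Hello there!\nHow are you?'   # \n is a single newline character
raw = r'Hello there!\nHow are you?'     # backslash and n stay as two characters

# Raw strings keep regex patterns readable: \d+ needs no doubled backslash.
digits = re.findall(r'\d+', 'room 101, floor 3')   # ['101', '3']
```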
172
Multiline Strings
173
Indexing and Slicing strings
```
H   e   l   l   o       w   o   r   l   d    !
0   1   2   3   4   5   6   7   8   9   10   11
```

Indexing:

```
>>> spam = 'Hello world!'
>>> spam[0]
# 'H'
>>> spam[4]
# 'o'
>>> spam[-1]
# '!'
```

Slicing:

```
>>> spam = 'Hello world!'
>>> spam[0:5]
# 'Hello'
>>> spam[:5]
# 'Hello'
>>> spam[6:]
# 'world!'
>>> spam[6:-1]
# 'world'
>>> spam[:-1]
# 'Hello world'
>>> spam[::-1]
# '!dlrow olleH'
>>> fizz = spam[0:5]
>>> fizz
# 'Hello'
```
174
The `in` and `not in` operators *strings
```
>>> 'Hello' in 'Hello World'
# True
>>> 'Hello' in 'Hello'
# True
>>> 'HELLO' in 'Hello World'
# False
>>> '' in 'spam'
# True
>>> 'cats' not in 'cats and dogs'
# False
```
175
`upper()`, `lower()` and `title()` *strings
Transforms a string to upper, lower and title case:

```
>>> greet = 'Hello world!'
>>> greet.upper()
# 'HELLO WORLD!'
>>> greet.lower()
# 'hello world!'
>>> greet.title()
# 'Hello World!'
```
176
`isupper()` and `islower()` methods
Returns `True` or `False` after evaluating if a string is in upper or lower case:
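
A few illustrative checks. A string with no cased characters (such as digits only) is neither upper nor lower.

```python
shout_is_upper = 'HELLO'.isupper()      # True
mixed_is_upper = 'Hello'.isupper()      # False
quiet_is_lower = 'hello'.islower()      # True
digits_is_lower = '12345'.islower()     # False: no cased characters at all
```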
177
`isalpha()`
Returns `True` if the string consists *only of* **letters**.
178
`isalnum()`
Returns `True` if the string consists *only of* **letters** and **numbers**.
179
`isdecimal()`
Returns `True` if the string consists *only of* **numbers**.
180
`isspace()`
Returns `True` if the string consists *only of* **spaces, tabs,** and **new-lines**.
181
`istitle()`
Returns `True` if the string consists *only of words* that **begin** with an **uppercase letter** followed by *only* **lowercase characters.**
182
`startswith()` and `endswith()` *strings
``` >>> 'Hello world!'.startswith('Hello') # True >>> 'Hello world!'.endswith('world!') # True >>> 'abc123'.startswith('abcdef') # False >>> 'abc123'.endswith('12') # False >>> 'Hello world!'.startswith('Hello world!') # True >>> 'Hello world!'.endswith('Hello world!') # True ```
183
`join()`
The `join()` method takes all the items in an iterable, like a *list, dictionary, tuple or set*, and **joins** them into a **string**. You can also specify a *separator*.
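
A short sketch: the separator is the string that `join()` is called on. The sample values are illustrative; joining a dictionary joins its keys.

```python
comma_list = ', '.join(['cats', 'rats', 'bats'])       # 'cats, rats, bats'
sentence = ' '.join(('My', 'name', 'is', 'Simon'))     # 'My name is Simon'
dict_keys = '-'.join({'a': 1, 'b': 2})                 # 'a-b'
```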
184
`split()`
The `split()` method **splits a string** into a **list.** By **default**, it uses **whitespace** to separate the items, but you can also set another separator of your choice:

```
>>> 'My name is Simon'.split()
# ['My', 'name', 'is', 'Simon']
>>> 'MyABCnameABCisABCSimon'.split('ABC')
# ['My', 'name', 'is', 'Simon']
>>> 'My name is Simon'.split('m')
# ['My na', 'e is Si', 'on']
>>> ' My  name is  Simon'.split()
# ['My', 'name', 'is', 'Simon']
>>> ' My  name is  Simon'.split(' ')
# ['', 'My', '', 'name', 'is', '', 'Simon']
```
185
Justifying text with `rjust()`, `ljust()` and `center()`
```
>>> 'Hello'.rjust(10)
# '     Hello'
>>> 'Hello'.rjust(20)
# '               Hello'
>>> 'Hello World'.rjust(20)
# '         Hello World'
>>> 'Hello'.ljust(10)
# 'Hello     '
>>> 'Hello'.center(20)
# '       Hello        '
```

An optional second argument to `rjust()`, `ljust()`, and `center()` specifies a **fill character** other than a space:

```
>>> 'Hello'.rjust(20, '*')
# '***************Hello'
>>> 'Hello'.ljust(20, '-')
# 'Hello---------------'
>>> 'Hello'.center(20, '=')
# '=======Hello========'
```
186
Removing whitespace with `strip()`, `rstrip()`, and `lstrip()`
```
>>> spam = '    Hello World     '
>>> spam.strip()
# 'Hello World'
>>> spam.lstrip()
# 'Hello World     '
>>> spam.rstrip()
# '    Hello World'
>>> spam = 'SpamSpamBaconSpamEggsSpamSpam'
>>> spam.strip('ampS')
# 'BaconSpamEggs'
```
187
`count()` *strings
Counts the **number of occurrences** of a **given character or substring** in the string it is applied to. Optionally accepts a `start` and `end` **index**.

```
>>> sentence = 'one sheep two sheep three sheep four'
>>> sentence.count('sheep')
# 3
>>> sentence.count('e')
# 9
>>> sentence.count('e', 6)
# 8
# returns count of e after 'one sh' i.e. 6 chars since beginning of string
>>> sentence.count('e', 7)
# 7
```
188
`replace()` *strings
**Replaces *all* occurrences** of a given substring with another substring. An optional third argument *limits* the number of *replacements*. Returns a **new string**; the match is case-sensitive.

```
>>> text = "Hello, world!"
>>> text.replace("world", "planet")
# 'Hello, planet!'

>>> fruits = "apple, banana, cherry, apple"
>>> fruits.replace("apple", "orange", 1)
# 'orange, banana, cherry, apple'

>>> sentence = "I like apples, Apples are my favorite fruit"
>>> sentence.replace("apples", "oranges")
# 'I like oranges, Apples are my favorite fruit'
```
189
Python String Formatting
The formatting operations described here (`%` operator) exhibit a variety of quirks that lead to a number of common errors [...]. Using the newer formatted string literals [...] helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
190
`%` operator (`%d`) *strings
```
>>> name = 'Pete'
>>> 'Hello %s' % name
# "Hello Pete"
```

We can use the `%d` format specifier to convert an int value to a string:

```
>>> num = 5
>>> 'I have %d apples' % num
# "I have 5 apples"
```

NOTE: For new code, using `str.format` or formatted string literals (Python 3.6+) over the `%` operator is strongly recommended.
191
`str.format`
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular. ``` >>> name = 'John' >>> age = 20 >>> "Hello I'm {}, my age is {}".format(name, age) # "Hello I'm John, my age is 20" >>> "Hello I'm {0}, my age is {1}".format(name, age) # "Hello I'm John, my age is 20" ```
192
Formatted String Literals or f-Strings
A formatted string literal, or `f-string`, is a string literal that is prefixed with `f` or `F`. These strings may contain replacement fields, which are expressions delimited by curly braces `{}`. While *other string literals* **always have a constant value**, *formatted strings are really expressions* **evaluated at run time**.

```
>>> name = 'Elizabeth'
>>> f'Hello {name}!'
# 'Hello Elizabeth!'
```

It is even possible to do inline arithmetic with it:

```
>>> a = 5
>>> b = 10
>>> f'Five plus ten is {a + b} and not {2 * (a + b)}.'
# 'Five plus ten is 15 and not 30.'
```

If you are using Python 3.6+, f-strings are the recommended way to format strings.
193
Multiline `f-Strings`
```
>>> name = 'Robert'
>>> messages = 12
>>> (
... f'Hi, {name}. '
... f'You have {messages} unread messages'
... )
# 'Hi, Robert. You have 12 unread messages'
```
194
The `=` specifier *fstrings
This will print the expression and its value: ``` >>> from datetime import datetime >>> now = datetime.now().strftime("%b/%d/%Y - %H:%M:%S") >>> f'date and time: {now=}' # "date and time: now='Nov/14/2022 - 20:50:01'" ```
195
Adding spaces or characters *fstrings
```
>>> name = 'Robert'
>>> f"{name.upper() = :-^20}"
# 'name.upper() = -------ROBERT-------'

>>> f"{name.upper() = :^20}"
# 'name.upper() =        ROBERT       '

>>> f"{name.upper() = :20}"
# 'name.upper() = ROBERT              '
```
196
Adding **thousands separator** *fstrings
```
>>> a = 10000000
>>> f"{a:,}"
# '10,000,000'
```
197
**rounding** *fstrings
```
>>> a = 3.1415926
>>> f"{a:.2f}"
# '3.14'
```
198
showing as a **Percentage** | *fstrings
```
>>> a = 0.816562
>>> f"{a:.2%}"
# '81.66%'
```
199
Number: `3.1415926` Format: `{:.2f}` Output...
`3.14` Format float 2 decimal places
200
Number: `3.1415926` Format: `{:+.2f}` Output...
`+3.14` Format float 2 decimal places with sign
201
Number: `-1` Format: `{:+.2f}` Output...
`-1.00` Format float 2 decimal places with sign
202
Number: `2.71828` Format: `{:.0f}` Output...
`3` Format float with no decimal places
203
Number: `4` Format: `{:0>2d}` Output...
`04` Pad number with zeros (left padding, width 2)
204
Number: `4` Format: `{:x<4d}` Output...
`4xxx` Pad number with x’s (right padding, width 4)
205
Number: `10` Format: `{:x<4d}` Output...
`10xx` Pad number with x’s (right padding, width 4)
206
Number: `1000000` Format: `{:,}` Output...
`1,000,000` Number format with comma separator
207
Number: `0.35` Format: `{:.2%}` Output...
`35.00%` Format percentage
208
Number: `1000000000` Format: `{:.2e}` Output...
`1.00e+09` Exponent notation
209
Number: `11` Format: `{:11d}` Output...
`         11` Right-aligned (default for numbers, width 11)
210
Number: `11` Format: `{:<11d}` Output...
`11         ` Left-aligned (width 11)
211
Number: `11` Format: `{:^11d}` Output...
`    11     ` Center aligned (width 11)
212
Template Strings
A simpler and less powerful mechanism, but it is recommended when handling strings generated by users. Due to their reduced complexity, template strings are a safer choice. ``` >>> from string import Template >>> name = 'Elizabeth' >>> t = Template('Hey $name!') >>> t.substitute(name=name) # 'Hey Elizabeth!' ```
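One way to see the reduced complexity in practice is the difference between `substitute()` and `safe_substitute()`. The sketch below is illustrative only; the `$amount` placeholder is made up for the example.

```python
from string import Template

t = Template('Hey $name, you owe $amount!')

# safe_substitute() leaves placeholders without a value untouched.
partial = t.safe_substitute(name='Elizabeth')

# substitute() raises KeyError when a placeholder has no value.
try:
    t.substitute(name='Elizabeth')
    missing = None
except KeyError as err:
    missing = err.args[0]
```

Here `partial` keeps the literal `$amount` in the result, while `missing` records the name of the placeholder that `substitute()` could not fill.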
213
Regular Expressions
A regular expression (shortened as regex [...]) is a sequence of characters that specifies a search pattern in text. [...] used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.

1. Import the regex module with `import re`.
2. Create a **Regex object** with the `re.compile()` function. (Remember to use a *raw string*.)
3. Pass the string you want to search into the **Regex object’s** `search()` method. This returns a **Match object**.
4. Call the **Match object’s** `group()` method to *return a string* of the actual matched text.
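The four steps can be sketched as follows. The pattern and sample text are illustrative; `search()` returns `None` when nothing matches, so the result is checked before calling `group()`.

```python
import re                                                     # step 1

phone_num_regex = re.compile(r'\d{3}-\d{3}-\d{4}')            # step 2: raw string pattern
mo = phone_num_regex.search('My number is 415-555-4242.')     # step 3: Match object or None
found = mo.group() if mo is not None else None                # step 4: the matched text

no_match = phone_num_regex.search('No phone number here.')    # None, nothing to group()
```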
214
`?` | *regex
**zero** *or* **one** of the preceding group.
215
`*` | *regex
**zero** *or* **more** of the preceding group.
216
`+` | *regex
**one** *or* **more** of the preceding group.
217
`{n}` | *regex
**exactly n** of the preceding group.
218
`{n,}` | *regex
**n** *or* **more** of the preceding group.
219
`{,m}` | *regex
**0 to m** of the preceding group.
220
`{n,m}` | *regex
**at least n** *and* **at most m** of the preceding group.
221
`{n,m}?` or `*?` or `+?` | *regex
performs a non-greedy match of the preceding group.
222
`^spam` | *regex
means the string **must begin** with *spam*.
223
`spam$` | *regex
means the string **must end** with *spam.*
224
`.` | *regex
*any* character, **except** **newline** characters.
225
`\d`, `\w`, and `\s` | *regex
a **digit, word**, or **space** character, respectively.
226
`\D`, `\W`, and `\S` | *regex
**anything *except*** a digit, word, or space, respectively.
227
`[abc]` | *regex
**any** character **between** the brackets (such as a, b, or c).
228
`[^abc]` | *regex
**any** character that **isn’t between** the brackets.
229
Matching regex objects | *regex
``` >>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') >>> mo = phone_num_regex.search('My number is 415-555-4242.') >>> print(f'Phone number found: {mo.group()}') # Phone number found: 415-555-4242 ```
230
Grouping regex with parentheses | *regex
Calling `group()` with no argument returns the entire match:

```
>>> print(f'Phone number found: {mo.group()}')
# Phone number found: 415-555-4242
```

Grouping with parentheses lets you retrieve parts of the match individually:

```
>>> phone_num_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo = phone_num_regex.search('My number is 415-555-4242.')
>>> mo.group(1)
# '415'

>>> mo.group(2)
# '555-4242'

>>> mo.group(0)
# '415-555-4242'

>>> mo.group()
# '415-555-4242'
```

To retrieve **all the groups at once**, use the `groups()` method:

```
>>> mo.groups()
# ('415', '555-4242')

>>> area_code, main_number = mo.groups()
>>> print(area_code)
# 415

>>> print(main_number)
# 555-4242
```
231
Multiple groups with Pipe *regex
You can use the `|` character anywhere you want to match **one of many** expressions. When several alternatives occur in the text, the first one found is returned.

```
>>> hero_regex = re.compile(r'Batman|Tina Fey')
>>> mo1 = hero_regex.search('Batman and Tina Fey.')
>>> mo1.group()
# 'Batman'

>>> mo2 = hero_regex.search('Tina Fey and Batman.')
>>> mo2.group()
# 'Tina Fey'
```

You can also use the pipe to match **one of several patterns** as part of your regex:

```
>>> bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')
>>> mo = bat_regex.search('Batmobile lost a wheel')
>>> mo.group()
# 'Batmobile'

>>> mo.group(1)
# 'mobile'
```
232
Optional matching with the Question Mark *regex
The `?` character flags the group that precedes it as an **optional part of the pattern.** ``` >>> bat_regex = re.compile(r'Bat(wo)?man') >>> mo1 = bat_regex.search('The Adventures of Batman') >>> mo1.group() # 'Batman' >>> mo2 = bat_regex.search('The Adventures of Batwoman') >>> mo2.group() # 'Batwoman' ```
233
Matching zero or more with the Star *regex
The `*` (star or asterisk) means “match zero or more”. The group that precedes the star **can occur any number of times** in the text.

```
>>> bat_regex = re.compile(r'Bat(wo)*man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()
# 'Batman'

>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()
# 'Batwoman'

>>> mo3 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo3.group()
# 'Batwowowowoman'
```
234
Matching one or more with the Plus *regex
The `+` (or plus) means match one or more. The group preceding a plus **must appear at least once**: ``` >>> bat_regex = re.compile(r'Bat(wo)+man') >>> mo1 = bat_regex.search('The Adventures of Batwoman') >>> mo1.group() # 'Batwoman' >>> mo2 = bat_regex.search('The Adventures of Batwowowowoman') >>> mo2.group() # 'Batwowowowoman' >>> mo3 = bat_regex.search('The Adventures of Batman') >>> mo3 is None # True ```
235
Matching specific repetitions with Curly Brackets
If you have a group that you want to **repeat a specific number of times**, follow the group in your regex with *a number in curly brackets*:

```
>>> ha_regex = re.compile(r'(Ha){3}')
>>> mo1 = ha_regex.search('HaHaHa')
>>> mo1.group()
# 'HaHaHa'

>>> mo2 = ha_regex.search('Ha')
>>> mo2 is None
# True
```

Instead of one number, you can specify a **range with a minimum and a maximum** between the curly brackets. For example, the regex `(Ha){3,5}` will match ‘HaHaHa’, ‘HaHaHaHa’, and ‘HaHaHaHaHa’.

```
>>> ha_regex = re.compile(r'(Ha){2,3}')
>>> mo1 = ha_regex.search('HaHaHaHa')
>>> mo1.group()
# 'HaHaHa'
```
236
Greedy and non-greedy matching *regex
Python’s regular expressions are **greedy by default** : in ambiguous situations they *will match the **longest** string possible*. The *non-greedy version* of the curly brackets, which matches the **shortest** string possible, has the *closing curly bracket* **followed by a question mark**. ``` >>> greedy_ha_regex = re.compile(r'(Ha){3,5}') >>> mo1 = greedy_ha_regex.search('HaHaHaHaHa') >>> mo1.group() # 'HaHaHaHaHa' >>> non_greedy_ha_regex = re.compile(r'(Ha){3,5}?') >>> mo2 = non_greedy_ha_regex.search('HaHaHaHaHa') >>> mo2.group() # 'HaHaHa' ```
237
The `findall()` method *regex
The `findall()` method will *return* the strings of *every match* in the searched string.

```
>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') # has no groups
>>> phone_num_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
# ['415-555-9999', '212-555-0000']
```
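The "has no groups" remark matters: when the pattern does contain groups, `findall()` returns a list of tuples with one string per group instead of the full match. A minimal sketch using the same sample text:

```python
import re

# Same phone pattern, but each part is wrapped in a group.
grouped_regex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')
matches = grouped_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
# matches is a list of tuples, one tuple per phone number
```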
238
Making your own character classes `[ ]` and `[a-zA-Z0-9]` *regex
You can define your own character class using square brackets. For example, the character class `[aeiouAEIOU]` will match *any* vowel, both lowercase and uppercase. You can also include **ranges** of letters or numbers by using **a hyphen**. For example, the character class `[a-zA-Z0-9]` will match *all* lowercase letters, uppercase letters, and numbers.

```
>>> vowel_regex = re.compile(r'[aeiouAEIOU]')
>>> vowel_regex.findall('Robocop eats baby food. BABY FOOD.')
# ['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']
```

By placing a caret character (`^`) just after the character class’s opening bracket, you can make a **negative character class** that will match *all the characters* that are **not** in the character class:

```
>>> consonant_regex = re.compile(r'[^aeiouAEIOU]')
>>> consonant_regex.findall('Robocop eats baby food. BABY FOOD.')
# ['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ',
# 'B', 'B', 'Y', ' ', 'F', 'D', '.']
```
239
Making your own character classes `[^aeiouAEIOU]` *regex
By placing a caret character (`^`) just after the character class’s opening bracket, you can make a **negative character class** that will match *all* the characters that are **not** in the character class:

```
>>> consonant_regex = re.compile(r'[^aeiouAEIOU]')
>>> consonant_regex.findall('Robocop eats baby food. BABY FOOD.')
# ['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ',
# 'B', 'B', 'Y', ' ', 'F', 'D', '.']
```
240
The Caret and Dollar sign characters *regex
- You can also use the caret symbol `^` at the **start of a regex** to indicate that a match **must occur at the beginning** of the searched text.
- Likewise, you can put a dollar sign `$` at the **end of the regex** to indicate the string **must end with** this regex pattern.
- And you can use the `^` and `$` *together* to indicate that **the entire string must match the regex**.

The `r'^Hello'` regular expression string matches strings that begin with ‘Hello’:

```
>>> begins_with_hello = re.compile(r'^Hello')
>>> begins_with_hello.search('Hello world!')
# <_sre.SRE_Match object; span=(0, 5), match='Hello'>

>>> begins_with_hello.search('He said hello.') is None
# True
```

The `r'^\d+$'` regular expression string matches strings that consist entirely of one or more numeric characters from 0 to 9:

```
>>> whole_string_is_num = re.compile(r'^\d+$')
>>> whole_string_is_num.search('1234567890')
# <_sre.SRE_Match object; span=(0, 10), match='1234567890'>

>>> whole_string_is_num.search('12345xyz67890') is None
# True

>>> whole_string_is_num.search('12 34567890') is None
# True
```
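For the dollar sign on its own, a hedged sketch: the pattern `r'\d$'` (the name `ends_with_number` and the sample sentences are illustrative) matches only when the string ends with a digit.

```python
import re

ends_with_number = re.compile(r'\d$')

# The last character is a digit, so the match is that final digit.
last_digit = ends_with_number.search('Your number is 42').group()

# The string ends with a period, so there is no match.
no_match = ends_with_number.search('Your number is forty two.')
```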
241
The Wildcard character
The `.` (or dot) character in a regular expression will match *any* character **except for a newline**:

```
>>> at_regex = re.compile(r'.at')
>>> at_regex.findall('The cat in the hat sat on the flat mat.')
# ['cat', 'hat', 'sat', 'lat', 'mat']
```
242
Matching everything with Dot-Star | *regex
The `.*` uses **greedy mode**: it will always try to match *as much text as possible*.

```
>>> name_regex = re.compile(r'First Name: (.*) Last Name: (.*)')
>>> mo = name_regex.search('First Name: Al Last Name: Sweigart')
>>> mo.group(1)
# 'Al'

>>> mo.group(2)
# 'Sweigart'
```

To match any and all text in a **non-greedy** fashion, use the dot, star, and question mark (`.*?`). The question mark tells Python to match in a non-greedy way:

```
>>> non_greedy_regex = re.compile(r'<.*?>')
>>> mo = non_greedy_regex.search('<To serve man> for dinner.>')
>>> mo.group()
# '<To serve man>'

>>> greedy_regex = re.compile(r'<.*>')
>>> mo = greedy_regex.search('<To serve man> for dinner.>')
>>> mo.group()
# '<To serve man> for dinner.>'
```
243
Matching newlines with the Dot character | *regex
The `.*` dot-star will match **everything except a newline**. By passing `re.DOTALL` as the *second* argument to `re.compile()`, you can make the dot character match all characters, **including the newline character**:

```
>>> no_newline_regex = re.compile('.*')
>>> no_newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
# 'Serve the public trust.'

>>> newline_regex = re.compile('.*', re.DOTALL)
>>> newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
# 'Serve the public trust.\nProtect the innocent.\nUphold the law.'
```
244
Case-Insensitive matching | *regex
To make your regex **case-insensitive**, you can pass `re.IGNORECASE` or `re.I` as a *second* argument to `re.compile()`:

```
>>> robocop = re.compile(r'robocop', re.I)
>>> robocop.search('Robocop is part man, part machine, all cop.').group()
# 'Robocop'

>>> robocop.search('ROBOCOP protects the innocent.').group()
# 'ROBOCOP'

>>> robocop.search('Al, why does your programming book talk about robocop so much?').group()
# 'robocop'
```
245
Substituting strings with the `sub()` method
The `sub()` method for Regex objects is passed two arguments:

1. The **first** argument is the string that replaces *any matches*.
2. The **second** is the string that the *regular expression* is applied to.

The `sub()` method returns a string with the substitutions applied:

```
>>> names_regex = re.compile(r'Agent \w+')
>>> names_regex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')
# 'CENSORED gave the secret documents to CENSORED.'
```
246
Managing complex Regexes | *regex
To tell the `re.compile()` function to **ignore whitespace and comments** inside the regular expression string, “verbose mode” can be enabled by passing the variable `re.VERBOSE` as the *second* argument to `re.compile()`.

Now instead of a hard-to-read regular expression like this:

```
phone_regex = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')
```

you can spread the regular expression over multiple lines with comments like this:

```
phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)
```
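As a quick sanity check, the compact and verbose forms should behave identically. The sketch below compiles both and compares their results; the sample strings and the `matched_text` helper are hypothetical, introduced only for this illustration.

```python
import re

compact = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')

verbose = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)


def matched_text(regex, text):
    """Return the full matched text, or None when there is no match."""
    mo = regex.search(text)
    return mo.group() if mo is not None else None


samples = ['Call 415-555-1234 ext 22', '(415) 555-9999', 'no number here']
compact_results = [matched_text(compact, s) for s in samples]
verbose_results = [matched_text(verbose, s) for s in samples]
```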
247
What are the two main modules in Python that deal with path manipulation?
`os.path` and `pathlib` The `pathlib` module was added in Python 3.4, offering an **object-oriented** way to handle file system paths.
248
Linux and Windows Paths
On Windows, paths are written using backslashes (`\`) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (`/`) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms. Fortunately, Python provides easy ways to handle this. We will showcase how to deal with both `os.path.join` and `pathlib.Path.joinpath`.
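To see both separator styles on a single machine, one option is `pathlib`'s pure path classes, which only manipulate path strings and never touch the file system. This is a sketch for illustration; the folder names are taken from the examples in this deck.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths can be created on any operating system.
posix_path = PurePosixPath('usr').joinpath('bin', 'spam')
windows_path = PureWindowsPath('C:\\Users\\asweigart') / 'accounts.txt'

posix_text = str(posix_path)       # joined with forward slashes
windows_text = str(windows_path)   # joined with backslashes
```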
249
Using `os.path.join` on Windows:
```
>>> import os
>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> for filename in my_files:
...     print(os.path.join('C:\\Users\\asweigart', filename))
...
# C:\Users\asweigart\accounts.txt
# C:\Users\asweigart\details.csv
# C:\Users\asweigart\invite.docx
```
250
using `pathlib` on *nix:
```
>>> from pathlib import Path
>>> print(Path('usr').joinpath('bin').joinpath('spam'))
# usr/bin/spam
```

`pathlib` also provides a shortcut to `joinpath` using the `/` operator:

```
>>> from pathlib import Path
>>> print(Path('usr') / 'bin' / 'spam')
# usr/bin/spam
```

Joining paths is helpful if you need to create different file paths under the same directory.

```
>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> home = Path.home()
>>> for filename in my_files:
...     print(home / filename)
...
# /home/asweigart/accounts.txt
# /home/asweigart/details.csv
# /home/asweigart/invite.docx
```
251
The current working directory, using `os` on Windows
``` >>> import os >>> os.getcwd() # 'C:\\Python34' >>> os.chdir('C:\\Windows\\System32') >>> os.getcwd() # 'C:\\Windows\\System32' ```
252
The current working directory, using `pathlib` on *nix
``` >>> from pathlib import Path >>> from os import chdir >>> print(Path.cwd()) # /home/asweigart >>> chdir('/usr/lib/python3.6') >>> print(Path.cwd()) # /usr/lib/python3.6 ```
253
Creating new folders, using `os` on Windows
``` >>> import os >>> os.makedirs('C:\\delicious\\walnut\\waffles') ```
254
Creating new folders, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir()
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/usr/lib/python3.6/pathlib.py", line 1226, in mkdir
#     self._accessor.mkdir(self, mode)
#   File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
#     return strfunc(str(pathobj), *args)
# FileNotFoundError: [Errno 2] No such file or directory: '/home/asweigart/delicious/walnut/waffles'
```

Oh no, we got a nasty error! The reason is that the ‘delicious’ directory does not exist, so we cannot make the ‘walnut’ and the ‘waffles’ directories under it. To fix this, do:

```
>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir(parents=True)
```

And all is good :)
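`mkdir()` also accepts `exist_ok=True`, which avoids a `FileExistsError` when the folder is already there. The sketch below works inside a temporary directory (an assumption made so the example leaves no files behind) and calls `mkdir()` twice on the same path.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'delicious' / 'walnut' / 'waffles'

    # parents=True creates the missing 'delicious' and 'walnut' folders.
    target.mkdir(parents=True, exist_ok=True)

    # A second call would normally raise FileExistsError; exist_ok=True prevents that.
    target.mkdir(parents=True, exist_ok=True)

    created = target.is_dir()
```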
255
**absolute path**
An **absolute path**, which *always* begins with the **root** folder
256
**relative path**
A **relative path**, which is *relative* to the program’s **current working directory**
257
dot (`.`) and dot-dot (`..`) folders.
These are not real folders, but special names that can be used in a path. A **single period (“dot”)** for a folder name is shorthand for **“this directory.”** **Two periods (“dot-dot”)** means “the **parent** folder.”
258
Handling Absolute paths, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> Path('/').is_absolute()
# True

>>> Path('..').is_absolute()
# False
```

To get an absolute path from a relative one, use `resolve()`:

```
>>> from pathlib import Path
>>> print(Path.cwd())
# /home/asweigart

>>> print(Path('..').resolve())
# /home
```
259
Handling Relative paths, using `pathlib` on *nix
``` >>> from pathlib import Path >>> print(Path('/etc/passwd').relative_to('/')) # etc/passwd ```
260
Checking if **a file/directory exists**, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> Path('.').exists()
# True

>>> Path('setup.py').exists()
# True

>>> Path('/etc').exists()
# True

>>> Path('nonexistentfile').exists()
# False
```
261
Checking if **a path is a file**, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> Path('setup.py').is_file()
# True

>>> Path('/home').is_file()
# False

>>> Path('nonexistentfile').is_file()
# False
```
262
Checking if a **path is a directory**, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> Path('/').is_dir()
# True

>>> Path('setup.py').is_dir()
# False

>>> Path('/spam').is_dir()
# False
```
263
Getting a file’s **size in bytes**, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> stat = Path('/bin/python3.6').stat()
>>> print(stat) # stat contains some other information about the file as well
# os.stat_result(st_mode=33261, st_ino=141087, st_dev=2051, st_nlink=2, st_uid=0,
# --snip--
# st_gid=0, st_size=10024, st_atime=1517725562, st_mtime=1515119809, st_ctime=1517261276)

>>> print(stat.st_size) # size in bytes
# 10024
```
264
**Listing directories**, using `pathlib` on *nix
```
>>> from pathlib import Path
>>> for f in Path('/usr/bin').iterdir():
...     print(f)
...
# ...
# /usr/bin/tiff2rgba
# /usr/bin/iconv
# /usr/bin/ldd
# /usr/bin/cache_restore
# /usr/bin/udiskie
# /usr/bin/unix2dos
# /usr/bin/t1reencode
# /usr/bin/epstopdf
# /usr/bin/idle3
# ...
```
265
**Directory file sizes**, using `pathlib` on *nix
WARNING: Directories themselves also have a size! So you might want to check whether a path is a file or a directory, using the `is_file()` and `is_dir()` methods discussed in the sections above.

```
>>> from pathlib import Path
>>> total_size = 0
>>> for sub_path in Path('/usr/bin').iterdir():
...     total_size += sub_path.stat().st_size
...
>>> print(total_size)
# 1903178911
```
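Following the warning, one way to count only regular files is to filter with `is_file()`. This is a sketch run in a temporary folder with made-up file names, so the expected total is known in advance.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'a.txt').write_bytes(b'12345')   # 5 bytes
    (folder / 'b.txt').write_bytes(b'123')     # 3 bytes
    (folder / 'subdir').mkdir()                # a directory, skipped below

    # Sum the sizes of regular files only, ignoring the directory entry.
    total_size = sum(
        p.stat().st_size for p in folder.iterdir() if p.is_file()
    )
```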
266
Copying files and folders with `shutil`
The `shutil` module provides functions for copying files, as well as entire folders.

```
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
# 'C:\\delicious\\spam.txt'

>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
# 'C:\\delicious\\eggs2.txt'
```

While `shutil.copy()` will copy a single file, `shutil.copytree()` will copy an entire folder and every folder and file contained in it:

```
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
# 'C:\\bacon_backup'
```
267
Moving and Renaming, with `shutil`
``` >>> import shutil >>> shutil.move('C:\\bacon.txt', 'C:\\eggs') # 'C:\\eggs\\bacon.txt' ``` The destination path can also specify a filename. In the following example, the source file is moved and renamed: ``` >>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt') # 'C:\\eggs\\new_bacon.txt' ``` If there is no eggs folder, then `move()` will rename `bacon.txt` to a file named eggs: ``` >>> shutil.move('C:\\bacon.txt', 'C:\\eggs') # 'C:\\eggs' ```
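The difference between moving into an existing folder and renaming can be tried safely in a temporary directory. The sketch below is illustrative; `ham.txt` and `spam` are made-up names added for the second case.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'eggs').mkdir()
    (root / 'bacon.txt').write_text('bacon')
    (root / 'ham.txt').write_text('ham')

    # Destination is an existing folder: the file is moved into it.
    moved = Path(shutil.move(str(root / 'bacon.txt'), str(root / 'eggs')))
    moved_into_folder = moved == root / 'eggs' / 'bacon.txt' and moved.is_file()

    # Destination does not exist: the file is renamed to that path.
    renamed = Path(shutil.move(str(root / 'ham.txt'), str(root / 'spam')))
    renamed_to_file = renamed == root / 'spam' and renamed.is_file()
```

In both cases `shutil.move()` returns the final destination path, which is what the two checks compare against.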