Numpy Flashcards

Taken from https://jakevdp.github.io/PythonDataScienceHandbook/02.02-the-basics-of-numpy-arrays.html

1
Q

What is the common alias for numpy when importing?

A

import numpy as np

2
Q

What is numpy short for?

A

Numerical Python

3
Q

What does numpy provide?

A

An efficient interface to store and operate on dense data buffers.

4
Q

How are numpy arrays better than Python lists - 1?

A

NumPy arrays are like Python’s built-in list type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size.

5
Q

What is the difference between a dynamic-type list and a fixed-type (numpy style) array?

A

At the implementation level, the array essentially contains a single pointer to one contiguous block of data. The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object
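
A minimal sketch of that difference, for illustration only (the variable names are made up here and are not from the source):

```python
import numpy as np

# A list stores references, so each element can be a different Python object.
py_list = [1, "two", 3.0]
elem_types = [type(v).__name__ for v in py_list]

# A NumPy array stores raw values of a single dtype in one buffer.
arr = np.arange(5, dtype=np.int64)
buffer_bytes = arr.nbytes                  # 5 elements * 8 bytes each
is_contiguous = arr.flags["C_CONTIGUOUS"]  # True for a freshly created array

print(elem_types, arr.dtype, buffer_bytes, is_contiguous)
```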

6
Q

Can numpy arrays contain elements of mixed type?

A

No. All elements must be of the same type.

7
Q

How do you create a fixed-type array in Python?

A

import array
L = list(range(10))
A = array.array('i', L)

returns: array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

where 'i' is a type code indicating integers

8
Q

How do you create a numpy array from a Python list?

A

np.array([1, 4, 2, 5, 3])

9
Q

What happens when creating numpy arrays which are not of the same type?

A

Numpy will attempt to upcast if possible, e.g. ints are upcast to floats.
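
A short sketch of upcasting (array names are illustrative):

```python
import numpy as np

ints_only = np.array([1, 4, 2, 5, 3])   # all ints: integer dtype
mixed = np.array([3.14, 4, 2, 3])       # one float: everything becomes float

print(ints_only.dtype)   # an integer dtype (exact width depends on platform)
print(mixed.dtype)       # float64
```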

10
Q

How do we explicitly set the data type of a numpy array?

A

np.array([1, 2, 3, 4], dtype='float32')

11
Q

Is it more efficient to create a numpy array from scratch or from existing list?

A

From scratch, using numpy's built-in creation routines (e.g. np.zeros, np.ones, np.arange), especially for larger arrays

12
Q

Create a numpy array of all zeros

A

np.zeros(10, dtype=int) # 10 indicates 10 values

13
Q

Create a numpy array of all ones

A

np.ones(10, dtype=int) # 10 indicates 10 values

14
Q

Create a numpy array filled with one specific value

A

np.full(10, 3.14) # 10 indicates 10 values, 3.14 is the fill value (adding dtype=int would truncate it to 3)

15
Q

Create multi-dimensional numpy array

A

np.zeros((3,5), dtype=float) # create a 3x5 array populated with zeros in float type

16
Q

What is the syntax for a numpy array filled with a linear sequence?

A

np.arange(0, 20, 2) # values from 0 up to, not including 20, step by 2

17
Q

What is the syntax for a numpy array of x values equally spaced between two values?

A

np.linspace(0, 1, 5)

where 0 and 1 are the bounding values (both included) and 5 is the number of values (x)
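
A quick side-by-side of np.arange and np.linspace, as a sketch (variable names are illustrative):

```python
import numpy as np

# arange: fixed step, stop value excluded
evens = np.arange(0, 20, 2)

# linspace: fixed number of points, both end points included
points = np.linspace(0, 1, 5)

print(evens)    # 0, 2, ..., 18
print(points)   # 0.0, 0.25, 0.5, 0.75, 1.0
```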

18
Q

How do you create a 3x3 numpy array of random ints from 0 up to, but not including, 10?

A

np.random.randint(0, 10, (3, 3))

19
Q

What are the main attributes of a numpy array x3?

A

x3.ndim = number of dimensions
x3.shape = shape of the array
x3.size = number of elements in array
x3.dtype = datatype
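
A sketch that prints these attributes for a concrete array (the seed and shape are arbitrary choices made for this example):

```python
import numpy as np

rng = np.random.RandomState(0)          # seeded so the array is reproducible
x3 = rng.randint(10, size=(3, 4, 5))    # three-dimensional array of ints

print("x3.ndim: ", x3.ndim)    # 3
print("x3.shape:", x3.shape)   # (3, 4, 5)
print("x3.size: ", x3.size)    # 60
print("x3.dtype:", x3.dtype)   # an integer dtype; the width is platform-dependent
```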

20
Q

How do you access the i-th element in a 1 dimensional numpy array x?

A

x[i] - same as for Python lists

21
Q

How do you access the elements in a multi-dimensional numpy array x?

A

x[2,0] # will return 1 for below array

array([[3, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])

22
Q

How do you modify the elements in a multi-dimensional numpy array x?

A

x[2,0] = 12 # will set the below value 1 to 12

array([[3, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])

NB. Numpy arrays are fixed type. If you attempt to assign a float to an element of the integer array above, the value will be silently truncated.
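
A small sketch of both behaviours, using the array from this card:

```python
import numpy as np

x = np.array([[3, 5, 2, 4],
              [7, 6, 8, 8],
              [1, 6, 7, 7]])

x[2, 0] = 12        # ordinary in-place modification
x[0, 0] = 3.14159   # integer array: the fractional part is dropped, no warning

print(x)
```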

23
Q

What is numpy array slicing for 1-D arrays?

A

Similar to List slicing, same syntax used.

x[:5] # first five elements

24
Q

How does numpy slicing work for multi-dimensional arrays?

A

x2[:2, :3] # two rows, three columns

array([[12, 5, 2, 4],
[ 7, 6, 8, 8],
[ 1, 6, 7, 7]])

returns for the above array
array([[12, 5, 2],
[ 7, 6, 8]])

25
Q

How can you reverse a multi-dimensional array?

A

x2[::-1, ::-1]
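
For illustration, the negative step applied to the x2 array used in the slicing cards:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [ 7, 6, 8, 8],
               [ 1, 6, 7, 7]])

reversed_rows = x2[::-1]         # row order reversed only
reversed_both = x2[::-1, ::-1]   # rows and columns both reversed

print(reversed_rows)
print(reversed_both)
```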

26
Q

How can you access single rows/columns in multi-dimensional arrays?

A

print(x2[:, 0]) # first column of x2

print(x2[0, :]) # first row of x2

In the case of row access, the empty slice can be omitted for a more compact syntax: print(x2[0])

27
Q

Does numpy array slicing return a view or a copy of the array data?

A

A view. This differs from Python slicing, where slices are copies.

This means that if a slice is modified, the original array is modified too. This is useful when working with large arrays, because parts of a dataset can be accessed and processed without copying the underlying data buffer.

28
Q

How can a copy of array data be made with array slicing?

A

x2_sub_copy = x2[:2, :2].copy()

This means if the sub array is modified, the original array will not be updated.
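
A sketch contrasting a view with a copy (the values 99 and 42 are arbitrary markers chosen for this example):

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [ 7, 6, 8, 8],
               [ 1, 6, 7, 7]])

x2_sub = x2[:2, :2]               # a view: shares memory with x2
x2_sub[0, 0] = 99                 # also changes x2[0, 0]

x2_sub_copy = x2[:2, :2].copy()   # an independent copy
x2_sub_copy[0, 0] = 42            # x2 is left untouched

print(x2[0, 0])   # 99
```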

29
Q

How can numpy arrays be reshaped - 1?

A

If you want to put the numbers 1 through 9 in a 3×3 grid, you can do the following:

grid = np.arange(1, 10).reshape((3, 3))
print(grid)

NB. For this to work, the size of the initial array must match the size of the reshaped array.
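
A sketch of the size rule; the flag name mismatch_raised is only for this example:

```python
import numpy as np

grid = np.arange(1, 10).reshape((3, 3))   # 9 elements fit a 3x3 grid

try:
    np.arange(1, 10).reshape((2, 5))      # 9 elements cannot fill 10 slots
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(grid)
print(mismatch_raised)   # True
```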

30
Q

How can numpy arrays be reshaped - 2?

A

```
# row vector via newaxis
x[np.newaxis, :]
```
```
# column vector via newaxis
x[:, np.newaxis]
```

31
Q

How can you concatenate numpy arrays?

A

Use the np.concatenate function:

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
np.concatenate([x, y]) # a list of arrays must be passed

Can also be used on multi-dimensional arrays
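
A sketch of the two-dimensional case (the grid values are illustrative):

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

along_rows = np.concatenate([grid, grid])           # default axis=0, shape (4, 3)
along_cols = np.concatenate([grid, grid], axis=1)   # shape (2, 6)

print(along_rows.shape, along_cols.shape)
```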

32
Q

How can you concatenate numpy arrays of different dimensions?

A

np.concatenate can be used where the shapes are compatible, but for arrays of mixed dimensions it is clearer to use np.vstack (vertical stack) and np.hstack (horizontal stack).
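
A minimal sketch of both functions (the array values are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])
y = np.array([[99],
              [99]])

stacked_v = np.vstack([x, grid])   # 1-D row placed on top of a 2-D grid
stacked_h = np.hstack([grid, y])   # extra column appended on the right

print(stacked_v)
print(stacked_h)
```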

33
Q

How can a numpy array be split?

A

Use np.split, passing a list of indices that give the split points:

x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)
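
As a sketch: N split points give N + 1 subarrays. The related functions np.vsplit and np.hsplit are not part of this card; they are included here as an assumption about what is useful for the 2-D case.

```python
import numpy as np

x = np.array([1, 2, 3, 99, 99, 3, 2, 1])
x1, x2, x3 = np.split(x, [3, 5])       # two split points, three pieces

grid = np.arange(16).reshape((4, 4))
upper, lower = np.vsplit(grid, [2])    # split between rows 1 and 2
left, right = np.hsplit(grid, [2])     # split between columns 1 and 2

print(x1, x2, x3)
print(upper.shape, left.shape)
```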

34
Q

Why is numpy important in the Data Science world?

A

It provides an easy and flexible interface to optimized computation with arrays of data

35
Q

What is the key to making numpy computation fast?

A

Use vectorized operations - generally implemented through universal functions (ufuncs)

36
Q

What are numpy universal functions (ufuncs)?

A

They make repeated calculations on array elements more efficient through vectorized operations. You express an operation on the whole array and it is applied to each element; this pushes the loop into the compiled layer that underlies numpy.
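
A sketch comparing an explicit Python loop with the equivalent vectorized expression. The helper compute_reciprocals is written for this example only:

```python
import numpy as np

def compute_reciprocals(values):
    """Element-by-element loop executed in the Python interpreter."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.array([1, 2, 4, 5, 8])

looped = compute_reciprocals(values)   # slow path for large arrays
vectorized = 1.0 / values              # same result, loop runs in compiled code

print(looped)
print(vectorized)
```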

37
Q

Give some examples of numpy ufuncs

A

np.add or +
np.subtract or -
np.multiply or *
np.power or **

e.g.
x = np.arange(4)
x+5

x = [0 1 2 3]
x + 5 = [5 6 7 8]

38
Q

How can you sum all elements of a numpy array?

A

np.sum(x)

Prefer np.sum over Python’s built-in sum function, since the numpy version is much quicker on arrays.

39
Q

What are multi-dimensional aggregates in numpy?

A

One common type of aggregation operation is an aggregate along a row or column. For a multi-dimensional array such as

M = np.random.random((3, 4))

Running M.sum() returns the sum of all elements.
Running M.min(axis=0) returns the minimum value of each column; with axis=1 it returns the minimum of each row.

40
Q

In multi dimensional aggregate functions, how does the axis keyword work?

A

The axis keyword specifies the dimension of the array that will be collapsed, rather than the dimension that will be returned. So specifying axis=0 means that the first axis will be collapsed: for two-dimensional arrays, this means that values within each column will be aggregated.
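
A sketch with a small deterministic array, so the collapsed axis is easy to follow (the array is chosen for this example):

```python
import numpy as np

M = np.arange(12).reshape((3, 4))   # rows: [0..3], [4..7], [8..11]

total = M.sum()              # collapses both axes: one number
col_sums = M.sum(axis=0)     # collapses the rows: one value per column
row_sums = M.sum(axis=1)     # collapses the columns: one value per row

print(total)      # 66
print(col_sums)   # [12 15 18 21]
print(row_sums)   # [ 6 22 38]
```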

41
Q

What is broadcasting in numpy?

A

Broadcasting is simply a set of rules for applying binary ufuncs (e.g., addition, subtraction, multiplication, etc.) on arrays of different sizes.

42
Q

Give an example of Broadcasting that adds a scalar to an array

A

a = np.array([0, 1, 2])
a + 5

We can think of this as an operation that stretches or duplicates the value 5 into the array [5, 5, 5], and adds the results. The advantage of NumPy’s broadcasting is that this duplication of values does not actually take place, but it is a useful mental model as we think about broadcasting

43
Q

Give an example of Broadcasting that adds a 1-D array to a 2-D array

A

a = np.array([0, 1, 2])
M = np.ones((3, 3))
M + a

Here the one-dimensional array a is stretched, or broadcast across the second dimension in order to match the shape of M.
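
A sketch of the mental model. np.broadcast_to is used here only to make the "stretched" array visible; the addition itself does not need it:

```python
import numpy as np

a = np.array([0, 1, 2])
M = np.ones((3, 3))

result = M + a                            # a is broadcast to shape (3, 3)
stretched = np.broadcast_to(a, M.shape)   # read-only view showing the stretched a

print(result)
print(stretched)
```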

44
Q

Give an example of Broadcasting in which both arrays are stretched

A

a = np.arange(3)
b = np.arange(3)[:, np.newaxis]
a + b

a = [0 1 2]
b = [[0]
[1]
[2]]

becomes …

array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])

45
Q

Give an example of comparison operators used with ufuncs

A

x = np.array([1, 2, 3, 4, 5])

x < 3 # this is the example

returns … array([ True, True, False, False, False], dtype=bool)

Could also use:

x <= 3
x >= 3
x == 3
(2 * x) == (x ** 2) # compound expression

46
Q

How can you count entries in a Boolean numpy array?

A

Given x =
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]

```
# how many values less than 6?
np.count_nonzero(x < 6)
```

47
Q

How can you count entries in a Boolean numpy array - different to np.count_nonzero?

A

Given x =
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]

np.sum(x < 6)
answer = 8 # False is interpreted as 0, and True is interpreted as 1

The benefit of using np.sum() is that the summation can also be done along rows or columns:

```
# how many values less than 6 in each row?
np.sum(x < 6, axis=1)
```

48
Q

How can you check that any or all Boolean values in an array are true?

A

np.any(x > 8) # returns True or False
np.all(x > 8) # returns True or False

Can also use with axis keyword
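
A sketch using the array x from the counting cards, including the axis keyword:

```python
import numpy as np

x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

any_above_8 = np.any(x > 8)               # True: the 9 qualifies
all_above_8 = np.all(x > 8)               # False
any_per_column = np.any(x > 8, axis=0)    # one answer per column
all_per_row = np.all(x < 8, axis=1)       # one answer per row

print(any_above_8, all_above_8)
print(any_per_column)
print(all_per_row)
```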

49
Q

How can you use Boolean arrays as masks in numpy?

A

with x =
array([[5, 0, 3, 3],
[7, 9, 3, 5],
[2, 4, 7, 6]])

To select values where x < 5, do:

x[x < 5] # returns

array([0, 3, 3, 3, 2, 4])

Fuller example …

```
# construct a mask of all rainy days
rainy = (inches > 0)

# construct a mask of all summer days (June 21st is the 172nd day)
days = np.arange(365)
summer = (days > 172) & (days < 262)
```

print("Median precip on rainy days in 2014 (inches): ",
      np.median(inches[rainy]))
print("Median precip on summer days in 2014 (inches): ",
      np.median(inches[summer]))
print("Maximum precip on summer days in 2014 (inches): ",
      np.max(inches[summer]))
print("Median precip on non-summer rainy days (inches):",
      np.median(inches[rainy & ~summer]))
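
The rainfall example depends on an inches array that is not defined on this card. A self-contained sketch with made-up data (the six values and the late mask are assumptions for illustration, not the 2014 dataset):

```python
import numpy as np

# Hypothetical daily rainfall in inches, six days only
inches = np.array([0.0, 0.2, 0.0, 1.1, 0.5, 0.0])
days = np.arange(len(inches))

rainy = inches > 0    # Boolean mask of rainy days
late = days >= 3      # Boolean mask of the last three days

median_rainy = np.median(inches[rainy])   # median of [0.2, 1.1, 0.5]
max_late = np.max(inches[late])           # max of [1.1, 0.5, 0.0]
rainy_not_late = inches[rainy & ~late]    # masks combined with & and ~

print(median_rainy, max_late, rainy_not_late)
```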

50
Q

What is fancy indexing?

A

It means passing an array of indices to access multiple array elements at once

51
Q

Give an example of fancy indexing

A

import numpy as np
rand = np.random.RandomState(42)

x = rand.randint(100, size=10)
ind = [3, 7, 4]
x[ind]

returns … array([71, 86, 60])

52
Q

What is the shape of the result of fancy indexing?

A

The shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed:

e.g. if a 2-D array of indices is used to access a 1-D array, the result is a 2-D array with the shape of the index array.
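
A sketch of that case, with the values of x written out explicitly rather than generated at random:

```python
import numpy as np

x = np.array([51, 92, 14, 71, 60, 20, 82, 86, 74, 74])

ind = np.array([[3, 7],
                [4, 5]])      # 2-D array of indices into the 1-D array x

picked = x[ind]               # result has the shape of ind, not of x

print(picked)
print(picked.shape)           # (2, 2)
```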

53
Q

Does fancy indexing work in multiple dimensions?

A

Yes.

X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
X[row, col]

returns … array([ 2, 5, 11])

Notice that the first value in the result is X[0, 2], the second is X[1, 1], and the third is X[2, 3]
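
A sketch confirming that the row and column indices are paired element by element (the explicit loop is only there for comparison):

```python
import numpy as np

X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

fancy = X[row, col]                               # pairs (0, 2), (1, 1), (2, 3)
paired = [X[r, c] for r, c in zip(row, col)]      # same lookups done one at a time

print(fancy)
print(paired)
```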