2 Introduction to Python (II) Flashcards

Matplotlib, dictionaries, dataframes... https://colab.research.google.com/drive/1fKMFrRbIJQE8Tpa06us0qQPnamBn957z?usp=sharing

1
Q

1 What is Matplotlib?

A

A plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits.

2
Q

2 Complete code:

import matplot… as …

A

import matplotlib.pyplot as plt

3
Q

3 Make a line plot (year x-axis, pop y-axis)

year=['1975','1976','1977']
pop=[2340,2405,2890]

A

import matplotlib.pyplot as plt

plt.plot(year,pop)
plt.show()

4
Q

4 How to display a matplotlib plot?

A

plt.show()

5
Q

5 Print the last item of the list year:

year=['1975','1976','1977']

A

print(year[-1])

print(year[2])

6
Q

6 What is a scatter plot?

A

A type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data

7
Q

7 Complete code (scatter plot):

x = [1,3,5]
y = [2,6,7]

import mat….

plt.show()

A

import matplotlib.pyplot as plt

plt.scatter(x,y)
plt.show()

8
Q

8 Change the line plot below to a scatter plot

year=['1975','1976','1977']
pop=[2340,2405,2890]

import matplotlib.pyplot as plt

plt.plot(year,pop)
plt.show()

A

plt.scatter(year,pop)
plt.show()

9
Q

9 Put the x-axis on a logarithmic scale

day=[1,2,3]
virus=[18,55,320]

import matplotlib.pyplot as plt

plt.scatter(day,virus)
plt.show()

A

plt.scatter(day,virus)
plt.xscale('log')
plt.show()

10
Q

10 What is a correlation coefficient?

A

A value that indicates the strength and direction of the linear relationship between two variables. The coefficient can take any value from -1 to 1.
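
As a rough illustration of the range from -1 to 1, the sketch below computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the sample data are made up for this example and are not part of the deck.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel out)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4]   # illustrative data, not from the deck
up = [2, 4, 6, 8]      # rises exactly in step with hours
down = [8, 6, 4, 2]    # falls exactly in step with hours

r_up = pearson_r(hours, up)
r_down = pearson_r(hours, down)
print(r_up, r_down)
```

A perfectly increasing linear relationship sits at the top of the range and a perfectly decreasing one at the bottom; real data usually lands somewhere in between.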

11
Q

11 What is a histogram?

A

An approximate representation of the distribution of numerical data, built by counting how many values fall into each bin

12
Q

12 Create histogram

years = [1975,1976,1978,1975]

A

import matplotlib.pyplot as plt

plt.hist(years)
plt.show()

13
Q

13 Create histogram with 5 bins using data (list)

import random
data = [random.randint(1, 5) for _ in range(100)]

A

plt.hist(data,bins=5)
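
To see what `bins=5` means in numbers, the sketch below uses `np.histogram`, which returns the counts and bin edges without drawing anything. The small sample is made up for illustration; the assumption is that equal-width bins between the minimum and maximum are wanted, which is the default.

```python
import numpy as np

data = [1, 2, 2, 3, 5]  # small illustrative sample, not from the deck

# Split the range [min, max] into 5 equal-width bins and count the values in each
counts, edges = np.histogram(data, bins=5)

print(counts)  # number of values per bin
print(edges)   # 6 edges delimit 5 bins
```

`plt.hist(data, bins=5)` also returns the counts and edges (plus the drawn bars), so the same numbers can be inspected after plotting.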

14
Q

14 What is the use of plt.clf() ?

A

Clears the current figure so you can start a fresh plot

15
Q

15 You want to visually assess if the grades on your exam follow a particular distribution. Which plot do you use?

A

Histogram

16
Q

16 You want to visually assess if longer answers on exam questions lead to higher grades. Which plot do you use?

A

Scatter plot

17
Q

17 Add labels

year =list(range(1975,2000))
scores = list(range(1,26))

plt.scatter(year,scores)

A

plt.xlabel('year')
plt.ylabel('scores')
plt.show()

18
Q

18 Add ‘scores’ as a title

data = [int(random.randint(1, 5)) for _ in range(100)]
plt.hist(data,bins=5)

plt.plot()

A

plt.title('scores')

19
Q

19 Add log scale

year =list(range(1975,2000))
scores= [2**n for n in range(25)]

plt.scatter(year,scores)

A

plt.yscale('log')
plt.show()

20
Q

20 What are ticks in matplotlib?

A

Ticks are the markers that denote specific points on an axis. Their labels can be numbers or strings.

21
Q

21 What is a legend in matplotlib?

A

The legend is a box on the figure that maps each plotted element (line, marker, color) to a label describing the data it represents
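
A minimal sketch of a legend, assuming two line plots that each carry a `label=` argument. The second series (`pop_b`) and both label names are hypothetical, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

year = [1975, 1976, 1977]
pop_a = [2340, 2405, 2890]   # values reused from the earlier cards
pop_b = [1200, 1350, 1500]   # hypothetical second series, for illustration only

plt.plot(year, pop_a, label="region A")
plt.plot(year, pop_b, label="region B")
plt.xlabel("year")
plt.ylabel("pop")
legend = plt.legend()        # builds the legend from the label= arguments

# Collect the legend entries to confirm what will be displayed
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
plt.clf()                    # clear the figure again
```

In an interactive session, `plt.show()` would replace the final inspection step.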

22
Q

22 Change the ticks in the x-axis to strings

x=[1, 3, 5]
y=[1, 5, 9]

import matplotlib.pyplot as plt
plt.scatter(x,y)

A

plt.xticks(x, ["one","three","five"])
plt.show()

23
Q

23 Write a scatter plot with gdp as independent variable and population size as the size argument

gdp=[100, 200, 300]
life_exp=[50, 70, 82]
pop_size=[30,20,40]

A

import matplotlib.pyplot as plt

plt.scatter(gdp, life_exp, s=pop_size)
plt.show()

24
Q

24 What is a dependent variable?

A

A variable (often denoted by y ) whose value depends on that of another.

25
Q

25 What is an independent variable?

A

A variable (often denoted by x ) whose variation does not depend on that of another.

26
Q

26 Code: Scatter plot with text ‘A’ pointing at the second element

gdp=[100, 200, 300]
life_exp=[50, 70, 82]

A

import matplotlib.pyplot as plt

plt.scatter(gdp, life_exp)
plt.text(195,65,'A')
plt.show()

27
Q

27 Add a grid to a Matplotlib figure

A

plt.grid(True)

28
Q

28 Get the position of germany

countries = ['spain', 'france', 'germany', 'norway']

A

countries.index('germany')

29
Q

29 What is the difference between list and dictionary in Python?

A

A list is an ordered sequence of objects accessed by position, whereas a dictionary maps keys to values (since Python 3.7 it also preserves insertion order). The main difference is that items in a dictionary are accessed via keys and not via their position.
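
The difference in access can be sketched with the capitals used elsewhere in this deck. The variable names below are chosen for this example only.

```python
capitals_list = ["madrid", "paris", "berlin"]
capitals_dict = {"spain": "madrid", "france": "paris", "germany": "berlin"}

# A list is indexed by integer position
first_by_position = capitals_list[0]

# A dictionary is indexed by key
paris_by_key = capitals_dict["france"]

# Iterating over a dictionary yields its keys in insertion order (Python 3.7+)
key_order = list(capitals_dict)

print(first_by_position, paris_by_key, key_order)
```

Using an integer position on the dictionary, such as `capitals_dict[0]`, raises a `KeyError` here because `0` is not one of its keys.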

30
Q

30 Get the keys

europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }

Outcome:
dict_keys(['spain', 'france', 'germany', 'norway'])

A

print(europe.keys())

31
Q

31 Get the capital of norway

europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }

Outcome: oslo

A

print(europe['norway'])

32
Q

32 Add italy and rome to the dictionary

europe = {'spain':'madrid', 'france':'paris',
          'germany':'berlin' }

A

europe['italy']='rome'

33
Q

33 Check whether the dictionary has spain

europe = {'spain':'madrid', 'france':'paris',
          'germany':'berlin' }

A

print('spain' in europe)

34
Q

34 Outcome of:

europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }

print('madrid' in europe)

A

False
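
The result is False because `in` checks a dictionary's keys, not its values. A short sketch of how to test values instead, and how to look up a key that may be missing; the fallback string `"unknown"` is an arbitrary choice for this example.

```python
europe = {"spain": "madrid", "france": "paris", "germany": "berlin", "norway": "oslo"}

key_hit = "madrid" in europe             # membership test against the keys
value_hit = "madrid" in europe.values()  # membership test against the values

# .get() returns a fallback instead of raising KeyError for a missing key
capital = europe.get("italy", "unknown")

print(key_hit, value_hit, capital)
```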

35
Q

35 Delete spain

europe = {'spain':'madrid', 'france':'paris',
          'norway':'oslo'}

A

del(europe['spain'])

36
Q

36 Update the capital of spain with madrid

europe = {'spain':'Barcelona', 'france':'paris',
          'norway':'oslo'}

A

europe['spain']='madrid'

37
Q

37 Get the capital of france

europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
           'france': { 'capital':'paris', 'population':66.03 }}

A

print(europe['france']['capital'])

38
Q

38 Complete Code

dr =[False, False, True]
names = ['Spain','France','UK']
...
...
#Outcome:
  country  drives_right
0   Spain         False
1  France         False
2      UK          True

A

import pandas as pd

my_dict={'country':names, 'drives_right':dr}

print(pd.DataFrame(my_dict))

39
Q

39 Use names as the index of the dataframe

ages = [i for i in range(3)]
df_ages = pd.DataFrame(ages, columns = ['Ages'])
names = ['Jon','Jorge','Ana']

#Outcome:
       Ages
Jon       0
Jorge     1
Ana       2

A

df_ages.index = names

print(df_ages)

40
Q

40 Transform the csv to a dataframe called cars

cars.csv

A

import pandas as pd

cars = pd.read_csv('cars.csv')

41
Q

41 Set the first column as row labels

import pandas as pd
cars = pd.read_csv('cars.csv',..(code)..)

A

cars = pd.read_csv('cars.csv', index_col = 0)

42
Q

42 What is a pandas Series?

A

A one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). A pandas Series is comparable to a single column in an Excel sheet.

43
Q

43 Print the column country of df as a pandas Series

countries = ['Spain','France','UK']
df = pd.DataFrame(countries, columns = ['country'])

#Outcome:
0     Spain
1    France
2        UK
Name: country, dtype: object

A

print(df['country'])

44
Q

44 Print the column country of df as dataframe

countries = ['Spain','France','UK']
df = pd.DataFrame(countries, columns = ['country'])

#Outcome:
  country
0   Spain
1  France
2      UK

A

print(df[['country']])
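
Single versus double square brackets decide whether a Series or a DataFrame comes back. A short sketch using the same data; the names `as_series` and `as_frame` are chosen for this example.

```python
import pandas as pd

countries = ["Spain", "France", "UK"]
df = pd.DataFrame(countries, columns=["country"])

as_series = df["country"]    # single brackets: one column as a Series
as_frame = df[["country"]]   # double brackets: a list of columns, so a DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The Series is one-dimensional, while the DataFrame keeps two dimensions even with a single column.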

45
Q

45 Print out columns a, b from df

A

print(df[['a','b']])

46
Q

46 Print out first 2 observations (2 methods)

import pandas as pd
n = [i for i in range(3)]
df = pd.DataFrame(n, columns = ['number'])

A

print(df[:2])
print(df.head(2))

Outcome:
   number
0       0
1       1

47
Q

47 Print out the fourth, fifth and sixth observation

import pandas as pd
n = [i for i in range(0,20,2)]
df = pd.DataFrame(n, columns = ['number'])

A

print(df.iloc[3:6])

48
Q

48 What is loc in pandas?

A

A label-based indexer: it selects rows (and optionally columns) by their index labels and returns a Series or DataFrame. It raises a KeyError if a requested label does not exist in the dataframe.
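
A small sketch contrasting `loc` (labels) with `iloc` (integer positions), using the nick/jon frame from the surrounding cards. The label `rank_3` is deliberately absent to show the failure case.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["nick", "jon"], "age": [15, 18]},
    index=["rank_1", "rank_2"],
)

by_label = df.loc["rank_2", "age"]   # row label, column label
by_position = df.iloc[1, 1]          # row position, column position

# Asking loc for a label that does not exist raises KeyError
try:
    df.loc["rank_3"]
    missing_label_raises = False
except KeyError:
    missing_label_raises = True

print(by_label, by_position, missing_label_raises)
```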

49
Q

49 What is a DataFrame in pandas?

A

A two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns)

50
Q

50 Use iloc to get jon's row as a Series

name age
0 nick 15
1 jon 18

A

df.iloc[1]

Outcome:
name    jon
age      18
Name: 1, dtype: object

51
Q

51 Use iloc to get nick value

name age
0 nick 15
1 jon 18

#Outcome:
nick
A

print(df.iloc[0,0])

52
Q

52 Use loc to get nick’s row as dataframe

name age
rank_1 nick 15
rank_2 jon 18

A

print(df.loc[['rank_1']])

Outcome:
        name  age
rank_1  nick   15

53
Q

53 Output of:

data = {'name': ['nick','jon'],
        'age': [15,18]}
index_rows = ['rank_1','rank_2']
df = pd.DataFrame(data)
df.index = index_rows

df.loc['rank_2']

A

name jon
age 18
Name: rank_2, dtype: object

54
Q

54 Use loc to get jon’s age:

name age
rank_1 nick 15
rank_2 jon 18

A

df.loc['rank_2','age']

55
Q

55 Use iloc to get age column as a dataframe

name age
rank_1 nick 15
rank_2 jon 18

A

df.iloc[:,[1]]

56
Q

56 Outcome of:

print(True == False)

A

False

57
Q

57 Outcome of:

print(-1 != 75)

A

True

58
Q

58 Outcome of:

print(True == 1)

A

True

59
Q

59 Outcome of:

print(True == 0)

A

False

60
Q

60 Outcome of:

x = -3 * 6
print(x>=-10)

A

False

61
Q

61 Complete code:

import numpy as np
my_house = np.array([18.0, 20.0, 10.75])

#Outcome: 
[ True True False]
A

There are many possible answers, for example:

print(my_house>11)
62
Q

62 List out and name comparison operators

A
Equal: 2 == 2 True
Not equal: 2 != 2 False
Greater than: 2 > 3 False
Less than: 2 < 3 True
Greater than or equal to: 2 >= 3 False
Less than or equal to: 2 <= 3 True
63
Q

63 Outcome of:

a,b =[2,3]
a > b and a < b

A

False

64
Q

64 Outcome of:

a,b =[2,3]
a > b or a < b

A

True

65
Q

65 Outcome of:

a,b =[2,3]
not(a < 3)

A

False

66
Q

66 List out the three Numpy Boolean operators

A

np.logical_and()
np.logical_or()
np.logical_not()

67
Q

67 Use a NumPy Boolean operator to check which values are greater than 18 and less than 21

my_house = np.array([18.0, 20.0, 10.75])

A

print(np.logical_and(my_house>18, my_house<21))
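
The other two operators work the same way, element by element. A sketch on the same array; the thresholds 11 and 19 are arbitrary values picked for illustration.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75])

# True where a value is below 11 OR above 19
small_or_large = np.logical_or(my_house < 11, my_house > 19)

# True where a value is NOT below 11
not_small = np.logical_not(my_house < 11)

# A Boolean array can be used directly to select the matching elements
selected = my_house[small_or_large]

print(small_or_large)
print(not_small)
print(selected)
```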

68
Q

68 What is control flow in Python?

A

The order in which a program's code executes. The control flow of a Python program is regulated by conditional statements, loops, and function calls.

69
Q

69 Outcome of:

for i in range(4):
    if(i <2) :
        print("small")
    elif(i ==2 ) :
        print("medium")
    else :
        print("large")
A

small
small
medium
large

70
Q

70 Complete code:

house=[2,4,6]
...house:
    ...(i <4) :
        print("small")
    ...(i ==4 ) :
        print("medium")
    else :
        print("large")

small
medium
large

A
house=[2,4,6]
for i in house:
    if(i <4) :
        print("small")
    elif(i ==4 ) :
        print("medium")
    else :
        print("large")
71
Q

#71 Filtering in pandas
#Complete code

   name  age
0  nick   15
1   jon   18

filter_= …
selection= df[filter_]
print(selection)

Outcome:
   name  age
0  nick   15
A

filter_ = df['name'] == 'nick'
selection = df[filter_]
print(selection)

72
Q
#Filtering in pandas
#Complete code

Name Country
rank1 Tom Spain
rank2 Jack USA

…[df……]

#Outcome:
      Name Country
rank1  Tom   Spain

A

df[df['Country']=='Spain']

73
Q

72 Complete code using np boolean and

data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])

age = …
between = np…(…>10,..<15)
df[]

Name Age
2 juli 14

A

age = df['Age']
between = np.logical_and(age>10,age<15)
df[between]
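
The same filter can also be written with pandas' `&` operator instead of `np.logical_and`. This sketch is an alternative spelling rather than the deck's answer; note that each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`.

```python
import pandas as pd

data = [["tom", 10], ["nick", 15], ["juli", 14]]
df = pd.DataFrame(data, columns=["Name", "Age"])

# Element-wise AND of two Boolean Series
between = (df["Age"] > 10) & (df["Age"] < 15)
result = df[between]

print(result)
```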

74
Q

73 Complete code

x = 1
…x < 4 :
    print(x)
    x = x…

1
2
3

A

x = 1
while x < 4 :
    print(x)
    x = x + 1

75
Q

74 Outcome of:

offset=4

while offset !=0:
    offset=offset-1
    print('correcting...')
    print(offset)

A
correcting...
3
correcting...
2
correcting...
1
correcting...
0
76
Q

75 Loop over areas and print each element

areas = [11.25, 18.0, 20.0, 10.75, 9.50]

A

for area in areas :
    print(area)

77
Q

76 Loop and enumerate

areas = [11.25, 18.0, 20.0]

1-11.25
2-18.0
3-20.0

A

for index, area in enumerate(areas,1) :
    print( str(index)+ "-" + str(area))
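
The same output can be built with an f-string instead of `str()` and `+`. A sketch that collects the lines in a list first; the name `lines` is chosen for this example.

```python
areas = [11.25, 18.0, 20.0]

# enumerate(..., start=1) yields (1, 11.25), (2, 18.0), (3, 20.0)
lines = [f"{index}-{area}" for index, area in enumerate(areas, start=1)]

for line in lines:
    print(line)
```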

78
Q

77 Loop over a list of lists

house = [["hallway", 11.25],
         ["kitchen", 18.0],
         ["living room", 20.0]]

hallway-11.25
kitchen-18.0
living room-20.0

A

for x in house :
    print( str(x[0]) + "-" + str(x[1]) )

79
Q

#78 Loop over dictionary
#Complete code

world = { "afghanistan":30.55,
          "albania":2.77,
          "algeria":39.21 }

for …in world …() :
    …(key + " – " + str(value))

Outcome:
afghanistan – 30.55
albania – 2.77
algeria – 39.21

A

for key, value in world.items() :
    print(key + " – " + str(value))

80
Q

79 Outcome of:

import numpy as np
x = [i for i in range(1,8,2)]
np_x=np.array(x)
for i in np_x:
    print(i**2)
A

1
9
25
49

81
Q
#80 Loop over DataFrame
(two ways)

name age
rank_1 nick 15
rank_2 jon 18

#Output: 
rank_1
15
rank_2
18
A

for ind, row in df.iterrows():
    print(ind)
    print(row['age'])
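
The card asks for two ways. One possible second way, sketched here under the assumption that `df` is the frame shown in the question, loops over the index labels and looks each value up with `loc`. The list `seen` exists only to record what was printed.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["nick", "jon"], "age": [15, 18]},
    index=["rank_1", "rank_2"],
)

seen = []
for label in df.index:
    age = df.loc[label, "age"]   # look the value up by row label and column label
    print(label)
    print(age)
    seen.append((label, int(age)))
```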

82
Q

81 Build this dataframe:

Name Country
rank1 Tom Spain
rank2 Jack USA

A

import pandas as pd

data = {'Name':['Tom', 'Jack'],'Country':['Spain','USA']}
df = pd.DataFrame(data, index =['rank1', 'rank2'])
83
Q

82 Loop over the dataframe and create a column with the length of the names

Name Country
0 Tom Spain
1 Jack USA

A

for lab, row in df.iterrows() :
    df.loc[lab, "name_length"] = len(row["Name"])

Outcome:
   Name Country  name_length
0   Tom   Spain          3.0
1  Jack     USA          4.0
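
For comparison, the same column can be created without an explicit loop. This sketch uses the `.str.len()` accessor, which is an alternative to the card's `iterrows()` answer, not a replacement for it.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Jack"], "Country": ["Spain", "USA"]})

# .str.len() computes the length of every string in the column at once
df["name_length"] = df["Name"].str.len()

print(df)
```

Because the whole column is assigned in one step, the lengths come out as whole numbers (3 and 4) rather than the 3.0 and 4.0 produced by the row-by-row assignment.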

84
Q

83 How does the .seed() method work?

A

Seeding a pseudo-random number generator gives it its first “previous” value. Each seed value will correspond to a sequence of generated values for a given random number generator.
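
The effect of seeding can be sketched with Python's built-in `random` module, which follows the same idea as `np.random.seed()`. The seed value 123 is arbitrary. Unlike `np.random.randint(1, 7)`, the standard-library `random.randint(1, 6)` includes its upper bound.

```python
import random

random.seed(123)                       # fix the generator's starting state
first = [random.randint(1, 6) for _ in range(5)]

random.seed(123)                       # reset to the same starting state
second = [random.randint(1, 6) for _ in range(5)]

print(first)
print(second)
print(first == second)                 # same seed, same sequence
```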

85
Q

84 Generate the same random number twice

A

import numpy as np
np.random.seed(123) #any number
print(np.random.rand())

np.random.seed(123)
print(np.random.rand())

86
Q

85 Use randint() to simulate the throw of a dice

A

print(np.random.randint(1,7))

87
Q

86 Use control flow and random numbers to simulate a simple walk with a dice:

Instructions:

np.random.seed(124)

1 or 2 is a step back
3 or 4 no step
5 or 6 step forward

dice: 5
step: 1

A
import numpy as np
np.random.seed(124)
step = 0
dice=np.random.randint(1,7)
if dice <= 2 :
 step = step - 1
elif dice>4 :
 step=step+1
else:
 step = step

print('dice:',dice)
print('step:',step)

88
Q

#87 Simulate a random walk with a dice:
#How many meters did the 'person' advance:

Instructions:
np.random.seed(124)

1 or 2 is a step back
3 or 4 no step
5 or 6 step forward

Outcome:
steps_walked: 10
meters_forward: 3

A

import numpy as np
np.random.seed(124)

random_walk = [0]  # position 0 is the starting point
for i in range(10):
    dice = np.random.randint(1,7)
    if dice <= 2 :
        step = -1  # step back
    elif dice > 4 :
        step = 1   # step forward
    else:
        step = 0   # no step
    random_walk.append(random_walk[-1] + step)
meters_forward = random_walk[-1]
steps_walked = len(random_walk) - 1  # the first entry is the starting position

print('steps_walked:', steps_walked)
print('meters_forward:', meters_forward)
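
The same rules can be wrapped in a reusable function. The sketch below uses the standard-library `random.Random` rather than NumPy, so a given seed will not reproduce the NumPy sequence from the card; the function name `simulate_walk` and its parameters are hypothetical.

```python
import random

def simulate_walk(n_steps, seed=None):
    """Return the list of positions visited, starting at 0."""
    rng = random.Random(seed)          # private generator, seeded for repeatability
    positions = [0]
    for _ in range(n_steps):
        dice = rng.randint(1, 6)       # both ends inclusive
        if dice <= 2:
            step = -1                  # 1 or 2: step back
        elif dice >= 5:
            step = 1                   # 5 or 6: step forward
        else:
            step = 0                   # 3 or 4: no step
        positions.append(positions[-1] + step)
    return positions

walk = simulate_walk(10, seed=124)
print("steps_walked:", len(walk) - 1)
print("meters_forward:", walk[-1])
```

Calling the function again with the same seed returns the same walk, which makes the simulation easy to check.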

89
Q

88 Get the maximum value of this list comprehension

[i for i in range(10)]

A

max_value=max([i for i in range(10)])

90
Q

89 What are list comprehensions used for?

A

They are used for creating new lists from other iterables.
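
A few small comprehensions illustrate the pattern: an expression, a loop, and an optional condition. The variable names are chosen for this example.

```python
# Transform every item of an iterable
squares = [n ** 2 for n in range(5)]

# Keep only the items that satisfy a condition
even_squares = [n ** 2 for n in range(10) if n % 2 == 0]

# Build a list from another list
names = ["Tom", "Jack"]
lengths = [len(name) for name in names]

print(squares, even_squares, lengths)
```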

91
Q

#90 Get the number of meters advanced in this random_walk.
#Get the number of steps taken
#Use a matplotlib line plot to display the walk

random_walk = [0,1,1,0,-1,0,1,2,3]  # 0 = starting position

Outcome:
steps_walked: 8
meters_forward: 3

A

import matplotlib.pyplot as plt

random_walk =[0,1,1,0,-1,0,1,2,3]
steps_walked = len(random_walk) -1
meters_forward = random_walk [-1]
print('steps_walked:',steps_walked)
print('meters_forward:',meters_forward)

plt.plot(random_walk)
plt.show()