Python Flashcards

(187 cards)

1
Q

Dataframes: How to select all rows of a df where in col1 “Black” and “White” appear?

A

df[df[“col1”].isin([“Black”,”White”])]

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Dataframes: How to select 2 columns?

A

df[[“col1”,”col2”]]

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Dataframes: How to sort 2 columns of a df in descending order?

A

df.sort_values([“col1”,”col2”], ascending=[False, False])

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Dataframes: How to get the values of a df?

A

df.values

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Dataframes: How to get the description of a df?

A

df.describe()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Dataframes: How to get a shape of a df?

A

df.shape

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Dataframes: How to get info about a df?

A

df.info()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Merging to the nearest date-times: How to replicate VLOOKUP(range_lookup=TRUE)?

A

pd.merge_asof(df1,df2,direction=”backward”) - direction:backward, forward, nearest

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Advantage of pd.merge_ordered() vs. pd.merge?

A

pd.merge_ordered() has fill_method

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Joining data: 2dfs (current, drafted) merged on column name and having the suffixes (‘_current’, ‘_drafted’) - how to sort the new df based on the column on which was merged?

A

current.merge(drafted, on=”name”, suffixes=(“_current”, “_drafted”), sort=True)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Joining data: Merge 2dfs (current, drafted) with an outer join and suffixes (‘_current’,’_drafted’) - how to show from which df a value is coming from?

A

current.merge(drafted, how=”outer”, on=”name”, suffixes=(“_current”,”_drafted”), indicator=True)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Joining data: Merging the df teams having the column player_id with df positions having player_id as index?

A

teams.merge(positions, left_on=”player_id”, right_index=True)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the joint default values for pd.merge and pd.merge_ordered?

A

pd.merge=inner join, pd.merge_ordered=outer join

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Dataframes: How to use pd.merge_ordered with “ffill”?

A

pd.merge_ordered(df1, df2, on=”Date”, fill_method=”ffill”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Dataframes: How to merge two dataframes with outer and sort index column called “Date”?

A

pd.merge(df1, df2, how=”outer”).sorted_values(“Date”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Dataframes: How to join dataframes with integrated .join function with an inner join?

A

df1.join(df2, how=”inner”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Dataframes: How to make an “inner” join with pd.merge on 2 columns and 2 suffixes?

A

pd.merge(df1, df2, on=[“col1”,”col2”], suffixes=[“_1”,”_2”], how=”inner”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Dataframes: How to merge two dataframes on different columns in both dataframes?

A

pd.merge(df1,df2, left_on=”col1”, right_on=”col2”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Dataframes: How to merge on several columns and/with using suffixes?

A

pd.merge(df1,df2,on=[“col1”,”col2”],suffixes=[“_1”,”_2”])

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Dataframes: How to merge on several columns?

A

pd.merge(df1,df2,on=[“col1”,”col2”])

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Dataframes: How to merge on one specific column?

A

pd.merge(df1,df2,on=”col1”)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Dataframes: What happens if pd.merge() is applied on 2 dataframes without any additional arguments?

A

computes a merge on all columns that occur in both dataframes, this is by default an “inner” join because it glues together only rows that match in the joining column of BOTH dataframes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Dataframes: Pandas: How to extend/ replace concat() with the ability to align rows using multiple columns?

A

pd.merge()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Dataframes: how to sort data by index?

A

df.sort_index()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Dataframes: How to use pd.concat to apply multi-indexing?
df_new=pd.concat([df1,df2],keys=[2020,2021],axis=0)
26
Dataframes: Print all rows that contain the name "Tom" in the column "name"
print(df.loc[df["name"]=="Tom"])
27
Dataframe: When using .append: how to make a new RangeIndex of unique integers for each row?
df.total=df1.append(df2,ignore_index=True)
28
Concatenating rows: How to stack them horizontally?
pd.concat([df1,df2],axis=1)
29
Concatenating rows: How to stack them vertically?
pd.concat([df1,df2], axis=0)
30
Dataframes: How to get the name of an index?
df.index.name
31
Dataframes: How to use concat and reset index?
df_new=pd.concat([df1,df2], ignore_index=True)
32
Dataframes: How to use "append" with reset index?
df_new=df.append(df2).reset_index(drop=True)
33
Dataframes: How to replace all "F" with "C" in all column names in a df?
df_new.columns=df.columns.str.replace("F","C")
34
Dataframes: with .add(): how to add several dataframes with a fill_value=0?
bronze.add(silver,fill_value=0).add(gold,fill_value=0)
35
Dataframes: How to add one dataframe to another?
df1.add(df2)
36
Dataframe: How to add one dataframe to another and use 0 if value doesn't exist?
df1.add(df2, fill_value=0)
37
Dataframes: how to calculate the percent changes?
df.pct_change()
38
Dataframes: How to divide one dataframe by another?
df.divide(df2, axis="rows")
39
Dataframes: How to use .loc for column "sales" for "Jan" to "Mar"?
df.loc["Jan":"Mar","sales"]
40
Dataframes: How to use .loc for the columns "Sales" and "Price" for "Jan" to "Mar"?
df.loc["Jan":"Mar",["Sales","Price"]]
41
Dataframes: if index is a date column how can the dates be parsed?
pd.read_csv("sales.csv", index_col="Dates", parse_dates=True)
42
Dataframes: When reindexing: time chaining method to replace the null values with the last preceding non-null value?
sales.reindex().ffill()
43
Dataframe: How to sort the index in reverse order?
df.sort_index(ascending=False)
44
Dataframes: How to sort the values in a column?
df.sort_values("col1")
45
Dataframes: how to drop rows with NA?
df.dropna()
46
What happens when a dataframe is reindexed with a missing label?
an empty row will be inserted
47
Reindex a dataframe with different dataframe?
df.reindex(df2.index)
48
How to sort the index of a dataframe?
df.sort_index()
49
Reindex dataframes: with first three months?
df=df.reindex(["Jan","Feb","Mar"])
50
Show index of a dataframe?
print(df.index)
51
With pd.read_csv: how to determine the index column?
index_col
52
Using a loop to append csv files into a dataframe?
filenames=["df1.csv","df2.csv"], dataframes=[pd.read_csv(f) for f in filenames]
53
How to import csv files with * and glob?
from glob import glob, filenames=glob("sales*.csv"), dataframes=[pd.read_csv(f) for f in filenames]
54
Dataframes: How to make an "outer" join with pd.concat and horizontally?
pd.concat([df1,df2], axis=1, join="outer")
55
Dataframes: How to make an "inner" join with pd.concat and horizontally?
pd.concat([df1,df2], axis=1, join="inner")
56
Pandas: How to sort by index values?
df.sort_index()
57
Directly on df: How to rotate x-axis by 45-degrees?
by putting "rot" into plot - df.plot(... rot=45), plt.show()
58
Directly on df: How to make a scatter plot?
df.plot(x="column1", y="column2", kind="scatter"), plt.show()
59
Directly on df: How to add a legend?
plt.legend(["title1","title2"])
60
Directly on df: How to make a histogram partially transparent?
df.hist(..., alpha=0.7)
61
How to detect missing values with True and False in a df?
df.isna()
62
How to detect missing values with True and False in a df in any column=
df.isna().any()
63
How to count the missing values in a column?
df.isna().sum()
64
How to visualize in a boxplot the number of missing values in a dataframe?
df.isna().sum().plot(kind="box"), plt.show()
65
How to remove rows with missing values in a df?
df.dropna()
66
How to fill NaN values in a df?
df.fillna()
67
Example of a very simple dictionary?
my_dict={"key1":value1, "key2":value2, "key3":value3}
68
How to access the value of a dictionary?
my_dict["key1"]
69
How to make a list of dictionaries by row and then put it into a dataframe?
list_of_dicts=[{"key1":value1, "key2":value2, "key3":value3}, {"key1":value4, "key2":value5, "key3":value6}] + pd.DataFrame(list_of_dicts)
70
How to make a dictionary of lists by column and then put it into a dataframe?
dict_of_lists={"key1":["value1","value2"], "key2":["value3","value4"], "key3":["value5","value6"]} + pd.DataFrame(dict_of_lists)
71
Directly on df: How to show a histogram with 20 bins?
df.plot(kind="hist", bins=20) + plt.show()
72
Directly on df: Bar plot with title?
df.plot(kind="bar", title="irgendwas") + plt.show()
73
Directly on df: How to make a line chart with x-axis based on a column and y-axis based on a different column?
df.plot(kind="line", x="col1", y="col2") + plt.show()
74
Direct on df: How to show a histogram?
df.plot(kind="hist") + plt.show()
75
Pandas: How to set a column as the index?
df=df.set_index("col1")
76
Pandas: How to remove an index?
df.reset_index()
77
Pandas: How to drop an index and delete the respective column?
df.reset_index(drop=True)
78
Pandas: How to use .loc to get the same result: df[df["col1"].isin(["name1","name2"])]?
df.loc[["name1","name2"]]
79
Pandas: How to use pivot_table with 0 instead of NaN for missing values and having the mean at the end of rows and columns?
df.pivot_table(values="col1", index="col2", columns="col3", fill_value=0, margins=True)
80
How to transform df.groupby(["col1","col2","col3"]).mean() into a pivot_table?
df.pivot_table(values="col1", index="col2", columns="col3")
81
Pandas: How to use multiple statistics with pivot_table?
df.pivot_table(values="col1", index="col2", aggfunc=[np.mean, np.median])
82
Pandas: How to change df.groupby("col1")["col2"].mean() into a pivot table?
df.pivot_table(values="col2", index="col1")
83
Pandas: How to group by multiple variables?
df.groupby(["col1","col2"])["col3"].mean()
84
Pandas: How to get the min, max and sum for a column based on the characteristic of another column and groupby?
df.groupby("col1")["col2"].agg([min,max,sum])
85
Pandas: How to get the mean values of a column based on the characteristics of another column and groupby?
df.groupby("col1")["col2"].mean()
86
Pandas: How to show the proportionate amount of how often a value appears in a column?
df["col1"].value_counts(normalize=True)
87
Pandas: How to count the values in a column and have the highest values first?
df["col1"].value_counts(sort=True)
88
Pandas: How to count the values in a column?
df["col1"].value_counts()
89
Pandas: How to drop duplicates based on 2 columns?
unique=df.drop_duplicates(subset=["col1","col2"])
90
Pandas: How to drop duplicates base on a column?
df.drop_duplicates(subset="col1")
91
Pandas: How to get the average of a column?
df["col1"].mean()
92
Pandas: How to get the variance of a column?
df["col1"].var()
93
Pandas: How to calculate the cumulative max?
df["col1"].cummax()
94
Pandas: How to calculate the cumulative sum?
df["col1"].cumsum()
95
Pandas: How to get the sum of a column?
df["col1"].sum()
96
Pandas: How to get the maximum value of a column?
df["col1"].max()
97
Pandas: How to merge 3 dfs at once? First two on col1 and then col2.
df.merge(df2, on="col1") \ | .merge(df3, on="col2")
98
Pandas: How to count the number of rows in col1 where something is missing?
df["col1"].isnull().sum()
99
Datetime: How to print/present a datetime object in a certain format?
datetime.datetime.strftime("%Y-%m-%d")
100
Datetime: How to transform a string into a datetime format?
datetime.datetime.strptime(date_str, "%Y-%m-%d")
101
Datetime: How to calculate the difference between datetimes?
with timedelta
102
Datetime: How to substract 1 week from datetime?
datetime.now() - timedelta(weeks = 1)
103
Dictionaries: How to ask the value of a key and not getting an error message if the value doesn't exist?
my_dict.get("key1")
104
Dictionaries: How to ask the value of a key and getting a certain message if the value doesn't exist?
my_dict.get("key1", "certain_message")
105
Dictionaries: How to delete a value from the dictionary?
del(my_dict["key1"])
106
Module to import data from Yahoo Finance?
yfinance
107
yfinance: Download of SPY and AAPL?
data = yf.download("SPY AAPL", start="2017-01-01", end="2017-04-30")
108
yfinance: Download of AAPL dividends?
apple= yf.Ticker("aapl"), apple.dividends
109
How to calculate log-returns for msft['Log_Ret'] based on msft['Close']?
msft['Log_Ret'] = np.log(msft['Close'] / msft['Close'].shift(1))
110
How to calculate volatility for msft['Volatility'] based on msft['Log_Ret']?
msft['Volatility'] = msft['Log_Ret'].rolling(252).std() * np.sqrt(252)
111
How plot the two columns msft['Close'] and msft['Volatility'] in two subplots but one graph and having for the lines the color blue and a figsize of 8,6?
msft[['Close', 'Volatility']].plot(subplots=True, color='blue', figsize=(8, 6))
112
How to time codes, i.e. how long it takes to execute them?
with %timeit or %%timeit
113
Numpy: How to create a random standard normal variable?
np.random.standard_normal()
114
Numpy: How to insert values based on values of another column?
np.where(sp500['42-252'] > SD, 1, 0)
115
Backtesting a strategy: How to determine the backtest regime (sp500['Regime']) based on column sp500['42-252'] and criteria SD?
sp500['Regime']=np.where(sp500['42-252'] > SD, 1, 0) sp500['Regime']=np.where(sp500['42-252'] < -SD, -1, sp500['Regime']) sp500['Regime'].value_counts()
116
Backtesting a strategy: How to calculate the day to day return of the strategy in the column sp500['Strategy_Returns'] based on column sp500['Regime'] and sp500['Market_Returns']?
sp500['Strategy_Returns'] = sp500['Regime'].shift(1) * sp500['Market_Returns']
117
Backtesting a strategy: How to plot the day to day returns of a strategy as well as the S&P as a cumulating sum based on the columns sp500['Strategy_Returns'] and sp500['Market_Returns']? The graph should show grids and have a figsize of 8,5.
sp500[['Strategy_Returns', 'Market_Returns']].cumsum().apply(np.exp).plot(grid=True, figsize=(8, 5))
118
String objects: How to capitalize string s?
s.capitalize()
119
String objects: How to split string s?
s.split()
120
String objects: How to find a string ("string") in the string s?
s.find('string')
121
String objects: How to count the number of occurrences of a substring in a string called s?
s.count(sub[, start[, end]])
122
String: How to separate a list of words in a string called s with "," as separator?
s.split(",")
123
String: How to replace something in a string called s?
s.replace(old, new)
124
Lists: How to replace the ith element by x in a list called l?
l[i]=x
125
Lists: How to replace every kth element from i to j - 1 by s?
l[i:j:k] = s
126
Lists: How to append the element x to the list object called l?
l.append(x)
127
Lists: How to count the number of occurences of the element object x in the list object called l?
l.count(x)
128
Lists: How to delete every kth elements in a list called l with index values i to j – 1?
del l[i:j:k]
129
Lists: How to append all elements of s to an object called l?
l.extend(s)
130
Lists: How to show the first index number of x between elements i and j - 1?
l.index(x[, i[, j]])
131
Lists: How to insert x at/before index i into a list called l?
l.insert(i, x)
132
Lists: How to remove the value at index i from a list called l?
l.remove(i)
133
Lists: How to remove an element from a list called l with index i and return it?
l.pop(i)
134
Lists: How to reverse all items in place in a list called l?
l.reverse()
135
Lists: How to sort all items in place in a list called l?
l.sort()
136
Dict: Item of d with k?
d[k]
137
Dict: Setting item key k to x?
d[k]=x
138
Dict: Deleting item with key k?
del d[k]
139
Dict: Removing all items in d?
d.clear()
140
Dict: How to make a copy of dict d?
d.copy()
141
Dict: How to check if True if k is a key?
d.has_key(k)
142
Dict: Copy of all key-value pairs of dict d?
d.items()
143
Dict: Iterator over all items of dict d?
d.iteritems()
144
Dict: Iterator over all keys of dict d?
d.iterkeys()
145
Dict: Iterator over all values of dict d?
d.itervalues()
146
Dict: Copy of all keys of dict d?
d.keys()
147
Dict: How to return and remove item key k in dict d?
d.popitem(k)
148
Dict: How to update items in d with items from d2?
d.update({d2})
149
Dict: How to show all values of dictionary d?
d.values()
150
Matplotlib: How to show grids?
plt.grid(True)
151
plt.axis: How to turn axis and lines and labels off?
plt.axis('off')
152
plt.axis: How to make equal scaling?
plt.axis('equal')
153
plt.axis: How make equal scaling via dimension changes?
plt.axis('scaled')
154
plt.axis: How to make all data visible (tightens limits)?
plt.axis('tight')
155
plt.axis: How to make all data visible (with data limits)?
plt.axis('image')
156
plt.axis: How to determine minimum and maximum values of x and y?
xmin, xmax, ymin, ymax or ylim and xlim
157
matplotlib: How to determine title of a graph?
plt.title()
158
Matplotlib: How to determine x and y labels?
plt.xlabel and plt.ylabel
159
Matplotlib: How make a graph with data y and with red circles?
plt.plot(y, 'ro')
160
Matplotlib: How to position the legend?
plt.legend(loc=0)
161
Matplotlib: How to make a second y-axis?
fig, ax1 = plt.subplots() | ax2 = ax1.twinx()
162
Matplotlib: How to address a subplot?
plt.subplot(numrows, numcols, fignum)
163
Matplotlib: How to plot two values in a Histogram with data y(having two columns), with the labels "1st" and "2nd" and 25 bins next to each other?
plt.hist(y, label=['1st','2nd'], bins=25)
164
Show the index of a dataframe?
df.index
165
Show the columns of a dataframe?
df.columns
166
Pandas: creating a DatetimeIndex object for 1.1.2015 with frequency months and occurence 9 times?
pd.date_range('2015-1-1', periods=9, freq='M')
167
Python module to import data via URL?
from urllib.request import urlretrieve
168
How to use urlretrieve to save a txt file from the web locally?
url='https://www.stoxx.com/document/Indices/Current/HistoricalData/h_vstoxx.txt' urlretrieve(url, './data/vstoxx.txt')
169
How to remove all blanks in a txt file?
lines=open('./data/vstoxx.txt', 'r').readlines() | lines=[line.replace(' ','') for line in lines]
170
How to fill in a dataframe the NA values with the last available values from the data series?
data.fillna(method='ffill')
171
How to make a regression with xdat and ydat?
pd.ols(y=ydat, x=xdat)
172
How to calculate the correlation between two variables in df "rets" having the columns .SPX and .VIX?
rets.corr()
173
How to import a csv file (data.csv) and read all lines?
csv_file=open(path + 'data.csv' + 'r') | content = csv_file.readlines()
174
How to use the arange function of NumPy to generate minutes data from 2020-01-01 10:00:00 to 2020-12-31 10:00:00)?
np.arange('2020-01-01 10:00:00', '2020-12-31 10:00:00', dtype='datetime64[m]')
175
How to turn the grid on in a graph?
plt.grid(True)
176
How to create 10 random numbers?
[random.random() for _ in range(10)]
177
How to create 10 random numbers between 1 and 5?
[random.randint(1,5) for _ in range(10)]
178
How to make a random amount of letters of the alphabet and then count which letter occurs how many times?
``` import random import string lc=string.ascii_lowercase rc=[random.choice(lc) for _ in range(10000)] c={c:rc.count(c) for c in lc} ```
179
Pandas: How to copy/insert a website into a variable?
pd.read_html(url)
180
If share price is in df data: how to normalize the share price and plot it in a 8,6 figsize?
(data/data.ix[0]*100).plot(figsize=(8, 6))
181
If share price is in df data: How to calculate log-returns?
np.log(data/ data.shift(1))
182
How to calculate from a df of daily log-returns a covariance matrix?
rets.cov()*252
183
How to make a colorbar for a scatter plot with the label "Sharpe ratio"?
plt.colorbar(label='Sharpe ratio')
184
Pandas: How to read data from an Excel file from a workbook called "workbook.xlsx" and sheet called "first_sheet"?
df=pd.read_excel(path + 'workbook.xlsx', 'first_sheet', header=None)
185
Pandas: How to write a df into an Excel sheet called "new_book_1.xlsx" and into "my_sheet"?
df.to_excel(path + "new_book_1.xlsx", "my_sheet")
186
How to import matplotlib module?
import matplotlib.pyplot as plt
187
How to create figure and axis object in matplotlib?
fig, ax = plt.subplots()