Transforming Data Flashcards
(10 cards)
Function used to find missing data?
df.isna()
Function used to count the missing values in each column?
df.isna().sum()
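Example (a minimal sketch on made-up data; the column names and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from the flashcards' dataset.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0, np.nan],
    "Embarked": ["S", "C", None, "S"],
    "Fare": [7.25, 71.28, 8.05, 53.10],
})

missing_mask = df.isna()               # True/False for every cell, same shape as df
missing_per_column = df.isna().sum()   # number of missing values in each column
total_missing = int(missing_per_column.sum())  # missing values in the whole table

print(missing_per_column)
print(total_missing)
```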
Function used to drop all rows that are missing data?
df.dropna(inplace=True)
Using inplace=True modifies df itself; without it, dropna() returns a new DataFrame and leaves df unchanged
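Example (a sketch on made-up data, comparing the default behaviour with inplace=True; the column names "a" and "b" are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: only the first row is complete.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]})

cleaned = df.dropna()      # returns a new DataFrame; df is not modified
rows_before = len(df)      # still the original row count

df.dropna(inplace=True)    # modifies df directly and returns None
rows_after = len(df)

print(rows_before, len(cleaned), rows_after)
```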
Function used to drop only the rows that are missing data in one column?
df.dropna(subset=['Embarked'], inplace=True)
df.isna().sum()
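Example (a sketch on made-up data; it shows that subset= only looks at the named column, so missing values elsewhere are left alone):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing Embarked value, one missing Age value.
df = pd.DataFrame({
    "Embarked": ["S", None, "C", "Q"],
    "Age": [22.0, 30.0, np.nan, 41.0],
})

# Drop only the rows where Embarked is missing.
kept = df.dropna(subset=["Embarked"])

print(kept.isna().sum())  # Embarked has no gaps now; the Age gap is still there
```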
Function used to drop a column directly?
df.drop(columns='Cabinet', inplace=True)
df.isna().sum()
Function used to drop all columns that are missing any data?
df.dropna(axis=1, inplace=True)
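What argument can be used to decide how much incomplete data to drop in columns?
thresh= sets the minimum number of non-missing values a column needs in order to be kept. It takes a count, not a fraction, so to drop columns with less than 45% of their data, convert the percentage to a count
df.dropna(axis=1, thresh=int(0.45 * len(df)), inplace=True)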
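Example (a sketch on made-up data; thresh is a count of non-missing values, so the 45% cutoff is converted to a row count first. The column names are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with 10 rows.
df = pd.DataFrame({
    "mostly_full": [1, 2, 3, np.nan, 5, 6, 7, 8, 9, 10],          # 9 of 10 present
    "mostly_empty": [1.0] + [np.nan] * 8 + [2.0],                 # 2 of 10 present
    "complete": list(range(10)),                                  # 10 of 10 present
})

# int() rounds down; use math.ceil instead if the cutoff should round up.
min_non_missing = int(0.45 * len(df))

kept = df.dropna(axis=1, thresh=min_non_missing)  # keep columns with enough data
strict = df.dropna(axis=1)                        # drop any column with a gap

print(list(kept.columns))
print(list(strict.columns))
```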
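How to fill missing data with a new category? What code is used?
df['Gender'] = df['Gender'].fillna('Missing')
Replaces the missing values in the Gender column with 'Missing'. Assigning the result back to the column is more reliable than calling fillna(inplace=True) on a single column, which newer pandas versions warn about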
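Example (a sketch on made-up data; only the missing entries are replaced, existing values are untouched):

```python
import pandas as pd

# Hypothetical toy data with two missing entries.
df = pd.DataFrame({"Gender": ["F", None, "M", None]})

# Assign the filled column back so the change is kept.
df["Gender"] = df["Gender"].fillna("Missing")

print(df["Gender"].tolist())
```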
How to fill missing data with an average (here, the median)?
median_age = df['Age'].median()
df['Age'] = df['Age'].fillna(median_age)
df.isna().sum()
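Example (a sketch on made-up data; median() skips missing values by default, so the gaps do not affect the result):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the median of 20, 30 and 40 is 30.
df = pd.DataFrame({"Age": [20.0, np.nan, 30.0, 40.0, np.nan]})

median_age = df["Age"].median()
df["Age"] = df["Age"].fillna(median_age)   # assign back to keep the change

print(median_age)
print(df.isna().sum())
```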
How to fill categorical data with the most common value in the column?
most_common_pet = df['Pet Type'].mode()[0]
df['Pet Type'] = df['Pet Type'].fillna(most_common_pet)
mode() returns a Series (there can be ties), so [0] picks the first most common value
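Example (a sketch on made-up data; mode() returns a Series rather than a single value, so the first entry is taken before filling):

```python
import pandas as pd

# Hypothetical toy data: "Dog" is the most common non-missing value.
df = pd.DataFrame({"Pet Type": ["Dog", "Cat", None, "Dog", None]})

modes = df["Pet Type"].mode()      # Series of the most common value(s)
most_common_pet = modes[0]         # first one if there is a tie
df["Pet Type"] = df["Pet Type"].fillna(most_common_pet)

print(most_common_pet)
print(df["Pet Type"].tolist())
```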