Pandas Flashcards

(67 cards)

1
Q

What indexer is used to subset rows by index label in pandas?

A

loc

loc is used to access a group of rows and columns by labels or a boolean array.

2
Q

How are row positions counted in pandas?

A

From 0

3
Q

What method is used to get the second row in a DataFrame?

A

iloc, with position 1: df.iloc[1]

4
Q

What does using -1 with iloc do?

A

Gets the last row
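
A minimal sketch of the label-based and position-based lookups from the cards above. The DataFrame `df`, its index labels and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: three rows with index labels "a", "b" and "c".
df = pd.DataFrame(
    {"year": [1952, 1957, 1962], "lifeExp": [28.8, 30.3, 32.0]},
    index=["a", "b", "c"],
)

by_label = df.loc["b"]     # row whose index label is "b"
second_row = df.iloc[1]    # positions start at 0, so 1 is the second row
last_row = df.iloc[-1]     # negative positions count from the end
```

Here `by_label` and `second_row` point at the same row, reached once by label and once by position.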

5
Q

What syntax stands for all rows when subsetting columns in pandas?

A

Colon (:)

A colon is used to refer to all rows when subsetting columns.

6
Q

How do you subset the first column using loc?

A

df.loc[:, ['column_name']], where 'column_name' is the label of the first column (loc takes labels, not positions)

7
Q

How can you select the last column using iloc?

A

Use -1 as the column position: df.iloc[:, -1]
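
A short sketch of column subsetting with both indexers. The column names and values are hypothetical; the colon in the row slot means "all rows".

```python
import pandas as pd

# Hypothetical three-column DataFrame.
df = pd.DataFrame({"country": ["X", "Y"], "year": [1952, 1957], "pop": [10, 20]})

first_col_by_label = df.loc[:, ["country"]]  # loc needs the column label
first_col_by_pos = df.iloc[:, [0]]           # iloc takes the integer position
last_col = df.iloc[:, -1]                    # -1 is the last column, returned as a Series
```

Passing a list (`["country"]` or `[0]`) keeps the result a DataFrame, while a bare label or position returns a Series.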

8
Q

What is the method to calculate the average life expectancy by year?

A

Split the data by year (groupby), then calculate the mean of the ‘lifeExp’ column for each group
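
One way to write this split-and-average step, assuming the year is stored in a column named `year` (an assumption; only `lifeExp` is named on the card). The numbers are invented.

```python
import pandas as pd

# Hypothetical data: two observations per year.
df = pd.DataFrame(
    {"year": [1952, 1952, 1957, 1957], "lifeExp": [30.0, 40.0, 35.0, 45.0]}
)

# Group rows by year, pick the lifeExp column, average within each group.
mean_by_year = df.groupby("year")["lifeExp"].mean()
```

The result is a Series indexed by year, with one mean per group.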

9
Q

What method can be used to flatten a DataFrame?

A

reset_index
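
The card does not say what "flatten" refers to. One common reading, assumed here, is turning the index levels of a grouped result back into ordinary columns. Column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data grouped on two keys.
df = pd.DataFrame(
    {
        "year": [1952, 1952, 1957],
        "continent": ["Asia", "Asia", "Europe"],
        "lifeExp": [30.0, 40.0, 60.0],
    }
)

grouped = df.groupby(["year", "continent"])["lifeExp"].mean()  # Series with a two-level index
flat = grouped.reset_index()  # index levels become regular columns again
```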

10
Q

What function is used to get counts of unique values on a Pandas Series?

A

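value_counts

value_counts returns how many times each unique value occurs; nunique returns only the number of distinct values.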
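
A small comparison of `value_counts` and `nunique` on a made-up Series, since the two are easy to confuse.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a"])  # hypothetical values

counts = s.value_counts()  # occurrences of each distinct value
n_distinct = s.nunique()   # how many distinct values there are
```

`counts` is a Series indexed by value, while `n_distinct` is a single integer.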

11
Q

What is a histogram?

A

Vertical bar chart of frequencies

12
Q

What type of graph is a frequency polygon?

A

Line graph of frequencies

13
Q

What does an ogive represent?

A

Line graph of cumulative frequencies

14
Q

What type of chart provides proportional representation for categories of a whole?

A

Pie Chart

15
Q

What are the methods of visual presentation of data?

A
  • Table
  • Graphs
  • Pie Chart
  • Multiple bar chart
  • Simple pictogram
16
Q

What is a frequency distribution?

A

A summary of how often different values occur in a dataset.

17
Q

What is the cumulative frequency?

A

The running total of frequencies up to a certain class interval.
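
A possible way to compute a frequency distribution and its running total in pandas. The scores are invented, and raw values are used here instead of class intervals to keep the sketch short.

```python
import pandas as pd

scores = pd.Series([1, 2, 2, 3, 3, 3])  # hypothetical observations

freq = scores.value_counts().sort_index()  # frequency of each value, in value order
cum_freq = freq.cumsum()                   # running total of those frequencies
```

The last entry of `cum_freq` equals the total number of observations.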

18
Q

What does a Pareto chart display?

A

Frequency of categories in descending order

19
Q

What is the principle of excellent graphs regarding data distortion?

A

The graph should not distort the data

20
Q

What should the scale on the vertical axis of a graph begin with?

A

Zero

21
Q

What is considered ‘chart junk’?

A

Unnecessary adornments in a graph

22
Q

True or False: All axes in a graph should be properly labeled.

A

True

23
Q

What is the simplest possible graph used for?

A

To represent a given set of data

24
Q

What is a graphical error related to compressing the vertical axis?

A

Misleading representation of data

25
Fill in the blank: The method to create a frequency polygon is to plot the __________ against the midpoints of the class intervals.
Frequency
26
What is the purpose of a scatter plot?
To show the relationship between two variables
27
What should a good presentation of data avoid?
Graphical errors
28
What is the command to install the pandas library using pip?
pip install pandas
29
True or False: Pandas is primarily used for data manipulation and analysis in Python.
True
30
Fill in the blank: To load a CSV file into a pandas DataFrame, you would use the function ___.
pd.read_csv()
31
What is the primary data structure used in pandas?
DataFrame
32
How do you access the first five rows of a DataFrame called 'df'?
df.head()
33
What method would you use to view the last three rows of a DataFrame?
df.tail(3)
34
True or False: You can access a column in a DataFrame using the dot notation.
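True, provided the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method (for example, df.Age)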
35
What is the command to access the 'Age' column from a DataFrame named 'df'?
df['Age']
36
What syntax would you use to select rows based on a condition?
Boolean indexing: df[df['column_name'] <condition>], for example df[df['column_name'] > 0]
37
How can you subset a DataFrame to include only rows where the 'Salary' is greater than 50000?
df[df['Salary'] > 50000]
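
A sketch of what happens in two steps when filtering on a condition. The names and salaries are made up.

```python
import pandas as pd

# Hypothetical employee data.
df = pd.DataFrame({"Name": ["Ann", "Bob", "Cy"], "Salary": [40000, 60000, 75000]})

mask = df["Salary"] > 50000  # boolean Series, one True/False per row
high_paid = df[mask]         # keeps only the rows where the mask is True
```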
38
What does the .iloc indexer do in pandas?
It allows indexing and selecting by integer position.
39
How do you select the first row of a DataFrame using .iloc?
df.iloc[0]
40
True or False: You can slice a DataFrame using .loc and .iloc.
True
41
What is the syntax to access a specific cell at row index 2 and column 'Name'?
df.at[2, 'Name']
42
Fill in the blank: To select multiple columns, you can pass a list to the DataFrame like this: df[___].
['column1', 'column2']
43
What is the command to load an Excel file into a pandas DataFrame?
pd.read_excel()
44
How do you rename a column in a DataFrame?
df.rename(columns={'old_name': 'new_name'}, inplace=True)
45
True or False: Pandas can handle missing data.
True
46
What command would you use to check for missing values in a DataFrame?
df.isnull().sum()
47
What method is used to drop rows with missing values?
df.dropna()
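
The two missing-data cards above can be tried on a small made-up DataFrame. The column names and the placement of the gaps are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame(
    {"Name": ["Ann", None, "Cy"], "Age": [30.0, 25.0, float("nan")]}
)

missing_per_column = df.isnull().sum()  # count of missing values per column
complete_rows = df.dropna()             # new DataFrame without rows containing gaps
```

`dropna()` returns a new DataFrame by default and leaves `df` unchanged.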
48
How do you select rows with index labels 1 to 3 using .loc?
df.loc[1:3]
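
A quick illustration of how this label slice differs from a position slice, on a made-up DataFrame with the default integer index.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 11, 12, 13, 14]})  # default index labels 0..4

by_label = df.loc[1:3]  # labels 1, 2 and 3: the end label is included
by_pos = df.iloc[1:3]   # positions 1 and 2: the end position is excluded
```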
49
What does the .shape attribute return?
It returns a tuple (number of rows, number of columns) giving the dimensions of the DataFrame.
50
Fill in the blank: To filter a DataFrame based on multiple conditions, you can use ___ operators.
logical (&, |, ~), with each condition wrapped in parentheses
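
A minimal sketch of combining conditions, using hypothetical `Department` and `Salary` columns. pandas uses the element-wise operators `&`, `|` and `~` here, not the Python keywords `and`, `or` and `not`.

```python
import pandas as pd

# Hypothetical employee data.
df = pd.DataFrame(
    {"Department": ["Sales", "Sales", "IT"], "Salary": [40000, 60000, 70000]}
)

# Each condition needs its own parentheses because & binds tighter than == and >.
sales_high = df[(df["Department"] == "Sales") & (df["Salary"] > 50000)]
not_sales = df[~(df["Department"] == "Sales")]
```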
51
What is the syntax to select the 'Name' and 'Age' columns from a DataFrame?
df[['Name', 'Age']]
52
True or False: You can use the .query() method to filter DataFrames using a query string.
True
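
For comparison, a sketch of a filter written as a query string. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical employee data.
df = pd.DataFrame(
    {"Department": ["Sales", "Sales", "IT"], "Salary": [40000, 60000, 70000]}
)

# Column names are referenced directly inside the query string.
result = df.query("Department == 'Sales' and Salary > 50000")
```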
53
What do you use to reset the index of a DataFrame?
df.reset_index()
54
What function is used to concatenate two DataFrames?
pd.concat()
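
A short example of stacking two made-up DataFrames row-wise. `ignore_index=True` is an optional choice here so that the result gets a fresh 0..n-1 index.

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2]})  # hypothetical inputs
b = pd.DataFrame({"x": [3, 4]})

# pd.concat takes a list of DataFrames and stacks them along the rows by default.
stacked = pd.concat([a, b], ignore_index=True)
```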
55
How can you access a specific row by its index using .loc?
df.loc[index]
56
What is the difference between .loc and .iloc?
.loc is label-based, while .iloc is position-based.
57
Fill in the blank: The command to save a DataFrame to a CSV file is df.___('filename.csv').
to_csv
58
What is the method to group data in a DataFrame?
df.groupby()
59
How do you access rows where the 'Department' is 'Sales'?
df[df['Department'] == 'Sales']
60
True or False: You can use .apply() to apply a function along an axis of the DataFrame.
True
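
A sketch of what "along an axis" means for `.apply()`, on invented data: `axis=0` passes each column to the function, `axis=1` passes each row.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})  # hypothetical data

col_ranges = df.apply(lambda col: col.max() - col.min(), axis=0)  # one result per column
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)      # one result per row
```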
61
What is the purpose of the .sort_values() method?
It sorts the DataFrame by the specified column(s).
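
A brief example with made-up names and ages. `ascending=False` is used here to show a descending sort; the default is ascending.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ann", "Bob", "Cy"], "Age": [30, 25, 35]})  # hypothetical data

by_age_desc = df.sort_values("Age", ascending=False)  # returns a new, sorted DataFrame
```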
62
How do you select a specific subset of rows and columns in a DataFrame?
df.loc[row_indices, ['column1', 'column2']]
63
What is the command to get descriptive statistics of a DataFrame?
df.describe()
64
Fill in the blank: You can create a new column in a DataFrame by assigning to df['___'].
new_column
65
What does the .info() method provide?
It provides a summary of the DataFrame including the data types and non-null counts.
66
How do you filter a DataFrame to include only unique values in a column?
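df['column_name'].unique() returns the distinct values as an array; to keep one row per distinct value, use df.drop_duplicates(subset='column_name')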
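
The difference between listing distinct values and keeping de-duplicated rows, sketched on hypothetical data.

```python
import pandas as pd

# Hypothetical data with a repeated department.
df = pd.DataFrame(
    {"Department": ["Sales", "IT", "Sales"], "Name": ["Ann", "Bob", "Cy"]}
)

values = df["Department"].unique()                    # array of distinct values only
first_rows = df.drop_duplicates(subset="Department")  # first row for each distinct value
```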
67
What command would you use to drop a specific column from a DataFrame?
df.drop('column_name', axis=1, inplace=True)
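
As an alternative to `inplace=True`, a sketch that keeps the original DataFrame and assigns the result to a new name. The column names are made up.

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2]})  # hypothetical data

trimmed = df.drop(columns="b")  # same effect as drop("b", axis=1), returned as a new DataFrame
```

Without `inplace=True`, `df` itself still has both columns.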