Pandas Flashcards
(67 cards)
What indexer is used to subset rows by index label in pandas?
loc
loc is used to access a group of rows and columns by labels or a boolean array.
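A minimal sketch of the three uses of loc named on this card. The DataFrame, its index labels, and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical example data; labels "a", "b", "c" are assumptions.
df = pd.DataFrame(
    {"country": ["Chile", "Ghana", "Japan"], "lifeExp": [78.6, 60.0, 82.6]},
    index=["a", "b", "c"],
)

row_b = df.loc["b"]                      # one row by label, returned as a Series
rows_ac = df.loc[["a", "c"]]             # several rows by label, returned as a DataFrame
long_lived = df.loc[df["lifeExp"] > 70]  # rows picked by a boolean array
```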
Where does Python start counting rows?
From 0
What method is used to get the second row in a DataFrame?
iloc (df.iloc[1], because positions start at 0)
What does using -1 with iloc do?
Gets the last row
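The two iloc cards above can be sketched as follows. The data is invented for the example.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"year": [1952, 1957, 1962], "lifeExp": [28.8, 30.3, 32.0]})

second_row = df.iloc[1]   # positions start at 0, so position 1 is the second row
last_row = df.iloc[-1]    # -1 counts back from the end, giving the last row
```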
What syntax is used to subset columns with loc or iloc?
Colon (:)
A colon is used to refer to all rows when subsetting columns.
How do you subset the first column using loc?
df.loc[:, [columns]], with the first column's label in the list (loc selects by label, not position)
How can you select the last column using iloc?
df.iloc[:, -1]
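A short sketch of column subsetting with both indexers, assuming a small made-up DataFrame. The colon in the row slot means "all rows".

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame(
    {"country": ["Chile", "Ghana"], "year": [2007, 2007], "lifeExp": [78.6, 60.0]}
)

first_by_label = df.loc[:, ["country"]]  # loc takes column labels
first_by_position = df.iloc[:, [0]]      # iloc takes integer positions
last_by_position = df.iloc[:, -1]        # -1 selects the last column as a Series
```

Passing a list (["country"] or [0]) returns a DataFrame, while a single label or position returns a Series.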
What is the method to calculate the average life expectancy by year?
Split the data by year with groupby and calculate the mean of the ‘lifeExp’ column: df.groupby('year')['lifeExp'].mean()
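One way to sketch the split-and-average step is with groupby. The column names year and lifeExp come from the card, and the values are invented.

```python
import pandas as pd

# Hypothetical example data with the 'year' and 'lifeExp' columns from the card.
df = pd.DataFrame(
    {
        "country": ["Chile", "Ghana", "Chile", "Ghana"],
        "year": [2002, 2002, 2007, 2007],
        "lifeExp": [77.0, 58.0, 79.0, 60.0],
    }
)

# Split the rows by year, then take the mean of lifeExp within each group.
mean_by_year = df.groupby("year")["lifeExp"].mean()
```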
What method can be used to flatten a DataFrame that has a hierarchical index, for example after a groupby?
reset_index
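A sketch of what "flatten" can mean here, assuming the result of a groupby on two keys. The continent column and all values are made up.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame(
    {
        "year": [2002, 2002, 2007, 2007],
        "continent": ["Africa", "Asia", "Africa", "Asia"],
        "lifeExp": [50.0, 70.0, 52.0, 72.0],
    }
)

# Grouping on two keys gives a Series with a two-level (hierarchical) index.
grouped = df.groupby(["year", "continent"])["lifeExp"].mean()

# reset_index moves the index levels back into ordinary columns.
flat = grouped.reset_index()
```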
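What method is used to get the number of unique values in a pandas Series?
nunique
nunique returns how many distinct values there are; value_counts returns how often each one occurs.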
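A small sketch contrasting nunique with the related value_counts method, using a made-up Series.

```python
import pandas as pd

# Hypothetical example data.
continents = pd.Series(["Africa", "Asia", "Africa", "Europe", "Africa"])

n_distinct = continents.nunique()       # number of distinct values
frequencies = continents.value_counts() # how often each distinct value occurs
```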
What is a histogram?
Vertical bar chart of frequencies
What type of graph is a frequency polygon?
Line graph of frequencies
What does an ogive represent?
Line graph of cumulative frequencies
What type of chart provides proportional representation for categories of a whole?
Pie Chart
What are the methods of visual presentation of data?
- Table
- Graphs
- Pie Chart
- Multiple bar chart
- Simple pictogram
What is a frequency distribution?
A summary of how often different values occur in a dataset.
What is the cumulative frequency?
The running total of frequencies up to a certain class interval.
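A frequency distribution and its cumulative frequency can be sketched in pandas as below. The scores and the class intervals are assumptions chosen for the example, and pd.cut is just one way to form the classes.

```python
import pandas as pd

# Hypothetical example data and class boundaries.
scores = pd.Series([52, 57, 61, 64, 68, 71, 73, 77, 85, 92])
bins = [50, 60, 70, 80, 90, 100]

# Assign each score to a class interval; right=False makes intervals like [50, 60).
classes = pd.cut(scores, bins=bins, right=False)

# Frequency distribution: how many scores fall in each class, in interval order.
freq = classes.value_counts().sort_index()

# Cumulative frequency: running total of the class frequencies.
cum_freq = freq.cumsum()
```

Drawn as bars, freq gives a histogram-style chart; drawn as a line, cum_freq gives an ogive.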
What does a Pareto chart display?
Frequency of categories in descending order
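The table behind a Pareto chart can be sketched as follows. The defect categories are invented, and the cumulative percentage is an addition that many Pareto charts overlay as a line.

```python
import pandas as pd

# Hypothetical example data.
defects = pd.Series(["scratch", "dent", "scratch", "crack", "scratch", "dent"])

# value_counts sorts by frequency in descending order by default.
counts = defects.value_counts()

# Running share of the total, in percent.
cum_pct = counts.cumsum() / counts.sum() * 100
```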
What is the principle of excellent graphs regarding data distortion?
The graph should not distort the data
What should the scale on the vertical axis of a graph begin with?
Zero
What is considered ‘chart junk’?
Unnecessary adornments in a graph
True or False: All axes in a graph should be properly labeled.
True
What is the simplest possible graph used for?
To represent a given set of data
What is a graphical error related to compressing the vertical axis?
Misleading representation of data