Pandas Theoretical Questions: Flashcards

(10 cards)

1
Q

What is the primary purpose of the Pandas library in Python, as described in the presentation?

A

Pandas is an open-source library used for data manipulation and analysis, particularly for numerical tables and time series, built on top of NumPy (Page 12).

2
Q

Who started developing the Pandas library, and in which year did the development begin?

A

Wes McKinney started developing Pandas in 2008 while working at AQR Capital Management; the presentation gives the year as 2007 (Page 12).

3
Q

Name two limitations of NumPy that Pandas addresses.

A

NumPy arrays have no built-in row or column labels and are homogeneous (one data type per array), while Pandas provides labeled indexing and lets each column hold its own data type (Page 9).
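
A minimal sketch of both limitations, using made-up sample values and labels (`row_a`, `row_b`, and the column names are illustrative only, not from the presentation):

```python
import numpy as np
import pandas as pd

# A plain NumPy array has a single dtype, so mixing numbers and text
# coerces every element to a string.
arr = np.array([1, 2.5, "three"])
print(arr.dtype.kind)  # 'U' means a Unicode string dtype

# A DataFrame keeps one dtype per column and labels both axes.
df = pd.DataFrame(
    {"name": ["Ann", "Bob"], "age": [31, 45], "score": [88.5, 92.0]},
    index=["row_a", "row_b"],
)

# Label-based lookup: row label first, then column label.
age_of_bob = df.loc["row_b", "age"]
print(df.dtypes)
```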

4
Q

What is the difference between a Pandas Series and a DataFrame?

A

A Series is a one-dimensional array with index labels, while a DataFrame is a two-dimensional labeled data structure with columns of potentially different types, like a spreadsheet (Page 15).
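
As a quick illustration (the values and column names here are invented for the example), a Series has one axis and a DataFrame has two, and selecting a single column of a DataFrame gives back a Series:

```python
import pandas as pd

# One-dimensional: values plus an index of labels.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Two-dimensional: labeled rows and columns, each column with its own type.
df = pd.DataFrame({"city": ["Oslo", "Lima"], "population_m": [0.7, 10.0]})

# A single column of a DataFrame is itself a Series.
col = df["population_m"]

print(s.ndim, df.ndim)
print(type(col).__name__)
```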

5
Q

What does the value_counts() function do when applied to a Pandas Series?

A

It returns a Series whose index holds the unique values and whose values are their counts, sorted in descending order; NaN values are excluded by default (Page 16).
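
A short sketch with invented sample data; `dropna=False` is the parameter that makes missing values show up in the counts:

```python
import pandas as pd

colors = pd.Series(["red", "blue", "red", None, "red", "blue"])

# Default behaviour: most frequent value first, missing values ignored.
counts = colors.value_counts()

# Pass dropna=False to count missing values as well.
counts_with_nan = colors.value_counts(dropna=False)

print(counts)
print(counts_with_nan)
```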

6
Q

How does the pd.concat() function differ from pd.merge() in terms of functionality?

A

pd.concat() stacks DataFrames along rows or columns without requiring a shared key column, while pd.merge() performs SQL-like joins based on one or more common columns or on the indexes (Pages 29–30).
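
The contrast can be sketched as follows; the tables `q1`, `q2`, and `customers` are hypothetical sample data, not taken from the presentation:

```python
import pandas as pd

q1 = pd.DataFrame({"id": [1, 2], "sales": [100, 150]})
q2 = pd.DataFrame({"id": [3, 4], "sales": [200, 250]})

# concat: stack the two frames row-wise; no key column is consulted.
stacked = pd.concat([q1, q2], ignore_index=True)

customers = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})

# merge: SQL-style inner join on the shared "id" column.
# id 4 has no match in customers, so it is dropped.
joined = pd.merge(stacked, customers, on="id", how="inner")

print(stacked)
print(joined)
```

Changing `how` to `"left"`, `"right"`, or `"outer"` keeps unmatched rows from one or both sides instead of dropping them.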

7
Q

What does the dropna() function do when called without parameters on a DataFrame?

A

It returns a new DataFrame with every row that contains at least one NaN value removed; the original DataFrame is left unchanged (Page 26).
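
A small example with made-up values, showing that only the fully populated row survives and that the call does not modify the original:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

# With no arguments, any row holding at least one NaN is dropped.
cleaned = df.dropna()

print(cleaned)
print(len(df), len(cleaned))
```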

8
Q

What is the purpose of the set_index() and reset_index() methods in a DataFrame?

A

set_index() makes one or more columns the DataFrame’s index, while reset_index() moves the index back into a column and restores the default integer index; both return a new DataFrame by default (Page 22).
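
A round trip through both methods, sketched with an invented two-column table:

```python
import pandas as pd

df = pd.DataFrame({"code": ["x", "y"], "value": [1, 2]})

# Promote the "code" column to the index; it is no longer a regular column.
indexed = df.set_index("code")

# Move the index back into a column and return to the default 0..n-1 index.
restored = indexed.reset_index()

print(indexed)
print(restored)
```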

9
Q

What types of data sources can be used to create a Pandas DataFrame?

A

Lists, lists of lists, dictionaries, lists of tuples, NumPy arrays, and Series objects, as well as CSV files, Excel files, and SQL query results (Pages 17–18).
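
The sketch below builds the same small table from several of these sources. The data is invented, and the CSV is read from an in-memory string so the example needs no file on disk; Excel and SQL sources are left out because they need extra files or drivers (`pd.read_excel` and `pd.read_sql` are the usual entry points).

```python
import io

import numpy as np
import pandas as pd

cols = ["x", "y"]

from_dict = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
from_rows = pd.DataFrame([[1, 3], [2, 4]], columns=cols)      # list of lists
from_tuples = pd.DataFrame([(1, 3), (2, 4)], columns=cols)    # list of tuples
from_array = pd.DataFrame(np.array([[1, 3], [2, 4]]), columns=cols)
from_series = pd.DataFrame({"x": pd.Series([1, 2]), "y": pd.Series([3, 4])})

# CSV text held in memory stands in for a file path here.
csv_text = io.StringIO("x,y\n1,3\n2,4\n")
from_csv = pd.read_csv(csv_text)

frames = [from_dict, from_rows, from_tuples, from_array, from_series, from_csv]
for frame in frames:
    print(frame.values.tolist())
```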

10
Q

What information does the info() method provide about a DataFrame?

A

It provides a summary of the DataFrame, including column names, data types, non-null counts, and memory usage (Page 20).
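
Because info() prints its report rather than returning it, one way to inspect the text is to pass a buffer through its `buf` parameter. This is a sketch with invented data:

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["Ann", None], "age": [31, 45]})

# info() writes to stdout by default; buf redirects the report to a buffer.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

# The report lists the index range, each column with its non-null count
# and dtype, and an estimate of memory usage.
print(summary)
```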
