Lesson 2: Basic data exploration Flashcards

1
Q

P_____ is the primary tool data scientists use for exploring and manipulating data.

A

Pandas

2
Q

The most important part of the Pandas library is the D____F____.

A

DataFrame

A DataFrame holds the type of data you might think of as a table. This is similar to a sheet in Excel, or a table in a SQL database.
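To make the table analogy concrete, here is a minimal sketch that builds a small DataFrame by hand. The column names and values are invented for illustration and do not come from the lesson's dataset.

```python
import pandas as pd

# Hypothetical housing data; column names and values are made up.
df = pd.DataFrame(
    {
        "Rooms": [2, 3, 4],
        "Price": [350000, 520000, 710000],
    }
)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels, like headers in a spreadsheet
```

Each dictionary key becomes a column and each list position becomes a row, much like a sheet in Excel.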

3
Q

How would you load the data at the file path stored in “path_houses” into a DataFrame called “df_houses”? (Python names cannot contain hyphens, so underscores are used here.)

A

import pandas

df_houses = pandas.read_csv(path_houses)
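A runnable sketch of the same call follows. It assumes an in-memory text buffer in place of a real file so that it works without any file on disk; the CSV contents are invented for illustration. With a real file, `path_houses` would be a string such as a path to a `.csv` file.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the contents are made up.
csv_text = "Rooms,Price\n2,350000\n3,520000\n4,710000\n"

# pd.read_csv accepts either a file path or a file-like object.
path_houses = io.StringIO(csv_text)
df_houses = pd.read_csv(path_houses)

# Show the first few rows to confirm the data loaded as expected.
print(df_houses.head())
```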

4
Q

How can you get a summary of the data held in the “df_houses” DataFrame? (Python names cannot contain hyphens, so underscores are used here.)

A

df_houses.describe()
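As a rough illustration of what the summary contains, the sketch below calls `describe()` on a small hand-made DataFrame (the data is hypothetical). For numeric columns the result has one row per statistic: count, mean, std, min, the 25%, 50%, and 75% percentiles, and max.

```python
import pandas as pd

# Hypothetical data standing in for the lesson's housing dataset.
df_houses = pd.DataFrame(
    {"Rooms": [2, 3, 4], "Price": [350000, 520000, 710000]}
)

summary = df_houses.describe()
print(summary)

# The summary is itself a DataFrame, so individual statistics can be looked up.
print(summary.loc["mean", "Rooms"])
```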

5
Q

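What is standard deviation, and how is it calculated?

A

Standard deviation measures how spread out the values are around the mean. To calculate it:

Step 1: Find the mean.
Step 2: For each data point, find the square of its distance to the mean.
Step 3: Sum the values from Step 2.
Step 4: Divide by the number of data points (this gives the variance).
Step 5: Take the square root.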
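The five steps can be sketched in plain Python as follows. The function name `population_std` and the sample data are illustrative choices, not part of the lesson. Note that dividing by the number of data points gives the population standard deviation; the `std` row reported by pandas `describe()` divides by one less than the number of data points (the sample standard deviation), so the two can differ slightly.

```python
import math

def population_std(values):
    """Standard deviation computed step by step, dividing by len(values)."""
    mean = sum(values) / len(values)                  # Step 1: mean
    squared = [(x - mean) ** 2 for x in values]       # Step 2: squared distances
    total = sum(squared)                              # Step 3: sum them
    variance = total / len(values)                    # Step 4: divide by count
    return math.sqrt(variance)                        # Step 5: square root

# Made-up data: mean is 5, squared distances sum to 32, variance is 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))  # 2.0
```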
