Data 101 Flashcards

(11 cards)

1
Q

What is data preparation and cleaning?

A

The process of transforming raw data into a more structured, readable, and reliable format for analysis.

2
Q

What is the importance of handling missing data?

A

It ensures a dataset’s accuracy and reliability, preventing unreliable and non-representative analysis.
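
One common way to handle missing values is to fill them with the mean of the observed values. The sketch below is only an illustration: the function name `impute_mean`, the use of `None` as the missing-value marker, and the sample `ages` list are all assumptions, and mean imputation is just one of several possible strategies.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values.

    Hypothetical helper: None is assumed to mark a missing value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Illustrative data, not taken from any real dataset.
ages = [30, None, 40, 50, None]
filled = impute_mean(ages)

# Share of entries that were missing; a large share is a warning sign
# that any analysis of this column may not be representative.
missing_share = sum(v is None for v in ages) / len(ages)
```

Checking `missing_share` before imputing is a cheap safeguard: when most of a column is missing, filling it in hides the problem rather than fixing it.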

3
Q

What can result from a large volume of missing values in a dataset?

A

Unreliable and non-representative analysis.

4
Q

Why is error correction important in data science?

A

To ensure data is accurate, unbiased, and reliable.

5
Q

What types of errors are particularly concerning when correcting data?

A

Erroneous zeros and implausibly large values.

6
Q

What is the significance of dealing with outliers in data?

A

It ensures the data remains representative and prevents skewed results.
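
The card does not say how outliers are detected. As one possible approach, the sketch below uses the 1.5 × IQR rule (Tukey's fences), which is an assumption here and not the deck's stated method. The function name `split_outliers` and the sample readings are hypothetical.

```python
from statistics import quantiles

def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the k * IQR rule.

    Assumption: a point is an outlier if it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers

# Illustrative readings with one suspiciously large value.
readings = [10, 12, 11, 13, 12, 11, 95]
kept, outliers = split_outliers(readings)
```

Whether a flagged point should be dropped, capped, or kept is a judgment call that depends on the data; the rule only identifies candidates.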

7
Q

What is the role of standardization and normalization in data preparation?

A

To ensure all data is on the same scale for consistency and accurate analysis.
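
Two widely used ways to put values on a common scale are min-max normalization (rescaling to the range 0 to 1) and z-score standardization (rescaling to mean 0 and standard deviation 1). The sketch below shows both; the function names and the sample `heights_cm` list are assumptions made for illustration.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score_standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Illustrative data.
heights_cm = [150, 160, 170, 180, 190]
normalized = min_max_normalize(heights_cm)
standardized = z_score_standardize(heights_cm)
```

Min-max scaling is sensitive to extreme values because a single outlier sets the range, so it is usually applied after outliers have been dealt with.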

8
Q

What is the purpose of removing duplicates in data preparation?

A

To optimize storage space and preserve data and model accuracy.
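
A minimal sketch of duplicate removal, assuming each record is a dictionary with hashable values and that two records count as duplicates only when every field matches. The function name `drop_duplicates` and the sample rows are hypothetical.

```python
def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        # Sort the items so that key order in the dict does not matter.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Illustrative records; the second and third describe the same customer.
rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"name": "Grace", "id": 2},
    {"id": 3, "name": "Alan"},
]
unique_rows = drop_duplicates(rows)
```

Real datasets often contain near-duplicates (different spelling or casing of the same entity), which exact matching like this will not catch.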

9
Q

What does data munging or wrangling involve?

A

Converting data into a user-friendly format, such as a relational table or CSV file.
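
As a rough illustration of wrangling, the sketch below flattens nested records into rows and writes them out as CSV text. The record layout, the field names (`name`, `email`, `city`), and the helper names `flatten` and `to_csv` are all assumptions chosen for the example.

```python
import csv
import io

# Illustrative nested records, e.g. as they might arrive from a JSON source.
raw_records = [
    {"name": "Ada", "contact": {"email": "ada@example.com", "city": "London"}},
    {"name": "Grace", "contact": {"email": "grace@example.com"}},
]

def flatten(record):
    """Turn one nested record into a flat row; absent fields become ''."""
    contact = record.get("contact", {})
    return {
        "name": record["name"],
        "email": contact.get("email", ""),
        "city": contact.get("city", ""),
    }

def to_csv(records):
    """Serialize records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "email", "city"], lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()

csv_text = to_csv(raw_records)
```

The result is a table with one row per record and one column per field, which most analysis tools can load directly.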

10
Q

Fill in the blank: The key steps of preparing and cleaning data include handling missing data, error correction, dealing with outliers, standardization and normalization, removing duplicates, and _______.

A

data munging or wrangling

11
Q

Name the six main steps of data cleaning.

A

1. Handling missing data
2. Error correction
3. Dealing with outliers
4. Standardization and normalization
5. Removing duplicates
6. Data munging or wrangling
