Introduction to Data Science Part 1 Flashcards

1
Q

What is Data Science?

A

The practice of using data to understand phenomena and uncover insights, applying scientific tools such as programming and statistics.

2
Q

What are the key objectives of Data Science?

A

Extract knowledge from data, uncover insights, and make informed decisions.

3
Q

What are common terms associated with Data Science?

A

Big Data, Machine Learning, Artificial Intelligence, Data Mining, Predictive Analytics.

4
Q

What are the types of data in Data Science?

A

Qualitative (descriptive, categorical data) and quantitative (numeric, measurable values).
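
As a rough illustration of why the distinction matters, the sketch below summarizes each type differently: categories are counted, numbers are averaged. The records and the field names `color` and `height_cm` are made up for this example.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical survey records: "color" is qualitative, "height_cm" is quantitative.
records = [
    {"color": "blue", "height_cm": 170.0},
    {"color": "green", "height_cm": 165.0},
    {"color": "blue", "height_cm": 181.0},
    {"color": "red", "height_cm": 172.0},
]

colors = [r["color"] for r in records]
heights = [r["height_cm"] for r in records]

# Qualitative data is summarized by counting categories.
color_counts = Counter(colors)
most_common_color = color_counts.most_common(1)[0][0]

# Quantitative data supports arithmetic summaries.
average_height = mean(heights)
median_height = median(heights)

print(color_counts)
print(most_common_color, average_height, median_height)
```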

5
Q

What are the three main data formats?

A

Structured, Unstructured, and Semi-structured data.

6
Q

What are examples of Structured Data?

A

Relational databases, spreadsheets, and data tables.

7
Q

What are examples of Unstructured Data?

A

Images, videos, social media posts, and PDFs.

8
Q

What are examples of Semi-structured Data?

A

JSON, XML, and HTML documents.
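
To make the difference between semi-structured and structured data concrete, here is a minimal sketch that flattens JSON records with varying fields into fixed-column rows. The sample records, the `to_rows` helper, and the chosen columns are assumptions for illustration only.

```python
import json

# Hypothetical semi-structured input: records share some fields but not all.
raw = '''
[
  {"id": 1, "name": "Ada", "tags": ["admin", "dev"]},
  {"id": 2, "name": "Lin", "email": "lin@example.com"}
]
'''

def to_rows(json_text, columns):
    """Flatten a JSON array of objects into fixed-width rows (structured form)."""
    rows = []
    for obj in json.loads(json_text):
        # Missing fields become None so every row has the same columns.
        rows.append(tuple(obj.get(col) for col in columns))
    return rows

columns = ("id", "name", "email")
rows = to_rows(raw, columns)
for row in rows:
    print(row)
```

Note that the `tags` field is dropped here: forcing semi-structured records into a fixed schema loses whatever does not fit the chosen columns.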

9
Q

What are the major sources of data?

A

Web data, financial transactions, online trading, social networks, business records.

10
Q

What is Big Data?

A

Data whose volume, velocity, or variety makes it expensive to manage and difficult to extract value from with traditional tools.

11
Q

What are the 5 V’s of Big Data?

A

Volume, Velocity, Variety, Veracity, and Value.

12
Q

What does ‘Volume’ refer to in Big Data?

A

The amount of data being generated and stored.

13
Q

What does ‘Velocity’ refer to in Big Data?

A

The speed at which data is generated and must be processed and analyzed.

14
Q

What does ‘Variety’ refer to in Big Data?

A

Different types of data sources and formats.

15
Q

What does ‘Veracity’ refer to in Big Data?

A

Data quality and reliability.

16
Q

What does ‘Value’ refer to in Big Data?

A

The potential business benefits derived from analyzing data.

17
Q

What is Machine Learning?

A

A field of AI that enables systems to learn and improve from experience without being explicitly programmed for each task.

18
Q

What are the three main types of Machine Learning?

A

Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

19
Q

What is the goal of Supervised Learning?

A

To learn a mapping from inputs to outputs using labeled data.
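
One of the simplest instances of learning a mapping from labeled data is fitting a straight line by least squares. The sketch below is only an illustration: the `hours`/`scores` data and the function names are invented, and real projects would normally use a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: inputs are hours studied, labels are exam scores.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

slope, intercept = fit_line(hours, scores)

def predict(x):
    """Apply the learned mapping to a new, unlabeled input."""
    return slope * x + intercept

print(slope, intercept, predict(5.0))
```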

20
Q

What is the goal of Unsupervised Learning?

A

To find patterns or structure in data without labeled responses.
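
Clustering is a common example of finding structure without labels. The following is a minimal one-dimensional k-means sketch; the data points and starting centers are assumptions chosen to keep the example small.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D points, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# No labels are given; the algorithm discovers the two groups on its own.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```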

21
Q

What is Reinforcement Learning?

A

A type of learning where an agent interacts with an environment to maximize cumulative reward.
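
The agent-environment loop can be sketched with a multi-armed bandit, a deliberately simplified setting with no state and, here, fixed rewards. The reward values, `epsilon`, and the `run_bandit` name are assumptions for illustration; this is not a full reinforcement learning algorithm.

```python
import random

def run_bandit(rewards, steps=100, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with fixed (deterministic) rewards."""
    rng = random.Random(seed)
    n_arms = len(rewards)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's reward
    pulls = [0] * n_arms
    total_reward = 0.0

    for step in range(steps):
        if step < n_arms:
            arm = step  # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rewards[arm]  # the environment responds to the action
        pulls[arm] += 1
        # Incremental mean update of the estimate for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward
    return estimates, pulls, total_reward

estimates, pulls, total_reward = run_bandit(rewards=[0.2, 1.0, 0.5])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm, pulls, total_reward)
```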

22
Q

What is AI (Artificial Intelligence)?

A

The simulation of human intelligence processes by machines, including learning, reasoning, and self-correction.

23
Q

What is the difference between Data Science and Machine Learning?

A

Data Science is the broader discipline of extracting insights from data; Machine Learning is one of its tools, focused on learning models that make predictions.

24
Q

What are the main application areas of Data Science?

A

Industrial processes, business, text data, image data, and medical data applications.

25
Q

What are some industrial applications of Data Science?

A

Fault prediction, preventive maintenance, demand forecasting, inventory management, price optimization.

26
Q

What are some business applications of Data Science?

A

Market trend analysis, churn analysis, credit risk modeling.

27
Q

What are some text data applications of Data Science?

A

Sentiment Analysis, Topic Modeling, Conversational AI.

28
Q

What are some image data applications of Data Science?

A

Computer Vision, Machine Vision.

29
Q

What are some medical applications of Data Science?

A

Disease diagnosis, patient data analysis, medical imaging analysis.

30
Q

What is the CRISP-DM process?

A

The Cross-Industry Standard Process for Data Mining, a process model with six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.

31
Q

What are the six phases of CRISP-DM?

A

Business understanding, Data understanding, Data preparation, Modeling, Evaluation, Deployment.

32
Q

What is the TDSP (Team Data Science Process)?

A

A methodology developed by Microsoft for structuring data science projects.

33
Q

What are the key steps in a Data Science project?

A

Problem definition, Data Collection, Data Processing, Model Building, Model Evaluation, Deployment.
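
The steps can be sketched end to end on a toy problem. Everything below is hypothetical: the price data, the function names, and the choice of a mean-value baseline as the "model" are assumptions made to keep the example short.

```python
from statistics import mean

# 1. Problem definition (hypothetical): predict a house price from past sales.

# 2. Data collection: a stand-in for reading from a database or file.
def collect():
    return [120.0, None, 130.0, 125.0, 135.0, None, 140.0, 150.0]

# 3. Data processing: drop missing values.
def process(raw):
    return [x for x in raw if x is not None]

# 4. Model building: a baseline that always predicts the training mean.
def build_model(train):
    avg = mean(train)
    return lambda: avg

# 5. Model evaluation: mean absolute error on held-out data.
def evaluate(model, test):
    return mean(abs(model() - y) for y in test)

clean = process(collect())
train, test = clean[:4], clean[4:]
model = build_model(train)
mae = evaluate(model, test)

# 6. Deployment would wrap `model` behind a service or batch job (not shown).
print(f"baseline prediction={model():.1f}, MAE={mae:.1f}")
```
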
34
Q

What key questions does Data Science aim to answer?

A

What is the problem? What data is needed? Where does data come from? How should data be processed? How should models be evaluated?

35
Q

What are common ways to visualize data for insights?

A

Charts, graphs, heatmaps, scatter plots, histograms.
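
As a small illustration of what a histogram computes, the sketch below bins values and prints a text bar per bin. The `ages` data and the `histogram` helper are assumptions; in practice a plotting library such as Matplotlib would draw the chart.

```python
def histogram(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = {}
    for v in values:
        start = int(v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical sample of ages.
ages = [21, 25, 29, 33, 34, 38, 41, 47, 52, 58, 59]
bin_width = 10
bins = histogram(ages, bin_width)

# One row per bin, with one '#' per observation.
for start, count in bins.items():
    print(f"{start}-{start + bin_width - 1} | {'#' * count}")
```
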
36
Q

What are key qualities of a good Data Scientist?

A

Inquisitive, knowledgeable, proficient in machine learning, statistics, and probability, skilled in coding, and strong in domain knowledge.

37
Q

What coding skills should a Data Scientist have?

A

Python, R, and SQL, plus data processing and modeling libraries such as pandas, NumPy, and scikit-learn.
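
Python and SQL are often used together. The sketch below runs a SQL aggregation from Python using the standard-library `sqlite3` module; the `sales` table and its contents are made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical `sales` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 100.0), ("south", 250.0), ("north", 50.0), ("south", 150.0)],
)

# SQL does the grouping and aggregation; Python receives the result rows.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```
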
38
Q

Why is domain knowledge important for a Data Scientist?

A

It helps in interpreting data correctly and drawing meaningful insights relevant to the industry.

39
Q

What are some emerging research topics in Data Science?

A

Big Data Modeling, AI Ethics, Fairness in Machine Learning, Explainable AI, Edge Computing.

40
Q

What are some challenges in Data Science?

A

Data privacy concerns, data bias, computational complexity, data storage and management.

41
Q

What is the impact of Data Science in healthcare?

A

Predicting diseases, personalizing treatments, improving patient care, optimizing hospital operations.