Introduction to Data Science Part 1 Flashcards
What is Data Science?
The process of using data to understand phenomena and uncover insights, using scientific tools such as programming and statistics.
What are the key objectives of Data Science?
Extract knowledge from data, uncover insights, and make informed decisions.
What are common terms associated with Data Science?
Big Data, Machine Learning, Artificial Intelligence, Data Mining, Predictive Analytics.
What are the types of data in Data Science?
Qualitative (descriptive data) and Quantitative (measurable values).
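A minimal sketch of the distinction, assuming a tiny made-up table: the column names `color` and `height_cm` and the helper `is_quantitative` are hypothetical, chosen only for illustration. Quantitative columns support arithmetic summaries such as a mean, while qualitative columns are usually summarized by counting categories.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey rows; column names are invented for illustration.
rows = [
    {"color": "red", "height_cm": 170.0},
    {"color": "blue", "height_cm": 180.0},
    {"color": "red", "height_cm": 175.0},
]

def is_quantitative(values):
    """Treat a column as quantitative when every value is an int or float."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )

# Turn the list of rows into one list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}
kinds = {
    name: "quantitative" if is_quantitative(vals) else "qualitative"
    for name, vals in columns.items()
}

# Quantitative data supports arithmetic; qualitative data supports counts.
average_height = mean(columns["height_cm"])
color_counts = Counter(columns["color"])
```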
What are the three main data formats?
Structured, Unstructured, and Semi-structured data.
What are examples of Structured Data?
Relational databases, spreadsheets, and data tables.
What are examples of Unstructured Data?
Images, videos, social media posts, and PDFs.
What are examples of Semi-structured Data?
JSON, XML, and HTML documents.
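To illustrate the semi-structured case, here is a small sketch that parses a JSON document with Python's standard `json` module. The records and field names (`name`, `tags`, `address`) are invented for illustration. The point is that each record labels its own fields, but the records do not have to share one fixed schema the way rows in a relational table do.

```python
import json

# Hypothetical records: each carries its own field names, but the fields differ.
raw = """
[
  {"name": "Ada", "tags": ["math", "computing"]},
  {"name": "Grace", "address": {"city": "Arlington"}}
]
"""

records = json.loads(raw)

# The fields each record actually has.
fields_per_record = [sorted(r.keys()) for r in records]

# Fields present in every record versus fields present in only some.
common = set(records[0]).intersection(*(set(r) for r in records[1:]))
all_fields = set().union(*(set(r) for r in records))
optional = all_fields - common
```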
What are the major sources of data?
Web data, financial transactions, online trading, social networks, and business records.
What is Big Data?
Data sets so large, fast-moving, or varied that they are expensive to manage and difficult to extract value from with conventional tools.
What are the 5 V’s of Big Data?
Volume, Velocity, Variety, Veracity, and Value.
What does ‘Volume’ refer to in Big Data?
The amount of data being generated and stored.
What does ‘Velocity’ refer to in Big Data?
The speed at which data is generated, processed, and analyzed.
What does ‘Variety’ refer to in Big Data?
Different types of data sources and formats.
What does ‘Veracity’ refer to in Big Data?
Data quality and reliability.
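One way to make data quality concrete is to count how many values fail basic checks. The sketch below assumes a made-up list of sensor readings in which `None` marks a missing value, and an assumed plausible range (`LOW`, `HIGH`); both are hypothetical and would depend on the real data source.

```python
# Hypothetical sensor readings; None marks a missing value.
readings = [21.5, None, 22.0, -999.0, 21.8, None]

# Assumed plausible range for this made-up sensor, in degrees Celsius.
LOW, HIGH = -50.0, 60.0

missing = sum(1 for r in readings if r is None)
out_of_range = sum(
    1 for r in readings if r is not None and not (LOW <= r <= HIGH)
)
valid = [r for r in readings if r is not None and LOW <= r <= HIGH]

# Share of readings that pass both checks.
valid_fraction = len(valid) / len(readings)
```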
What does ‘Value’ refer to in Big Data?
The potential business benefits derived from analyzing data.
What is Machine Learning?
A field of AI that enables systems to learn and improve from experience without being explicitly programmed.
What are the three main types of Machine Learning?
Supervised Learning, Unsupervised Learning, and Reinforcement Learning.
What is the goal of Supervised Learning?
To learn a mapping from inputs to outputs using labeled data.
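As a small illustration, the sketch below fits a straight line to labeled examples with ordinary least squares, one of many possible supervised methods. The data is made up (generated from y = 2x + 1), and the names `fit_line` and `predict` are hypothetical helpers.

```python
# Labeled training data: inputs xs with known outputs ys (from y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)

def predict(x):
    """Apply the learned mapping to a new, unlabeled input."""
    return slope * x + intercept

prediction = predict(4.0)
```

The labels in `ys` are what make this supervised: the fit is judged by how closely the line reproduces known outputs.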
What is the goal of Unsupervised Learning?
To find patterns or structure in data without labeled responses.
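For contrast with the supervised case, here is a sketch of k-means clustering on one-dimensional data, one common unsupervised technique. The points and the starting centers are arbitrary choices for illustration, and `kmeans_1d` is a hypothetical helper. No labels are supplied; the grouping comes only from distances between points.

```python
# Unlabeled 1-D points; no target values are given.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]

def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Initial centers are an arbitrary choice for this sketch.
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```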
What is Reinforcement Learning?
A type of learning where an agent interacts with an environment to maximize cumulative reward.
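The agent and environment loop can be sketched with a two-armed bandit and an epsilon-greedy agent. This is only one simple setting, not a general reinforcement learning algorithm. The reward table `REWARDS`, the function names, and the parameter values are all assumptions made for illustration.

```python
import random

# Hypothetical two-armed bandit: arm 1 always pays more than arm 0.
REWARDS = [0.0, 1.0]

def pull(arm):
    """Environment: return the reward for the chosen action."""
    return REWARDS[arm]

def run_agent(steps=100, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0, 0.0]  # running average reward per arm
    counts = [0, 0]
    total_reward = 0.0
    for t in range(steps):
        if t < len(REWARDS):
            arm = t  # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(len(REWARDS))  # explore a random arm
        else:
            # Exploit the arm with the highest estimated reward.
            arm = max(range(len(REWARDS)), key=lambda a: estimates[a])
        reward = pull(arm)
        counts[arm] += 1
        # Incremental mean update of the estimate for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = run_agent()
best_arm = max(range(len(REWARDS)), key=lambda a: estimates[a])
```

The agent is never told which arm is better. It learns that from the rewards it receives, and `total_reward` is the cumulative reward it is trying to maximize.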
What is AI (Artificial Intelligence)?
The simulation of human intelligence processes by machines, including learning, reasoning, and self-correction.
What is the difference between Data Science and Machine Learning?
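Data Science is the broader discipline and focuses on producing insights from data, while Machine Learning is a set of techniques, often used within Data Science, that focuses on producing predictions.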
What are the main application areas of Data Science?
Industrial processes, business, text data, image data, and medical data applications.