Module 6 Data Science Flashcards
(50 cards)
What is the fundamental distinction between different types of data?
Data is either quantitative (numerical) or qualitative (descriptive). Telling the two apart is crucial for effective data science.
Define quantitative data.
Numerical data that can be measured and compared using mathematical methods.
Give examples of quantitative data.
- Age
- Height
- Weight
- Salary
Define qualitative data.
Descriptive, non-numerical data that is observed and recorded rather than measured using mathematical methods.
Give examples of qualitative data.
- Opinions
- Emotions
- Perceptions
What is the primary use of quantitative data?
To draw conclusions and make predictions based on numerical analysis.
What is the primary use of qualitative data?
To understand experiences and perspectives.
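Study note: the quantitative/qualitative split on the cards above can be sketched in a few lines of Python. This is only an illustration. The record, its field names, and the rule "numbers are quantitative, everything else is qualitative" are assumptions made for the example, not part of the module.

```python
from statistics import mean

# Hypothetical survey record; field names and values are made up.
record = {
    "age": 34,
    "height_cm": 172.5,
    "salary": 52000,
    "opinion": "The service felt rushed",
}

def is_quantitative(value):
    """Assumed rule: ints and floats (but not booleans) count as quantitative."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# Split the record into numerical and descriptive fields.
quantitative = {k: v for k, v in record.items() if is_quantitative(v)}
qualitative = {k: v for k, v in record.items() if not is_quantitative(v)}

# Quantitative data supports numerical analysis, such as an average.
ages = [30, 40, 50]  # made-up ages
average_age = mean(ages)

print("Quantitative fields:", sorted(quantitative))
print("Qualitative fields:", sorted(qualitative))
print("Average age:", average_age)
```

The qualitative field is kept as text, because its value lies in the experience it describes rather than in a number.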
What are the two categories of data collection methods?
- Active
- Passive
What is Manual Active data collection?
Involves actively engaging with participants to collect data, such as surveys and interviews.
What is Manual Passive data collection?
Involves collecting data without actively engaging with participants, such as observation.
What is Computerised Active data collection?
Using technology to actively collect data from participants or customers.
What is Computerised Passive data collection?
Collecting data automatically through technology, without engaging with participants.
Define primary data.
Data collected by the researcher themselves.
Give examples of primary data.
- Diaries
- Original documents
- Government documents
Define secondary data.
Data obtained from existing sources.
Give examples of secondary data.
- Journal articles
- Textbooks
- Encyclopedia websites
What are the four criteria for evaluating data quality?
- Relevance
- Accuracy
- Validity
- Reliability
What does relevance refer to in data quality?
The extent to which the data is directly related to the research question.
What does accuracy refer to in data quality?
The degree to which the data reflects the true values of the variables being measured.
What does validity refer to in data quality?
The degree to which the data accurately measures what it is supposed to measure.
What does reliability refer to in data quality?
The consistency of the data over time and across different sources.
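Study note: accuracy and reliability are easy to confuse, so here is a minimal Python sketch of the difference. It assumes a known true value and two hypothetical scales with made-up readings. Using the distance of the average from the truth for accuracy, and the standard deviation for reliability, is one common way to put numbers on these ideas, not a definition taken from the module.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_KG = 70.0  # assumed known reference value

# Made-up repeated readings from two hypothetical scales.
scale_a = [70.1, 69.9, 70.0, 70.0]  # close to the truth and consistent
scale_b = [72.0, 72.1, 71.9, 72.0]  # consistent, but about 2 kg too high

def accuracy_error(readings, truth):
    """How far the average reading is from the true value (lower is more accurate)."""
    return abs(mean(readings) - truth)

def spread(readings):
    """Standard deviation of the readings (lower is more reliable)."""
    return pstdev(readings)

for name, readings in [("A", scale_a), ("B", scale_b)]:
    print(
        f"Scale {name}: error={accuracy_error(readings, TRUE_WEIGHT_KG):.2f} kg, "
        f"spread={spread(readings):.2f} kg"
    )
```

In this example scale B is reliable (its readings agree with each other) but not accurate (they are all about 2 kg off), which shows that the two criteria have to be checked separately.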
What are errors in data collection?
Mistakes that make recorded values differ from the true values. They can arise during collection, entry, or processing.
What is uncertainty in data?
Ambiguity or a lack of precision in the data.
What are limitations in data?
Inherent constraints of the data, such as sample size or coverage.
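Study note: sample size is named above as a limitation, and it is closely tied to uncertainty. The Python sketch below uses the standard error of the mean (sample standard deviation divided by the square root of the sample size) as one possible measure of that uncertainty. The choice of measure and the height values are assumptions for illustration only.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Made-up height measurements in centimetres.
small_sample = [160, 170, 180]
large_sample = [160, 170, 180] * 4  # same values repeated, so n = 12

print(f"n={len(small_sample)}: standard error = {standard_error(small_sample):.2f}")
print(f"n={len(large_sample)}: standard error = {standard_error(large_sample):.2f}")
```

The larger sample gives a smaller standard error, so an estimate based on only a few observations carries more uncertainty.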