DS Foundations Part 3 Flashcards
(25 cards)
What is data storytelling?
Combining data, visuals, and narrative to explain insights effectively.
Why is it important to tailor your data presentation to your audience?
Different audiences require different levels of detail and framing.
What is the difference between exploratory and explanatory analysis?
Exploratory is for discovery; explanatory is for communicating findings.
What are common pitfalls in presenting data?
Cherry-picking data, misleading axes (such as truncated or inconsistent scales), and unclear labels.
What makes a data story effective?
Clarity, relevance, emotional engagement, and actionability.
What is a metric?
A quantifiable measure used to assess performance or behavior.
What is the difference between a metric and a KPI?
A KPI is a metric tied directly to a strategic goal. All KPIs are metrics, but not all metrics are KPIs.
Why is it risky to optimize a single metric?
It can lead to unintended consequences and hide tradeoffs with other goals.
What is Goodhart’s Law?
When a measure becomes a target, it ceases to be a good measure.
What are the main steps in EDA?
Understand data types, check distributions, handle missing values and outliers, explore relationships.
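A minimal sketch of the first few EDA steps for a single column, using only the Python standard library. The function name `profile_column` and the sample values are hypothetical, and missing values are assumed to be encoded as `None`. Exploring relationships between columns is not covered here.

```python
import statistics
from collections import Counter


def profile_column(values):
    """Summarize one column: value types, missing count, and basic spread."""
    # Assumption: missing values are represented as None.
    present = [v for v in values if v is not None]
    summary = {
        "types": Counter(type(v).__name__ for v in present),
        "missing": len(values) - len(present),
    }
    # Keep real numbers only; bool is a subclass of int, so exclude it explicitly.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric:
        summary["min"] = min(numeric)
        summary["median"] = statistics.median(numeric)
        summary["max"] = max(numeric)
    return summary


# Hypothetical column with two missing entries.
profile = profile_column([3, None, 7, 5, None])
print(profile["missing"], profile["min"], profile["median"], profile["max"])
```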
What is the role of data cleaning in EDA?
To prepare reliable inputs for meaningful exploration.
Why visualize before modeling?
To detect patterns, relationships, and assumptions that affect modeling.
What is feature scaling?
Rescaling features to a comparable scale so that features with large units do not dominate the model.
What is normalization?
Transforming data to fit within a fixed range, typically [0,1].
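One common way to do this is min-max scaling. The sketch below assumes that approach. The function name `min_max_normalize`, the sample data, and the choice to return zeros for a constant column are all illustrative assumptions.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column has zero range, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 30, 42, 66]  # hypothetical sample data
print(min_max_normalize(ages))  # [0.0, 0.25, 0.5, 1.0]
```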
What is standardization?
Rescaling data to have a mean of 0 and standard deviation of 1.
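A minimal sketch of standardization (z-scoring) with the standard library. The function name `standardize` and the sample scores are hypothetical. The population standard deviation is used here as an assumption. Some tools use the sample standard deviation instead.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / std for v in values]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, population std 2
z_scores = standardize(scores)
print(z_scores)
```

After the transform, the z-scores have a mean of 0 and a standard deviation of 1, up to floating-point rounding.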
Why is log transformation useful?
To reduce skew and compress the range of positively skewed data. Inputs must be positive.
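A short sketch of the effect on a right-skewed sample. The income figures are made up for illustration. `math.log1p` (which computes log(1 + x)) is shown as one common option when the data contains zeros.

```python
import math

incomes = [1_000, 10_000, 100_000, 1_000_000]  # hypothetical, strongly right-skewed

# Base-10 logs turn multiplicative gaps into equal additive steps.
logged = [math.log10(x) for x in incomes]

raw_range = max(incomes) - min(incomes)
log_range = max(logged) - min(logged)
print(raw_range, round(log_range, 6))

# log(0) is undefined, so counts that include zero are often shifted by one first.
visits = [0, 1, 9, 99]  # hypothetical counts
logged_visits = [math.log1p(x) for x in visits]
```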
How can you detect outliers?
Using the IQR rule, z-scores, or visual inspection (boxplots, scatterplots).
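A sketch of the two numeric approaches using the standard library. The function names, the sample data, and the default cutoffs (1.5 times the IQR, and a z-score of 3) are conventional choices assumed here, not fixed rules.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) / std > threshold]


data = [10, 12, 11, 13, 12, 11, 10, 12, 95]  # hypothetical, with one extreme value

print(iqr_outliers(data))                    # [95]
# With only 9 points, no z-score can exceed sqrt(8), about 2.83, so a cutoff of 3
# can never fire here. A lower cutoff is used for this small sample.
print(zscore_outliers(data, threshold=2.5))  # [95]
```

The extreme value inflates the standard deviation it is measured against, which is one reason the IQR rule is often preferred for small or heavily skewed samples.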
When should you keep outliers?
If they reflect real phenomena and are not errors.
When should you remove outliers?
If they result from data entry errors or measurement artifacts.
What is robustness in data analysis?
The extent to which results hold under different assumptions or inputs.
What is sensitivity analysis?
Testing how results change with variations in assumptions or parameters.
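A minimal one-at-a-time sketch: hold a baseline fixed, vary a single input, and record the result. The revenue model, its parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
def projected_revenue(visitors, conversion_rate, avg_order_value):
    """Toy model (assumption): revenue = visitors * conversion rate * order value."""
    return visitors * conversion_rate * avg_order_value


def sensitivity(model, baseline, param, candidates):
    """Re-run the model with one parameter varied and the others held at baseline."""
    results = []
    for value in candidates:
        params = {**baseline, param: value}
        results.append((value, model(**params)))
    return results


baseline = {"visitors": 10_000, "conversion_rate": 0.02, "avg_order_value": 50.0}
runs = sensitivity(projected_revenue, baseline, "conversion_rate", [0.01, 0.02, 0.03])

for rate, revenue in runs:
    print(f"conversion_rate={rate:.2f} -> revenue={revenue:,.0f}")
```

If small changes in one input swing the result widely, that input deserves the most scrutiny before the conclusion is trusted.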
What is a sanity check?
A simple check to ensure results make sense before deeper analysis.
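A sketch of what such a check might look like in code. The field names (`visitors`, `orders`, `conversion_rate`) and the specific rules are hypothetical. Real checks depend on what is known to be impossible in the data at hand.

```python
def sanity_check(rows):
    """Return a list of human-readable problems; an empty list means no red flags."""
    problems = []
    if not rows:
        problems.append("no rows to analyze")
    for i, row in enumerate(rows):
        if row["visitors"] < 0 or row["orders"] < 0:
            problems.append(f"row {i}: negative count")
        if row["orders"] > row["visitors"]:
            problems.append(f"row {i}: more orders than visitors")
        if not 0 <= row["conversion_rate"] <= 1:
            problems.append(f"row {i}: conversion rate outside [0, 1]")
    return problems


rows = [
    {"visitors": 200, "orders": 10, "conversion_rate": 0.05},
    {"visitors": 50, "orders": 80, "conversion_rate": 1.6},  # deliberately implausible
]
for problem in sanity_check(rows):
    print(problem)
```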
What is data literacy?
The ability to read, understand, and communicate data effectively.
Why does organizational data maturity matter?
It affects how well data can be used for decisions and innovation.