CRO Principles Flashcards
(32 cards)
What is a strong CRO Hypothesis? (Hint: Structure)
A testable statement predicting an outcome: Changing [Element X] into [Variation Y] for [Audience Segment Z] will result in [Impact on Metric K] because [Rationale based on Data/Insight].
What is the primary purpose and interpretation of an A/A Test?
Purpose: Validate the testing tool setup and methodology, and understand inherent variance (noise). Interpretation: Run identical versions; expect non-significant results for the primary metric approx. 95% of the time (at 95% confidence). Sample Ratio Mismatch (SRM) should also be checked.
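A minimal simulation sketch (assuming Python with NumPy and SciPy; the traffic volume and 5% base rate are illustrative) showing that comparing two identical variants produces p < 0.05 roughly 5% of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
TRUE_RATE, N, RUNS = 0.05, 10_000, 2_000  # identical variants; illustrative sizes

false_positives = 0
for _ in range(RUNS):
    a = rng.binomial(N, TRUE_RATE)  # conversions in "A"
    b = rng.binomial(N, TRUE_RATE)  # conversions in the identical "A" copy
    # Two-proportion comparison via a 2x2 chi-square (equivalent to a z-test here)
    _, p, _, _ = stats.chi2_contingency([[a, N - a], [b, N - b]], correction=False)
    false_positives += p < 0.05

print(f"False positive rate: {false_positives / RUNS:.3f}")  # expect ~0.05
```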
What is the core difference/advantage of a Bayesian approach in A/B testing?
Calculates full probability distributions for metrics, yielding intuitive results such as P(B>A) (probability B is better than A) and Expected Loss. Allows potentially faster decisions (with caution) compared to fixed-horizon frequentist tests.
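A minimal Beta-Binomial sketch of these two quantities (assuming Python with NumPy; the conversion counts and flat Beta(1, 1) priors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative observed data: conversions / visitors per variation
conv_a, n_a = 480, 10_000
conv_b, n_b = 530, 10_000

# Beta(1, 1) prior + binomial likelihood -> Beta posterior per variation
post_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=100_000)
post_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=100_000)

prob_b_better = (post_b > post_a).mean()                 # P(B > A)
expected_loss_b = np.maximum(post_a - post_b, 0).mean()  # expected loss if we ship B

print(f"P(B > A) = {prob_b_better:.3f}, expected loss (B) = {expected_loss_b:.5f}")
```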
What is Sequential Testing in experimentation? (Concept & Benefit)
A method allowing continuous analysis during a test, using statistical stopping boundaries (for significance or futility) while controlling Type I/II error rates. Benefit: Can significantly reduce average test duration vs. fixed-horizon tests.
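A rough sketch of interim looks against a constant Pocock-style boundary (assuming Python with NumPy/SciPy; the five equally spaced looks, the ≈0.0158 nominal level, and the true rates are illustrative assumptions, not a production-grade design):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N_PER_LOOK, LOOKS = 2_000, 5     # 5 equally spaced interim analyses
POCOCK_ALPHA = 0.0158            # Pocock nominal level for 5 looks, overall alpha ~0.05
RATE_A, RATE_B = 0.050, 0.060    # illustrative true rates (a real uplift exists)

conv = np.zeros(2, dtype=int)
n = np.zeros(2, dtype=int)
for look in range(1, LOOKS + 1):
    conv += rng.binomial(N_PER_LOOK, [RATE_A, RATE_B])
    n += N_PER_LOOK
    # Pooled two-proportion z-test at this interim look
    p_pool = conv.sum() / n.sum()
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n[0] + 1 / n[1]))
    z = (conv[1] / n[1] - conv[0] / n[0]) / se
    p = 2 * stats.norm.sf(abs(z))
    if p < POCOCK_ALPHA:         # crossing the boundary -> stop early
        print(f"Stop at look {look}: p = {p:.4f}")
        break
else:
    print("No boundary crossed; stop at the horizon (or earlier for futility)")
```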
What are Multi-armed Bandit algorithms in CRO? (Concept & Use Case)
Algorithms that dynamically allocate more traffic to better-performing variations during the experiment, balancing learning (explore) against maximizing immediate return (exploit). Use Case: Often used for short-term optimizations (e.g., headlines) or personalization.
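A minimal sketch of Thompson sampling, one common bandit algorithm (assuming Python with NumPy; the three arms and their hidden rates are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
TRUE_RATES = [0.04, 0.05, 0.065]  # hidden true conversion rates (illustrative)
successes = np.ones(3)            # Beta(1, 1) prior per arm
failures = np.ones(3)

for _ in range(20_000):
    # Thompson sampling: draw one sample per posterior, play the best draw
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))
    reward = rng.random() < TRUE_RATES[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

plays = successes + failures - 2
print("Traffic share per arm:", np.round(plays / plays.sum(), 3))  # best arm dominates
```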
What is Statistical Significance (p-value)?
The probability of observing a difference between variations at least as large as the one measured, purely by random chance, assuming the null hypothesis (no real difference) is true. Threshold (alpha) is often 0.05.
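A minimal sketch of a two-sided, pooled two-proportion z-test (assuming Python with NumPy/SciPy; the counts are illustrative):

```python
import numpy as np
from scipy import stats

# Illustrative counts: conversions and visitors per variation
conv_a, n_a = 200, 5_000   # control: 4.0%
conv_b, n_b = 245, 5_000   # variant: 4.9%

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.4f}")  # compare against alpha = 0.05
```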
What is Confidence Level?
Represents the level of certainty that a test result isn’t due to random chance. If you set a 95% confidence level, you accept a 5% risk (alpha) of concluding there’s a difference when there really isn’t one (a false positive).
What is a Confidence Interval?
An estimated range of values (e.g., for uplift or conversion rate), calculated from sample data, that is likely to contain the true population parameter at the specified confidence level. Indicates the precision of the estimate.
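A minimal sketch of a normal-approximation 95% CI for the absolute uplift, reusing illustrative counts like those in the p-value sketch above (assuming Python with NumPy/SciPy):

```python
import numpy as np
from scipy import stats

conv_a, n_a = 200, 5_000   # control
conv_b, n_b = 245, 5_000   # variant
p_a, p_b = conv_a / n_a, conv_b / n_b

diff = p_b - p_a
se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)  # unpooled SE
z = stats.norm.ppf(0.975)                                     # 95% confidence
lo, hi = diff - z * se, diff + z * se

print(f"Absolute uplift: {diff:.4f}, 95% CI: [{lo:.4f}, {hi:.4f}]")
```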
What determines required Sample Size in testing?
Key factors include: Baseline Conversion Rate (BCR), Minimum Detectable Effect (MDE), desired Statistical Power (1 − beta), and Significance Level (alpha). Variance also matters for continuous metrics.
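A minimal sketch of a standard normal-approximation sample-size formula for two proportions (assuming Python with SciPy; the function name and the 5%/10% inputs are illustrative):

```python
from scipy import stats

def sample_size_per_variation(bcr, mde_rel, alpha=0.05, power=0.80):
    """Approximate n per variation for a two-sided two-proportion test."""
    p1 = bcr
    p2 = bcr * (1 + mde_rel)              # MDE expressed as a relative uplift
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2

# Illustrative: 5% baseline, detect a 10% relative uplift at 80% power
print(f"{sample_size_per_variation(0.05, 0.10):,.0f} users per variation")  # ~31,000
```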
What is Statistical Power? (Definition & Implication)
The probability (1 - beta) of correctly rejecting the null hypothesis when it is false (i.e. detecting a real effect if it exists). Implication: Low power increases the risk of Type II errors (false negatives). Typically set at 80%+.
What is Minimum Detectable Effect (MDE)?
The smallest effect size (e.g., relative uplift) that a test is designed to reliably detect at the specified Power and Significance levels. A crucial input for sample size calculation, chosen based on business needs.
What is a Type I Error (False Positive)? (Definition & Risk Control)
Incorrectly rejecting a true null hypothesis (claiming a difference exists when it doesn’t). Risk is controlled by the significance level (alpha, e.g., 0.05).
What is a Type II Error (False Negative)? (Definition & Risk Control)
Failing to reject a false null hypothesis (missing a real difference when it exists). Risk (beta) is controlled by Statistical Power (Power = 1 - beta).
What is Regression to the Mean & its implication in testing?
Statistical tendency for extreme results on initial measurements to be closer to the average on subsequent measurements. Implication: Distrust unusually large effects seen early in a test; wait for sufficient data.
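A minimal simulation sketch of the effect (assuming Python with NumPy; all 100 “variants” share the same true 5% rate, so the early winner’s edge is pure noise):

```python
import numpy as np

rng = np.random.default_rng(3)
TRUE_RATE, VARIANTS = 0.05, 100        # every variant is actually identical

early = rng.binomial(500, TRUE_RATE, VARIANTS) / 500    # first 500 users each
winner = int(np.argmax(early))                          # looks like a big uplift
later = rng.binomial(20_000, TRUE_RATE) / 20_000        # same variant, more data

print(f"Early 'winner' rate: {early[winner]:.3f}")  # inflated by chance
print(f"Same variant later:  {later:.3f}")          # reverts toward 0.05
```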
What is the ‘Multiple Comparisons Problem’ in experiment analysis?
Analyzing multiple metrics or segments increases the overall probability of making at least one Type I error (false positive) purely by chance. Requires adjustments (e.g., Bonferroni, FDR) or pre-specification of hypotheses.
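A minimal sketch of applying both corrections (assuming Python with statsmodels; the raw p-values are illustrative):

```python
from statsmodels.stats.multitest import multipletests

# Illustrative raw p-values from analyzing several metrics/segments
raw_p = [0.004, 0.012, 0.028, 0.220, 0.650]

reject_bonf, p_bonf, _, _ = multipletests(raw_p, alpha=0.05, method="bonferroni")
reject_fdr, p_fdr, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")

print("Bonferroni rejects:", reject_bonf)  # conservative: controls family-wise error
print("FDR (BH) rejects:  ", reject_fdr)   # less strict: controls false discovery rate
```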
What is Simpson’s Paradox & its risk in CRO?
A trend appears in different groups of data but disappears or reverses when the groups are combined. Risk: Aggregate A/B test results can mislead if segments with different baseline rates are disproportionately represented across variations.
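A minimal worked example (plain Python; the segment counts are constructed so B wins within every segment yet loses in aggregate because it received more low-converting mobile traffic):

```python
# Illustrative counts: (conversions, visitors) per variation within each segment
data = {
    "mobile":  {"A": (20, 400),   "B": (96, 1600)},  # A 5.0% vs B 6.0%: B wins
    "desktop": {"A": (160, 1600), "B": (44, 400)},   # A 10.0% vs B 11.0%: B wins
}

totals = {"A": [0, 0], "B": [0, 0]}
for segment, arms in data.items():
    for arm, (conv, n) in arms.items():
        print(f"{segment:8s} {arm}: {conv / n:.1%}")
        totals[arm][0] += conv
        totals[arm][1] += n

# Aggregated, the unbalanced device mix reverses the result: A 9.0% vs B 7.0%
for arm, (conv, n) in totals.items():
    print(f"overall  {arm}: {conv / n:.1%}")
```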
What are Novelty & Learning Effects in testing? (Definition & Mitigation)
Novelty: The initial user reaction (positive or negative) to the change itself. Learning: The time users take to adapt to a change. Mitigation: Run tests long enough for effects to stabilize; segment by user tenure; monitor metrics over time.
What is Twyman’s Law in data analysis?
“Any figure that looks interesting or different is usually wrong.” Implication: Be highly skeptical of surprising or outlier data points/results; rigorously investigate potential errors in tracking setup or analysis.
What is the challenge with Network Effects / Interference in testing?
Occurs when one user’s experience affects another’s (e.g., social platforms, marketplaces), violating the A/B test assumption of independent observations (SUTVA). Requires alternative designs such as cluster or switchback randomization.
What’s a key consideration for testing Non-Binary Metrics? (e.g., AOV, RPU)
Metrics like AOV or RPU are often not normally distributed. Requires appropriate statistical tests (e.g., t-test variants if assumptions are met, or non-parametric tests like Mann-Whitney U) and careful handling of outliers and variance.
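A minimal sketch comparing a non-parametric and a parametric test on a skewed, zero-heavy revenue metric (assuming Python with NumPy/SciPy; the simulated distributions are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Illustrative revenue-per-user samples: heavy-tailed, mostly zeros, not normal
rev_a = rng.lognormal(3.0, 1.0, 4_000) * (rng.random(4_000) < 0.05)
rev_b = rng.lognormal(3.1, 1.0, 4_000) * (rng.random(4_000) < 0.05)

u_stat, p_u = stats.mannwhitneyu(rev_a, rev_b, alternative="two-sided")
print(f"Mann-Whitney U p = {p_u:.4f}")

# Welch's t-test is a common parametric alternative; check its assumptions first
t_stat, p_t = stats.ttest_ind(rev_a, rev_b, equal_var=False)
print(f"Welch t-test   p = {p_t:.4f}")
```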
What is Sample Ratio Mismatch (SRM)?
When the observed ratio of users/sessions assigned to variations significantly deviates from the intended ratio (e.g., not 50/50). Indicates a potential issue with the randomization or data-collection process, invalidating results.
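A minimal sketch of a chi-square SRM check (assuming Python with SciPy; the counts and the p < 0.001 alert threshold are illustrative conventions):

```python
from scipy import stats

observed = [50_550, 49_450]   # users actually assigned to A and B
expected = [50_000, 50_000]   # intended 50/50 split

chi2, p = stats.chisquare(observed, f_exp=expected)
# A very small p-value (commonly p < 0.001) signals an SRM: investigate
# randomization, redirects, and tracking before trusting any results.
print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
```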
How are Jobs-to-be-Done (JTBD) interviews used in CRO research?
To uncover underlying user goals (functional, social, emotional), informing value propositions and identifying unmet needs and optimization opportunities beyond surface-level features or interactions.
What key insights does Session Recording analysis provide for CRO?
Reveals specific user journeys, identifies friction points (hesitation, rage clicks, U-turns), visualizes interaction with dynamic elements, and provides qualitative context for drop-offs seen in quantitative funnel analysis.
What are Heatmaps/Clickmaps used for in CRO research?
To visualize aggregate user attention (heatmaps) and interaction patterns (clickmaps, including ‘dead clicks’ on non-interactive elements), revealing what users see, ignore, and attempt to engage with on a page.