Terms_and_Definitions Flashcards

(65 cards)

1
Q

A/B Test

A

A method of comparing two versions of a webpage, feature, or app against each other to determine which performs better.

2
Q

Null Hypothesis (H₀)

A

States that there is no real difference between the control and test groups; any observed difference is attributed to chance.

3
Q

Alternative Hypothesis (H₁)

A

States that there is a real difference between the control and test groups; it is supported when the data give enough evidence to reject H₀.

4
Q

P-Value

A

The probability of observing results at least as extreme as those measured, assuming the null hypothesis is true.
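
As a rough sketch of how a p-value might be computed for an A/B test on conversion rates, the snippet below runs a two-sided two-proportion z-test using only the Python standard library. The function name and the visitor counts are hypothetical, chosen for illustration.

```python
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both groups share one conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under H0 the best estimate of the common rate pools both groups.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    # Probability of a z-score at least this extreme in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 200 of 2000 control users convert, 250 of 2000 test users.
z, p = two_proportion_p_value(200, 2000, 250, 2000)
alpha = 0.05
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {p < alpha}")
```

The comparison `p < alpha` is the decision rule: the null hypothesis is rejected only when the p-value falls below the chosen significance level.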

5
Q

Significance Level (α)

A

The p-value threshold below which the null hypothesis is rejected (commonly set at 0.05); it is also the accepted Type I error rate.

6
Q

Confidence Interval (CI)

A

A range of values, computed from the sample, that is expected to contain the true effect size or metric at a given confidence level (e.g., 95% of intervals built this way would contain it).
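
One common way to build such an interval for the difference between two conversion rates is the normal-approximation (Wald) interval. The sketch below assumes that method and uses hypothetical counts; the function name is illustrative.

```python
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald interval for the difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Unpooled standard error: each group keeps its own variance.
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical counts: 200 of 2000 control users convert, 250 of 2000 test users.
low, high = diff_confidence_interval(200, 2000, 250, 2000)
print(f"95% CI for the lift: [{low:.4f}, {high:.4f}]")
```

If the interval excludes zero, the result agrees with a two-sided test that rejects the null hypothesis at the matching significance level.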

7
Q

Control Group

A

The group that does not receive the treatment or variant being tested.

8
Q

Test Group

A

The group that receives the treatment or variant being tested.

9
Q

Randomization

A

Assigning participants to groups in a way that each participant has an equal chance of being in any group.

10
Q

Power Analysis

A

A calculation to determine the minimum sample size required to detect a given effect size at a chosen significance level and statistical power (commonly 80%).
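
For a test on two conversion rates, a standard normal-approximation formula gives the sample size per group. The sketch below assumes that formula, a two-sided test, and equal group sizes; the baseline and target rates are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, p_target, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect p_base -> p_target."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    n = (z_alpha + z_power) ** 2 * variance / (p_target - p_base) ** 2
    return math.ceil(n)


# Hypothetical goal: detect a lift from 10% to 12% conversion.
n = sample_size_per_group(0.10, 0.12)
print(f"users needed per group: {n}")
```

Halving the detectable lift roughly quadruples the required sample, because the effect size enters the formula squared.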

11
Q

Effect Size

A

The magnitude of the difference between groups (e.g., a 5% increase in conversion rate).

12
Q

Type I Error

A

Rejecting the null hypothesis when it is actually true (false positive).

13
Q

Type II Error

A

Failing to reject the null hypothesis when it is false (false negative).

14
Q

Bonferroni Correction

A

A method that adjusts the significance level for multiple comparisons by dividing α by the number of comparisons.
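
A minimal sketch of the correction, assuming four comparisons with hypothetical p-values: each p-value is compared against α divided by the number of comparisons.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and which tests stay significant."""
    adjusted_alpha = alpha / len(p_values)
    return adjusted_alpha, [p < adjusted_alpha for p in p_values]


# Hypothetical p-values from four variant-vs-control comparisons.
p_values = [0.004, 0.020, 0.030, 0.300]
adjusted_alpha, rejected = bonferroni(p_values)
print(adjusted_alpha, rejected)
```

Three of these p-values fall below 0.05, but only the first survives the stricter per-test threshold.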

15
Q

Simpson’s Paradox

A

A trend appears in different groups of data but disappears or reverses when the groups are combined.
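
The reversal is easier to see with numbers. The sketch below uses hypothetical conversion counts, chosen so that variant A wins inside each segment while variant B wins once the segments are pooled.

```python
# Hypothetical (conversions, visitors) counts, chosen to exhibit the reversal.
data = {
    "A": {"mobile": (81, 87), "desktop": (192, 263)},
    "B": {"mobile": (234, 270), "desktop": (55, 80)},
}


def rate(conversions, visitors):
    return conversions / visitors


# Conversion rate per variant within each segment.
segment_rates = {
    segment: {variant: rate(*data[variant][segment]) for variant in data}
    for segment in ("mobile", "desktop")
}

# Conversion rate per variant after pooling the segments.
overall_rates = {
    variant: rate(
        sum(conv for conv, _ in segments.values()),
        sum(visits for _, visits in segments.values()),
    )
    for variant, segments in data.items()
}

for segment, rates in segment_rates.items():
    print(segment, {v: round(r, 3) for v, r in rates.items()})
print("overall", {v: round(r, 3) for v, r in overall_rates.items()})
```

The reversal comes from unequal segment mix: in this made-up data, A receives most of its traffic from the lower-converting desktop segment, which drags its pooled rate down.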

16
Q

Descriptive Statistics

A

Summarizing and describing the features of a dataset (e.g., mean, median, mode).

17
Q

Inferential Statistics

A

Using a sample to make generalizations about a population (e.g., hypothesis testing, confidence intervals).

18
Q

Mean

A

The arithmetic average of a dataset: the sum of the values divided by the number of values.

19
Q

Median

A

The middle value in an ordered dataset (the average of the two middle values when the count is even).

20
Q

Mode

A

The most frequently occurring value in a dataset.

21
Q

Variance

A

A measure of spread: the average squared deviation of the values from the mean.

22
Q

Standard Deviation

A

The square root of the variance; it measures dispersion in the same units as the data.
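
The mean, median, mode, variance, and standard deviation can all be computed with Python's built-in `statistics` module. The dataset below is hypothetical, picked so the arithmetic is easy to check by hand.

```python
import statistics

# Hypothetical dataset: eight values that sum to 40.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)            # sum / count
median = statistics.median(values)        # average of the two middle values
mode = statistics.mode(values)            # most frequent value
pvariance = statistics.pvariance(values)  # population variance (divide by n)
pstdev = statistics.pstdev(values)        # square root of pvariance
svariance = statistics.variance(values)   # sample variance (divide by n - 1)

print(mean, median, mode, pvariance, pstdev, round(svariance, 3))
```

Note the two variance flavors: `pvariance` treats the data as the whole population, while `variance` applies the n - 1 correction used when the data are a sample.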

23
Q

Z-Test

A

A hypothesis test for comparing means when the population variance is known or the sample is large.

24
Q

T-Test

A

A hypothesis test for comparing means when the population variance is unknown and must be estimated from the sample.

25
ANOVA (Analysis of Variance)
A test to compare the means of three or more groups.
26
Chi-Square Test
A test for relationships between categorical variables.
27
Linear Regression
A method to model the relationship between a dependent variable and one or more independent variables.
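For the single-variable case, the best-fit line can be found with the ordinary least squares formulas. The sketch below assumes one independent variable and uses hypothetical data; the names `fit_line`, `ad_spend`, and `signups` are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
ad_spend = [1, 2, 3, 4, 5]
signups = [3, 5, 7, 9, 11]
slope, intercept = fit_line(ad_spend, signups)
print(slope, intercept)
```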
28
Logistic Regression
A regression model used when the dependent variable is categorical, most often binary (e.g., converted or not).
29
Bayesian Statistics
An approach to statistics that combines prior beliefs with observed evidence to update the probability of a hypothesis.
30
Frequentist Statistics
The traditional approach to statistics, which interprets probability as the long-run frequency of an event over repeated trials.
31
SELECT
A SQL command used to retrieve data from a database.
32
FROM
Specifies the table to retrieve data from.
33
WHERE
Filters rows based on conditions.
34
GROUP BY
Groups rows sharing a property for aggregation.
35
HAVING
Filters grouped rows based on aggregated values.
36
JOIN
Combines rows from two or more tables based on a related column.
37
INNER JOIN
Returns rows with matching values in both tables.
38
LEFT JOIN
Returns all rows from the left table and matching rows from the right table, with nulls where no match exists.
39
RIGHT JOIN
Returns all rows from the right table and matching rows from the left table, with nulls where no match exists.
40
OUTER JOIN (FULL OUTER JOIN)
Returns all rows from both tables, with nulls on either side where no match exists.
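The difference between the join types shows up when a row has no match. The sketch below runs an INNER JOIN and a LEFT JOIN against an in-memory SQLite database through Python's `sqlite3` module; the `users` and `orders` tables and their contents are hypothetical. RIGHT and FULL OUTER joins are left out because SQLite only supports them from version 3.39.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only users that have at least one order.
inner_rows = conn.execute("""
    SELECT u.name, o.amount
    FROM users AS u
    INNER JOIN orders AS o ON o.user_id = u.user_id
    ORDER BY o.order_id
""").fetchall()

# LEFT JOIN: every user; amount is NULL (None in Python) without an order.
left_rows = conn.execute("""
    SELECT u.name, o.amount
    FROM users AS u
    LEFT JOIN orders AS o ON o.user_id = u.user_id
    ORDER BY u.user_id, o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The user without orders disappears from the INNER JOIN result but is kept by the LEFT JOIN with a null amount.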
41
ORDER BY
Sorts the result set by specified columns.
42
LIMIT
Restricts the number of rows returned in a query.
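The clauses SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, and LIMIT are written in that order within one query. The sketch below combines them against a hypothetical `events` table in an in-memory SQLite database, computing a conversion rate per variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, variant TEXT, converted INTEGER);
    INSERT INTO events VALUES
        (1, 'control', 0), (2, 'control', 1), (3, 'control', 0),
        (4, 'test', 1), (5, 'test', 1), (6, 'test', 0),
        (7, 'holdout', 1);
""")

rows = conn.execute("""
    SELECT variant, COUNT(*) AS users, AVG(converted) AS conversion_rate
    FROM events
    WHERE variant != 'holdout'      -- filter rows before grouping
    GROUP BY variant                -- one output row per variant
    HAVING COUNT(*) >= 3            -- filter groups after aggregation
    ORDER BY conversion_rate DESC   -- best variant first
    LIMIT 2                         -- keep at most two rows
""").fetchall()

for variant, users, conversion_rate in rows:
    print(variant, users, round(conversion_rate, 3))
```

WHERE acts on individual rows and HAVING on aggregated groups, which is why the row filter cannot use `COUNT(*)` and the group filter can.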
43
Subquery
A query nested within another query.
44
CTE (Common Table Expression)
A named temporary result set, defined with a WITH clause, that can be referenced within a single SQL query.
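A subquery and a CTE can often express the same logic. The sketch below writes one query both ways against a hypothetical `orders` table in an in-memory SQLite database; `user_totals` is an illustrative CTE name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (user_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 25.0), (1, 40.0), (2, 15.0), (3, 90.0);
""")

# Subquery: the per-user totals are computed inline in the FROM clause.
subquery_rows = conn.execute("""
    SELECT user_id, total
    FROM (SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id)
    WHERE total > 50
    ORDER BY user_id
""").fetchall()

# CTE: the same intermediate result, named up front with WITH.
cte_rows = conn.execute("""
    WITH user_totals AS (
        SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id
    )
    SELECT user_id, total
    FROM user_totals
    WHERE total > 50
    ORDER BY user_id
""").fetchall()

print(subquery_rows)
print(cte_rows)
```

Both forms return the same rows; the CTE mainly helps readability when the intermediate result is long or referenced more than once.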
45
Pandas
A library for data manipulation and analysis.
46
NumPy
A library for numerical computations.
47
Matplotlib
A library for creating static visualizations.
48
Seaborn
A library for statistical data visualization.
49
scipy.stats
The SciPy module for statistical functions and tests.
50
Statsmodels
A Python module for statistical modeling and hypothesis testing.
51
A/B Test Simulation
A process to mimic test results using random sampling or bootstrapping.
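One possible form of such a simulation is a percentile bootstrap: resample each group with replacement many times and look at the spread of the resulting differences. The sketch below assumes that approach; the outcome lists, resample count, and seed are hypothetical.

```python
import random


def bootstrap_diff(control, test, n_resamples=1000, seed=0):
    """Percentile bootstrap 95% interval for mean(test) - mean(control)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    diffs.sort()
    low = diffs[int(0.025 * n_resamples)]
    high = diffs[int(0.975 * n_resamples) - 1]
    return low, high


# Hypothetical outcomes: 1 = converted, 0 = did not convert.
control = [1] * 100 + [0] * 900  # 10% conversion
test = [1] * 130 + [0] * 870     # 13% conversion
low, high = bootstrap_diff(control, test)
print(f"bootstrap 95% interval for the lift: [{low:.4f}, {high:.4f}]")
```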
52
Data Visualization
Representing data graphically to communicate insights.
53
Dashboard
A visual interface that displays key performance metrics and data.
54
Power BI
A business analytics tool for creating dashboards and visualizations.
55
Tableau
A software tool for data visualization and business intelligence.
56
Funnel Analysis
A method to track user journey and identify drop-off points.
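A minimal sketch of the idea: given user counts at each funnel stage, compute the step-to-step conversion rate and find the step that loses the largest share of users. The stage names and counts are hypothetical.

```python
# Hypothetical funnel: (stage name, users reaching that stage).
funnel = [
    ("visit", 10000),
    ("signup", 2500),
    ("add_to_cart", 1000),
    ("purchase", 400),
]


def step_conversion(funnel):
    """Conversion rate from each stage to the next one."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
        rates.append((f"{prev_name} -> {name}", count / prev_count))
    return rates


rates = step_conversion(funnel)
worst_step = min(rates, key=lambda item: item[1])  # biggest drop-off

for step, step_rate in rates:
    print(f"{step}: {step_rate:.0%}")
print("largest drop-off:", worst_step[0])
```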
57
Cohort Analysis
Analyzing behavior over time by grouping users based on shared characteristics (e.g., signup month).
58
Customer Journey
The path a customer takes from initial interaction to conversion.
59
Clickstream Data
Data collected about user interactions on a website or app.
60
Hadoop
A framework for distributed storage and processing of large datasets.
61
Telemetry
The collection of data about the usage of a digital product.
62
Data Pipeline
A series of steps to process and analyze data from source to destination.
63
Hypothesis Validation
The process of testing assumptions with data.
64
Exploratory Data Analysis (EDA)
Initial analysis to summarize a dataset's main characteristics, often with visualizations, before formal modeling.
65
ETL (Extract, Transform, Load)
A process for extracting data from source systems, transforming it into a usable format, and loading it into a destination such as a data warehouse.
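The three ETL stages can be sketched as three small functions. The example below assumes a CSV string as the source and an in-memory SQLite table as the destination; the column names, the `revenue` table, and the cleaning rules are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: one row is missing its revenue value.
RAW_CSV = """user_id,country,revenue
1,us,10.50
2,DE,
3,us,4.25
"""


def extract(text):
    """Extract: parse the raw CSV into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without revenue, normalize types and casing."""
    cleaned = []
    for row in rows:
        if not row["revenue"]:
            continue
        cleaned.append(
            (int(row["user_id"]), row["country"].upper(), float(row["revenue"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute("CREATE TABLE revenue (user_id INTEGER, country TEXT, revenue REAL)")
    conn.executemany("INSERT INTO revenue VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(revenue) FROM revenue").fetchone()[0]
print(total)
```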