1 Descriptive and inferential statistics Flashcards

(16 cards)

1
Q

What is Statistics?

A

The science of collecting, analyzing, interpreting, presenting, and organizing data.

2
Q

What are the two main branches of statistics?

A

Descriptive Statistics and Inferential Statistics.

3
Q

What is Descriptive Statistics?

A

Methods for organizing, summarizing, and presenting data in an informative way (e.g., calculating means, creating graphs). It describes the features of a dataset.
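As a minimal sketch of what "summarizing" can look like in practice, the snippet below computes a few common descriptive measures with Python's standard `statistics` module. The `scores` list is made-up illustration data, not from the deck.

```python
import statistics

# Hypothetical dataset used only for illustration.
scores = [4, 8, 15, 16, 23, 42]

# Each entry describes one feature of the dataset itself.
summary = {
    "mean": statistics.mean(scores),      # central tendency
    "median": statistics.median(scores),  # middle of the ordered data
    "stdev": statistics.stdev(scores),    # spread (sample standard deviation)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Nothing here generalizes beyond the six values; it only describes them.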

4
Q

What is Inferential Statistics?

A

Methods used to draw conclusions or make predictions about a larger population based on data collected from a smaller sample.
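One common inferential step is estimating a population mean from a sample together with a margin of error. The sketch below builds an approximate 95% confidence interval using a normal approximation; the `sample` values are invented, and for a sample this small a t-distribution would usually be more appropriate, so treat it as an illustration of the idea only.

```python
import math
import statistics

# Hypothetical sample drawn from some larger population.
sample = [4, 8, 15, 16, 23, 42]

n = len(sample)
sample_mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Critical value for a two-sided 95% interval (normal approximation).
z_critical = statistics.NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z_critical * standard_error
ci_high = sample_mean + z_critical * standard_error
print(f"Approximate 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```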

5
Q

Define Population and Sample.

A

Population: The entire group of individuals, objects, or measurements of interest.

Sample: A subset or portion of the population selected for study.
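The relationship between the two can be sketched by drawing a simple random sample from a known population. The population of the integers 1 to 1000, the sample size of 50, and the fixed seed are all assumptions chosen for illustration.

```python
import random
import statistics

# Hypothetical population: every integer from 1 to 1000.
population = list(range(1, 1001))

# Simple random sample without replacement; the seed keeps the run repeatable.
rng = random.Random(0)
sample = rng.sample(population, 50)

print("population mean:", statistics.mean(population))
print("sample mean:    ", statistics.mean(sample))
```

The sample mean will generally be close to, but not equal to, the population mean.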

6
Q

What are Quantiles, Quartiles, and Percentiles used for?

A

They describe data by dividing a probability distribution or an ordered sample into contiguous intervals that each hold an equal probability or an equal number of observations.

Quantiles: General term for the cut points that produce these equal-sized intervals.

Quartiles: Divide the data into four equal parts (Q1, Q2/Median, Q3).

Percentiles: Divide the data into 100 equal parts.
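A short sketch of both cases using `statistics.quantiles` from the Python standard library. The data (integers 1 to 11) is made up, and the function's default "exclusive" interpolation method is only one of several conventions, so other tools may report slightly different cut points.

```python
import statistics

# Hypothetical data: the integers 1 through 11.
data = list(range(1, 12))

# n=4 returns the three cut points Q1, Q2 (median), Q3.
quartiles = statistics.quantiles(data, n=4)

# n=100 returns the 99 cut points between the 100 percentile groups.
percentiles = statistics.quantiles(data, n=100)

print("quartiles:", quartiles)
print("90th percentile:", percentiles[89])
```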

7
Q

What is a Boxplot (or Box-and-Whisker Plot)?

A

A graphical representation of the distribution of a dataset based on its five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. It helps visualize central tendency and spread, and it helps identify potential outliers.
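The numbers behind a boxplot can be sketched without any plotting library. The data below is invented, and the 1.5 × IQR fence used to flag outliers is a widely used convention that the card does not specify, so it is an assumption here.

```python
import statistics

# Hypothetical data with one unusually large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 11, 30]

q1, median, q3 = statistics.quantiles(data, n=4)
five_number = (min(data), q1, median, q3, max(data))

# Assumed convention: points beyond 1.5 * IQR from the box are potential outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("five-number summary:", five_number)
print("potential outliers:", outliers)
```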

8
Q

What are Frequency and Density Distributions?

A

Frequency Distribution: A table or graph showing how often different values or ranges of values occur in a dataset.

Density Distribution: Represents the distribution of continuous data, often shown as a smoothed curve (like a PDF). The total area under the curve is 1, and the area over an interval represents the probability of falling in that interval.
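A frequency distribution for discrete values can be sketched with `collections.Counter`. The `rolls` list is made-up data; dividing each count by the total gives relative frequencies, which sum to 1 in the same way the area under a density curve does.

```python
from collections import Counter

# Hypothetical die rolls.
rolls = [1, 2, 2, 3, 3, 3, 4, 6, 6, 6]

# Frequency distribution: how often each value occurs.
counts = Counter(rolls)

# Relative frequencies: counts scaled so that they sum to 1.
relative = {value: count / len(rolls) for value, count in counts.items()}

for value in sorted(counts):
    print(f"{value}: count={counts[value]}, relative={relative[value]:.1f}")
```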

9
Q

What are Distribution Histograms and Curves used for?

A

They are visual tools used to represent the shape, center, and spread of a dataset’s distribution. Histograms use bars for frequency counts in intervals; curves (like density curves) provide a smoothed representation.
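The binning step behind a histogram can be sketched in plain Python. The data and the bin width of 3 are assumptions for illustration; each bar is drawn as a row of `#` characters rather than with a plotting library.

```python
from collections import Counter

# Hypothetical non-negative integer measurements.
data = [1, 2, 2, 3, 5, 6, 6, 7, 9]
bin_width = 3  # assumed interval width

# Map each value to the left edge of its interval [start, start + bin_width).
bin_counts = Counter((x // bin_width) * bin_width for x in data)

# One text bar per interval; bar length is the frequency count.
for start in sorted(bin_counts):
    bar = "#" * bin_counts[start]
    print(f"[{start:>2}, {start + bin_width:>2}) {bar}")
```

Changing `bin_width` changes the apparent shape, which is one reason smoothed density curves are often drawn alongside histograms.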

10
Q

What is the basic concept of Probability?

A

The measure of the likelihood that a specific event will occur. It’s expressed as a number between 0 (impossibility) and 1 (certainty).
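For a concrete case, the sketch below computes the probability of rolling an even number on a fair six-sided die. It assumes all outcomes are equally likely, so the probability is the number of favorable outcomes divided by the number of possible outcomes.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (equally likely outcomes assumed).
sample_space = {1, 2, 3, 4, 5, 6}

# Event of interest: the roll is even.
event = {outcome for outcome in sample_space if outcome % 2 == 0}

probability = Fraction(len(event), len(sample_space))
print("P(even) =", probability)
```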

11
Q

Why are Theoretical Distributions used?

A

They are used to model and describe random phenomena or variables. They provide a mathematical function that approximates the probability distribution of real-world data (e.g., Normal, Binomial, Poisson distributions).
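As one example of such a mathematical function, the Binomial distribution gives the probability of exactly k successes in n independent trials as C(n, k) · p^k · (1 − p)^(n − k). The sketch below evaluates it for 10 fair coin flips; the helper name `binomial_pmf` and the chosen parameters are illustrative assumptions.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical scenario: 10 flips of a fair coin.
n_flips, p_heads = 10, 0.5
pmf = [binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, prob in enumerate(pmf):
    print(f"P(heads = {k:>2}) = {prob:.4f}")
```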

12
Q

Define PDF, CDF, and ICDF.

A

PDF (Probability Density Function): For continuous variables, describes the relative likelihood for a random variable to take on a given value. The area under the PDF curve between two points gives the probability of the variable falling within that range.

CDF (Cumulative Distribution Function): Gives the probability that a random variable is less than or equal to a specific value ‘x’.

ICDF (Inverse Cumulative Distribution Function / Quantile Function): Given a probability ‘p’, it returns the value ‘x’ such that P(X ≤ x) = p. It’s the inverse of the CDF.
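The three functions can be tried side by side with `statistics.NormalDist` from the Python standard library. The standard normal distribution is used here purely as an example; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# PDF: relative likelihood (density) at a point, not a probability by itself.
density_at_0 = std_normal.pdf(0.0)

# CDF: P(X <= 1).
p_below_1 = std_normal.cdf(1.0)

# ICDF: recovers x from p, undoing the CDF.
x_back = std_normal.inv_cdf(p_below_1)

# Area under the PDF between -1 and 1, obtained as a difference of CDF values.
p_between = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

print(f"pdf(0) = {density_at_0:.4f}")
print(f"cdf(1) = {p_below_1:.4f}, inv_cdf(cdf(1)) = {x_back:.4f}")
print(f"P(-1 <= X <= 1) = {p_between:.4f}")
```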

13
Q

What is the Normal Distribution?

A

A continuous probability distribution that is symmetrical and bell-shaped. It’s defined by its mean (μ) and standard deviation (σ). Many natural phenomena approximate this distribution.
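Because the distribution is fully determined by μ and σ, the share of values within a given number of standard deviations of the mean is the same for every normal distribution (roughly 68%, 95%, and 99.7% for 1, 2, and 3 standard deviations). The sketch below checks this for an arbitrarily chosen μ = 100 and σ = 15.

```python
from statistics import NormalDist

# Assumed parameters, chosen only for illustration.
mu, sigma = 100, 15
dist = NormalDist(mu=mu, sigma=sigma)

# Probability of falling within k standard deviations of the mean.
within = {
    k: dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)
    for k in (1, 2, 3)
}

for k, prob in within.items():
    print(f"within {k} sd: {prob:.4f}")
```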

14
Q

What are Standard Scores (Z-scores) and Standardization?

A

Standard Score (Z-score): Measures how many standard deviations a specific data point is away from the mean of its distribution. Formula: Z = (X - μ) / σ.

Standardization: The process of converting data points to Z-scores, resulting in a distribution with a mean of 0 and a standard deviation of 1. This allows comparison of scores from different distributions.
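A minimal sketch of standardization applied to a small dataset. The `heights` values are invented, and the data is treated as a complete population, so the population standard deviation (`pstdev`) plays the role of σ.

```python
import statistics

# Hypothetical heights in centimetres, treated as a whole population.
heights = [150, 160, 170, 180, 190]

mu = statistics.mean(heights)
sigma = statistics.pstdev(heights)

# Z = (X - mu) / sigma for every data point.
z_scores = [(x - mu) / sigma for x in heights]

print("z-scores:", [round(z, 2) for z in z_scores])
print("mean of z-scores:", round(statistics.mean(z_scores), 6))
print("sd of z-scores:  ", round(statistics.pstdev(z_scores), 6))
```

Standardizing changes the location and scale of the data but not the shape of its distribution.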

15
Q

What is a QQ Plot (Quantile-Quantile Plot)?

A

A graphical tool used to assess if a dataset follows a particular theoretical distribution (often the Normal distribution). It plots the quantiles of the dataset against the quantiles of the theoretical distribution. If the points fall approximately on a straight line, the data likely follows that distribution.

16
Q

What is the purpose of Normality Tests?

A

Formal statistical procedures (like Shapiro-Wilk or Kolmogorov-Smirnov tests) used to determine whether there is significant evidence to reject the hypothesis that a dataset comes from a normally distributed population.
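To give a feel for what such a test measures, the sketch below computes only the Kolmogorov-Smirnov statistic D: the largest gap between the sample's empirical CDF and the CDF of a normal distribution fitted to the same data. The sample is made up. Turning D into a p-value needs critical-value tables or a statistics library, and fitting μ and σ from the same data changes the reference distribution (the Lilliefors variant addresses this), so this is an illustration rather than a complete test.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample.
sample = [5.2, 4.1, 6.3, 5.0, 7.0, 4.9, 6.1, 5.8]

n = len(sample)
ordered = sorted(sample)

# Normal distribution with mean and standard deviation estimated from the sample.
fitted = NormalDist(mu=statistics.mean(sample), sigma=statistics.stdev(sample))

# The empirical CDF jumps from (i - 1) / n to i / n at the i-th ordered value,
# so the gap to the fitted CDF is checked on both sides of each jump.
d_stat = max(
    max(i / n - fitted.cdf(x), fitted.cdf(x) - (i - 1) / n)
    for i, x in enumerate(ordered, start=1)
)

print(f"KS statistic D = {d_stat:.3f}")
```

A small D means the sample's distribution stays close to the fitted normal curve; a large D is evidence against normality.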