Session 3 | Statistical Testing and Correlation Flashcards

(15 cards)

1
Q

What is the difference between parametric tests and non-parametric tests?

A

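Parametric tests assume that the data are approximately normally distributed and that the groups being compared have roughly the same variance.

Non-parametric tests do not make these distributional assumptions. They come in handy when the sample is too small to trust normality or when we are using ordinal data.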

2
Q

Why are non-parametric tests used for small samples?

A

Large samples have the benefit of the Central Limit Theorem: with a big enough sample, the sampling distribution of the mean is approximately normal even when the underlying data are not. Small samples get no such guarantee, and a few outliers or some skew can distort the result, so a test that does not assume normality is the safer choice.

3
Q

What is a t-test? Provide me with an example of a paired t-test and an independent t-test.

A

A t-test compares the means of two groups. It has one categorical (nominal) predictor with two levels and one quantitative outcome.

Paired t-test: the two sets of measurements come from the same (or matched) subjects, e.g. the effect of two prep programs on average score when the same students from one class are measured under both.

Independent t-test: the two groups are unrelated, e.g. the effect of two prep programs on average score when the students come from different schools.
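To make the paired vs. independent distinction concrete, here is a minimal sketch of both t statistics using only the Python standard library. The scores are made up for illustration, and the independent version assumes equal variances (the pooled Student's t). Turning the statistic into a p-value needs the t distribution, which is left out here.

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic: a one-sample t-test on the per-subject differences."""
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def independent_t(x, y):
    """Student's t statistic for two unrelated groups (pooled variance)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    standard_error = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error

# Hypothetical exam scores
program_a = [60, 65, 70, 75, 80]
program_b = [62, 68, 71, 79, 85]

# Same five students measured under both programs: paired
t_paired = paired_t(program_a, program_b)
# Treating the same numbers as two unrelated groups: independent
t_independent = independent_t(program_b, program_a)

print(f"paired t = {t_paired:.3f}, independent t = {t_independent:.3f}")
```

With these numbers the paired statistic is much larger, because pairing removes the student-to-student variation and leaves only the per-student change.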

4
Q

When do I use ANOVA/MANOVA?

A

We use ANOVA/MANOVA when we compare the group means of multiple groups (typically three or more; a t-test covers two).

ANOVA is used when I am looking at one specific outcome > average pain levels after 3 different painkillers.

MANOVA is used when I look into multiple outcomes at once > how does species affect the petal length, the stem, etc.
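A one-way ANOVA boils down to one number, the F statistic: the variance between the group means divided by the variance within the groups. A minimal sketch with made-up pain scores (the helper name `one_way_anova_f` is just for illustration):

```python
import statistics

def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical pain levels (0-10) after three different painkillers
pain_levels = [
    [4, 5, 6],    # painkiller A
    [6, 7, 8],    # painkiller B
    [8, 9, 10],   # painkiller C
]

f_stat = one_way_anova_f(pain_levels)
print(f"F = {f_stat:.2f}")
```

A large F means the group means differ by more than the scatter inside each group would explain. MANOVA extends the same idea to several outcomes at once and is not covered by this sketch.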

5
Q

When do I use Correlation and when do I use Regression? Provide me with examples.

A

We use correlation when we simply want to describe the relationship between two variables: whether they are positively or negatively correlated, and how strongly. Regression goes a step further and models how an outcome changes with one or more predictors, so it can be used for prediction. Neither one proves causality by itself.

We can illustrate this using exam score and study hours. Correlation describes that exam scores generally grow with more study hours; regression says: if you tell me your study hours, I can estimate your exam score.
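The study-hours example can be sketched in a few lines. The data are invented; `pearson_r` describes the relationship, while `fit_line` produces a least-squares line that can predict a score for a new number of hours.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: strength and direction of a linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: study hours and exam scores for five students
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)                 # correlation: describe
slope, intercept = fit_line(hours, scores)   # regression: model
predicted = intercept + slope * 6            # estimated score for 6 study hours

print(f"r = {r:.3f}")
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted score for 6 hours: {predicted:.1f}")
```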

6
Q

What is a test statistic?

A

A test statistic is a single number, computed from the sample, that measures how far the observed data are from the null case (no difference, no relationship). For example, a t statistic expresses how much two group means differ relative to the variability in the data.

7
Q

What is a p-value?

A

The p-value is the probability of observing a difference or relationship at least as large as the one in our sample if the null hypothesis is true, i.e. if nothing is really happening.

8
Q

When can I infer a statistically significant result based on the test statistic?

A

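When my test statistic (which describes how far the relationship between the two variables is from the null case) is more extreme than the critical value for the chosen significance level, i.e. it falls in the range that would be very unlikely if the null hypothesis were true.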

9
Q

What is the difference between the p-value and the Pearson’s r?

A

The Pearson r value indicates how strong a linear relationship between two variables is and in which direction it goes (typically in a correlation). The p-value indicates how likely it would be to see an r value at least this strong if the null hypothesis were true.

10
Q

When do I know that a result is statistically significant based on the p-value?

A

Low p-value > if the null hypothesis were true, a result like this would be very unlikely = what you are seeing is probably a real pattern and you are not crazy for thinking it.

High p-value > a result like this would be common under the null hypothesis = you may be seeing patterns that mean nothing.

By convention, if the p-value is under 0.05 the result is statistically SIGNIFICANT > if the null hypothesis were true, there would be less than a 5% chance of seeing a result at least this extreme. It is not the probability that the null hypothesis is true.
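One way to see what a p-value means is an exact permutation test: if the group labels really did not matter (the null hypothesis), every way of relabeling the data would be equally likely, so we can count how many relabelings give a difference at least as large as the observed one. This is a sketch with invented scores; it swaps in a permutation test for illustration and is not the t-test or correlation test from the earlier cards.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p(a, b):
    """Two-sided p-value: share of relabelings whose mean difference
    is at least as large as the observed one."""
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    indices = range(len(pooled))
    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total

# Hypothetical scores: clear separation between the two groups
p_clear = exact_permutation_p([85, 88, 90, 92], [70, 72, 75, 78])

# Hypothetical scores: the groups overlap almost completely
p_overlap = exact_permutation_p([70, 85, 75, 92], [72, 88, 78, 90])

for label, p in [("clear separation", p_clear), ("overlapping groups", p_overlap)]:
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: p = {p:.4f} -> {verdict} at the 0.05 level")
```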

11
Q

Which types of relationship tests (regression tests) are there? Can you give me an example?

A

Simple linear regression > effect of x on y
Multiple linear regression > effect of x and z on y
Logistic regression > effect of x on obtaining either a or b (binary outcome)
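The three model types differ mainly in the shape of the prediction formula. A minimal sketch, where every coefficient (`b0`, `b1`, `b2`) is a made-up value for illustration rather than something fitted to data:

```python
import math

def simple_linear(x, b0=47.7, b1=4.1):
    """One predictor, numeric outcome: y = b0 + b1 * x."""
    return b0 + b1 * x

def multiple_linear(x, z, b0=40.0, b1=4.0, b2=2.0):
    """Two predictors, numeric outcome: y = b0 + b1 * x + b2 * z."""
    return b0 + b1 * x + b2 * z

def logistic(x, b0=-4.0, b1=1.0):
    """One predictor, binary outcome: returns the probability of outcome 'a'."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

print(simple_linear(3))        # predicted value of y
print(multiple_linear(3, 5))   # predicted value of y from x and z
print(logistic(4))             # probability between 0 and 1
```

The linear models return a value on the outcome's own scale, while the logistic model returns a probability that is then compared with a cutoff (commonly 0.5) to decide between a and b.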

12
Q

What are measures of Central Tendency?

A
  • Mean
  • Median
  • Mode
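All three measures are available in Python's standard `statistics` module. The sketch below uses made-up numbers and also shows why the median is often preferred when there are outliers.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]  # hypothetical data

mean_value = statistics.mean(values)      # arithmetic average
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

# Adding one extreme value pulls the mean far more than the median
with_outlier = values + [100]
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(mean_value, median_value, mode_value)
print(round(mean_outlier, 2), median_outlier)
```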
13
Q

What are measures of variability?

A
  • Range
  • Variance
  • Standard Deviation
  • Quartiles
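A short sketch of the four measures on made-up data, using the standard `statistics` module. Note that there are two flavors of variance and standard deviation: the population versions (`pvariance`, `pstdev`) divide by n, the sample versions (`variance`, `stdev`) divide by n - 1. Quartile values depend on the interpolation method; this uses the module's default.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

value_range = max(values) - min(values)

population_variance = statistics.pvariance(values)  # divides by n
population_sd = statistics.pstdev(values)
sample_variance = statistics.variance(values)       # divides by n - 1
sample_sd = statistics.stdev(values)

# Three cut points that split the sorted data into four equal parts
q1, q2, q3 = statistics.quantiles(values, n=4)
interquartile_range = q3 - q1

print("range:", value_range)
print("population variance / sd:", population_variance, population_sd)
print("sample variance / sd:", round(sample_variance, 3), round(sample_sd, 3))
print("quartiles:", q1, q2, q3, "IQR:", interquartile_range)
```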
14
Q

Which measures define the shape of a distribution curve?

A

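Skewness and kurtosis indicate the shape of a curve.

Skewness measures asymmetry: positive skew means a longer tail to the right, negative skew a longer tail to the left.

Kurtosis measures how heavy the tails are compared with a normal distribution. A positive (excess) kurtosis indicates a sharper peak with heavier tails; a negative kurtosis indicates a flatter curve with lighter tails and less concentrated values.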
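Both measures can be computed from the central moments of the data. This sketch uses the simple population (moment-based) formulas on made-up numbers; statistical packages often apply small-sample corrections, so their values can differ slightly.

```python
def central_moment(data, order):
    """Average of (value - mean) raised to the given power."""
    m = sum(data) / len(data)
    return sum((v - m) ** order for v in data) / len(data)

def skewness(data):
    """Third standardized moment: 0 for symmetric data."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth standardized moment minus 3, so a normal curve scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # evenly spread, no tails
right_skewed = [1, 1, 1, 2, 10]   # one large value stretches the right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

The evenly spread data has zero skewness and negative excess kurtosis (flat, no heavy tails), while the second data set has positive skewness.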

15
Q

When do we use the Pearson correlation (r) and the Spearman Rank Correlation?

A

Pearson correlation > indicates the intensity and direction of a relationship, from -1 to +1. Used for LINEAR RELATIONSHIPS between quantitative variables.

Spearman rank > used for ordinal data and for monotonic relationships that are not linear. It compares rankings instead of raw values. (+1 consistent ranking, 0 no consistent pattern, -1 consistently reversed ranking)
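Spearman's coefficient is simply Pearson's r computed on the ranks of the data. The sketch below (helper names are illustrative, data are made up) uses a relationship that always increases but is not a straight line: Spearman reports a perfect +1, Pearson falls a bit short.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # always increasing, but curved

pearson_value = pearson_r(x, y)
spearman_value = spearman_rho(x, y)
print(f"Pearson r = {pearson_value:.3f}, Spearman rho = {spearman_value:.3f}")
```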
