exam 2 Flashcards

(67 cards)

1
Q

explain mean / pros+cons of it

A

mean = average (sum of all values divided by the number of values)
pro = quick idea about the data
con = sensitive to extreme values

2
Q

explain median / pro

A

median = middle value of distribution
pro = immune to extreme values

3
Q

explain mode / pro

A

mode = the most frequently occurring value
pro = shows which answer is most popular in the dataset

4
Q

standard deviation / what will get you a bigger standard deviation

A

measures how much, on average, each value in the data set differs from the mean

the farther people's responses are from the mean, the bigger the standard deviation

5
Q

calculate range / what does a large range show about the data set

A

max value in data set - min value

a large range shows that people are giving low and high values
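
A minimal sketch of the five measures from the cards above, using Python's standard `statistics` module. The ratings are made up for illustration. The second list adds one extreme value to show that the mean, standard deviation, and range react to it while the median and mode do not.

```python
import statistics

# Hypothetical ratings; the second list adds one extreme value
ratings = [3, 4, 4, 5, 4]
with_outlier = ratings + [40]

def summarize(values):
    """Return the descriptive statistics covered in cards 1-5."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "range": max(values) - min(values),
    }

base = summarize(ratings)
shifted = summarize(with_outlier)
print(base)
print(shifted)
```

Comparing the two printed dictionaries shows which measures are sensitive to extreme values.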

6
Q

what type of hypothesis testing should you use in this example: On a scale from 1-7 how much do you like the brand?

A

one sample t-test

7
Q

why would you use a one sample t-test?

A

when you are only dealing with one variable and want to test whether its mean differs from a reference value (such as the midpoint of the scale)

8
Q

how do you analyze results from a one sample t-test?

A

by comparing the mean of people's responses with the midpoint of the scale (ex: 4 on a 1-7 scale) to see if people fall away from the midpoint or not. if the p value is less than 0.05, the mean is significantly different from the midpoint
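
A rough sketch of the one sample t-test comparison, assuming made-up 1-7 responses and a midpoint of 4. It only computes the t statistic by hand. The p value would normally come from statistics software or a t table, which this sketch does not reproduce.

```python
import math
import statistics

# Hypothetical answers to "how much do you like the brand?" on a 1-7 scale
responses = [5, 6, 4, 7, 5, 6, 5, 6]
midpoint = 4  # neutral point of a 1-7 scale

n = len(responses)
mean = statistics.mean(responses)
standard_error = statistics.stdev(responses) / math.sqrt(n)

# t = (sample mean - reference value) / standard error of the mean
t_stat = (mean - midpoint) / standard_error
print(f"mean = {mean}, t = {t_stat:.2f}, df = {n - 1}")
```

A mean above the midpoint with a large t statistic suggests people lean toward liking the brand.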

9
Q

what types of hypothesis testing techniques do we use if we are dealing with 2 variables in cause and effect relationships?

A

bivariate techniques (multivariate techniques are for more than two variables)

10
Q

what type of hypothesis testing should you use in this example: Are students or faculty more likely to use the internet?

what are the dependent and independent variable(s)?

A

independent sample t-test

independent variable = students or faculty

dependent variable = how much they use the internet

11
Q

when would we use an independent sample t-test?

A

use when comparing two groups of responses to each other

12
Q

what type of hypothesis testing should you use in this example:

Hypothesis: in communities where air pollution is a problem (vs not a problem), consumers are more likely to purchase an EV.

what are the independent variable(s)? dependent variable(s)?

A

independent sample t-test

Independent = air pollution presence (categorical because just answer yes or no)

dependent = likelihood of purchasing an EV (continuous because it could be measured on a 1-7 scale)

13
Q

when running an independent sample t-test, how should i analyze the results? step by step

A
  1. look at “two sided p value” section. is it less than 0.05? if so, we can conclude that there is a significant difference between the two groups
  2. compare the two group means. which one is higher? if the p value was significant, the group with the higher mean scores higher on the dependent variable (ex: is more likely to do whatever it is, as long as higher scale values mean more likely)
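
A small sketch of the comparison behind an independent sample t-test, using made-up internet-use scores for students and faculty. It computes Welch's version of the t statistic (no equal-variance assumption), which is an assumption of this sketch and may differ from the row your software reports. The p value itself is left to statistics software.

```python
import math
import statistics

# Hypothetical internet-use scores on a 1-7 scale
students = [6, 7, 5, 6, 7, 6]
faculty = [4, 5, 3, 4, 5, 4]

mean_students = statistics.mean(students)
mean_faculty = statistics.mean(faculty)

# Standard error of the difference between two independent group means
se_diff = math.sqrt(
    statistics.variance(students) / len(students)
    + statistics.variance(faculty) / len(faculty)
)
t_stat = (mean_students - mean_faculty) / se_diff

higher = "students" if mean_students > mean_faculty else "faculty"
print(f"t = {t_stat:.2f}; group with the higher mean: {higher}")
```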
14
Q

what type of hypothesis testing should you use in this example:

how much do you like strawberry ice cream? how much do you like vanilla ice cream?

A

paired sample t-test

because we are comparing the same people's ratings of strawberry and vanilla on the basis of two continuous scales (both SHOULD be 1-7 scales)

15
Q

when should you use an independent sample t-test ?

A

when we compare 2 groups of responses to each other

16
Q

when do we use a paired sample t-test?

A

When we need to compare means of two scores

when we need to compare two variables in the data which should be measured on a continuous scale

17
Q

what type of hypothesis testing should you use in this example:

let's predict that people purchase an EV to help the environment more than to save money on gas.

what questions should we ask?

A

paired sample t-test

questions: 1. one measuring whether people purchase an EV to save money on gas 2. one measuring whether people purchase an EV to help the environment

18
Q

how do we conduct a paired sample t-test analysis? step by step

A
  1. look at whether the mean of one variable is higher than the other
  2. confirm the difference by looking at p value. is this less than 0.05?
  3. if so, we can confirm that the variable with the higher mean = people are more likely to do that thing (as long as higher scale values mean more likely)
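
A minimal sketch of the paired sample t-test logic, assuming each made-up respondent answered both EV questions on a 1-7 scale. The test works on the per-person differences. Only the t statistic is computed here, and the p value would come from statistics software.

```python
import math
import statistics

# Hypothetical answers from the same 8 respondents (1-7 scale)
environment = [6, 7, 5, 6, 7, 6, 5, 6]  # "i would buy an EV to help the environment"
gas_savings = [5, 5, 4, 6, 5, 4, 4, 5]  # "i would buy an EV to save money on gas"

# Paired design: one difference score per respondent
differences = [e - g for e, g in zip(environment, gas_savings)]
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(len(differences))
t_stat = mean_diff / se_diff

print(f"mean difference = {mean_diff}, t = {t_stat:.2f}")
```

A positive mean difference here means the environment question has the higher mean.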
19
Q

when do you use an ANOVA? what types of variables do the X and Y need to be?

A

use when you compare the means of two or more groups (usually three or more, since two groups can be handled with a t-test). your independent variable has to be categorical and your dependent variable needs to be continuous

20
Q

what type of hypothesis testing should you use in this example:

different age groups and their frequency of using disney plus subscriptions

why?

A

ANOVA because you are relating a categorical independent variable (age group) to a continuous dependent variable (frequency of use)

21
Q

how do you analyze an ANOVA table to get results? step by step

A
  1. look at the means. are there any categories that have higher means than others?
  2. look at significance. is it below 0.05? if so, we see that there is a difference somewhere between the groups.
  3. if there is a difference between group means, you need to show which groups differ.
  4. look at the follow-up post hoc tests (ex: Scheffe)
  5. find the “significance” section under the multiple comparisons table. are there any significance values below 0.05? if so, we can conclude that there is a significant difference between those two groups. if the sig value is over 0.05, then there is no sig difference between those two groups
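
A rough sketch of what the ANOVA table is built from, using made-up disney plus usage scores for three age groups. It computes the F statistic (variation between group means divided by variation within groups) by hand. The sig value and the post hoc comparisons are left to statistics software.

```python
import statistics

# Hypothetical frequency-of-use scores (1-7 scale) by age group
groups = {
    "18-29": [6, 7, 6, 5],
    "30-49": [4, 5, 4, 5],
    "50+": [2, 3, 2, 3],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = statistics.mean(all_values)

# Between-groups: how far each group mean sits from the grand mean
ss_between = sum(
    len(values) * (statistics.mean(values) - grand_mean) ** 2
    for values in groups.values()
)
# Within-groups: how far each response sits from its own group mean
ss_within = sum(
    (v - statistics.mean(values)) ** 2
    for values in groups.values()
    for v in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```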
22
Q

explain linear relationships

A

association between two variables wherein the strength and nature of the relationship remains the same over the range of both variables

23
Q

explain curvilinear relationships

A

a relationship between two variables wherein the strength/direction of the relationship changes over the range of both variables

24
Q

when would we use a correlation?

A

when we are examining the relationship between two non-categorical variables (interval or ratio variables)

25
what are three things to look for when assessing correlation/regression?
1. does the relationship exist 2. what direction is the relationship in? (Negative or positive) 3. strength of the relationship
26
when using correlation/regression, how can we tell if a relationship exists ?
if the sig value is less than 0.05, there is some relationship here
27
how can we find out the direction of the relationship in a correlation/regression?
look at the sign of the coefficient. if it is positive, x and y move in the same direction (as x increases, y increases). if it is negative, they move in opposite directions (as x increases, y decreases, or vice versa)
28
how do you discover the strength of a correlation/regression relationship?
through the absolute value of the coefficient. the closer the pearson correlation coefficient is to +1 or -1, the stronger the relationship. the closer it is to 0, the weaker the relationship
29
what is the pearson correlation coefficient?
a statistical measure of the strength and direction of a linear relationship between two continuous (interval or ratio) variables
30
what values does the pearson correlation coeff. go between?
Value is going to vary between -1 and +1
31
what does a pearson correlation coeff. of +0.8 mean?
there is a strong positive linear relationship between two continuous variables
32
what does a pearson correlation coeff. of +0.2 mean?
there is a weak positive correlation between two continuous variables
33
what does a pearson correlation coeff. of -0.5 mean?
-0.5 indicates a moderate negative linear relationship between two continuous variables.
34
suppose we are looking at the disney plus video streaming service and trying to assess whether satisfaction with the service is related to a consumer's likelihood of recommending it. what test would we run and what steps would we need to take to analyze this?
we would run a pearson correlation because these are both continuous variables
steps:
  1. look @ sig value. is it less than 0.05? if so, there is a relationship between the variables
  2. find the pearson correlation coeff. is it positive? if so, we can say that there is a positive relationship between the variables. if it is negative, the relationship is negative
  3. is the relationship strong? the closer the coeff. is to +1 or -1, the stronger the relationship. the closer it is to 0, the weaker the relationship
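
A minimal sketch of the pearson correlation coefficient for the satisfaction / likelihood-to-recommend example, assuming made-up 1-7 scores. The helper `pearson_r` is written here for illustration and is not a library function. The sig value would come from statistics software.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 1-7 scores from six respondents
satisfaction = [2, 3, 4, 5, 6, 7]
recommend = [1, 3, 4, 4, 6, 7]

r = pearson_r(satisfaction, recommend)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f}: {direction} direction, strength |r| = {abs(r):.2f}")
```

The sign of `r` gives the direction and its absolute value gives the strength.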
35
when do you use substantive sig? what type of coefficient do we look at to check for substantive sig?
when a correlation/regression relationship is statistically significant but weak, and you want to know whether it is practically meaningful. Look at r^2 (the coefficient of determination) to check for substantive significance
36
what is r^2 called?
coefficient of determination
37
if the coeff. of determination is large, what does this say about the linear relationship?
The larger the size of the coefficient of determination, the more meaningful the linear relationship between the two variables being examined is
38
what would you say about the meaningfulness of the relationship between variables if r^2 = 0.60?
if r^2 = 0.6, we would say that 60% of the variation in one variable is associated with the other variable of interest. So the relationship would be meaningful
39
what is r^2 the square of?
the pearson correlation coeff.
40
what range does r^2, or coeff. of determination, vary from?
0.0 - 1.00
41
r^2 def
a number measuring the proportion of variation in one variable accounted for by another
42
if r is 0.8, what is r^2?
0.64
43
if you have an r^2 of 0.64, is this relationship of substantive significance practically useful? explain
Since we got 0.64 = 64%, this means that 64% of the variation in one variable is being explained by the other variable. This is pretty strong, so the relationship is practically useful: one variable (independent) accounts for a large share of the variation in the other (dependent)
44
if i had a value of 0.2 for my correlation coefficient, what would be the r^2 value? what does that value mean? is this relationship of substantive significance practically useful?
0.2 x 0.2 = 0.04. This means that only 4% of the variation in one variable is being explained by the other. The weaker the correlation, the smaller the amount of variance in one variable that can be explained by the other variable, so this relationship has little practical use even if it is statistically significant. example: even if i make people super satisfied with my product, satisfaction only accounts for 4% of the variation in their likelihood to recommend.
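
The arithmetic in the last few cards can be checked with a few lines. This sketch just squares the correlation values used in the cards (0.8, 0.2, -0.5). The helper name `r_squared` is made up for illustration.

```python
def r_squared(r):
    """Coefficient of determination: the square of the pearson correlation."""
    return r ** 2

for r in (0.8, 0.2, -0.5):
    share = r_squared(r)
    print(f"r = {r:+.1f} -> r^2 = {share:.2f} ({share:.0%} of variation accounted for)")
```

Note that a negative r still gives a positive r^2, which is why r^2 ranges from 0 to 1.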
45
Correlation ____ completely tell us whether there is a cause and effect
cannot
46
explain x and y relationship with a correlation
in a correlation, x is associated with y
47
explain x and y relationship with temporal priority
cause must come before the effect: x must come before y. ex: studying whether using disney+ more often (x) causes high satisfaction (y). to prove this, you must show that people started using disney+ more before they reported higher satisfaction
48
explain x and y relationship with non-spuriousness
the correlation between x and y is not a result of an outside confounding variable
49
why use regression? what does it tell me?
it predicts values. if i change one variable, it tells me how much of a change i can expect in the other
50
make a regression out of this: Example: say we were looking at something like the effect of price on satisfaction
If i change price by $1, how much can i expect satisfaction to change?
51
with regressions you need to focus on ___
slope
52
what does slope tell you?
the relationship between x and y: how much y is expected to change when x changes by 1 unit
53
what does a flat slope mean in a linear regression?
means that if i were to change x by a lot, y is not going to change that much. The flatter the line, the less y changes as i keep moving to the right and increasing the values of x.
54
what does a steep slope mean in a linear regression?
As the line becomes steeper, the influence of x on y becomes larger. So whenever you see higher values for the slope, it means that x is having a very large influence on y
55
the steeper the slope, the larger the effect _ has on _
x has on y
56
why is line of best fit good? what does it tell you
it is the line that minimizes the total (squared) distance between your data points and the line you are drawing through them, so it describes the relationship between x and y better than any other straight line
57
what kind of test should i run on this example: how the square feet of a house affects the price of a house? what are the independent / dependent variables?
linear regression. X independent variable = square feet. Y dependent variable = price
58
steps for analyzing linear regression table:
go to coefficients table
  1. find sig value. is it less than 0.05? if so, we can conclude that x (independent) has a significant effect on y (dependent)
  2. find b coeff. is it positive? if so, increasing the independent variable increases the dependent variable. if it is negative, increasing the independent variable decreases the dependent variable
  3. look @ b value again. if you were to increase ind. by 1 unit, the dep. variable changes by the b value
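
A minimal sketch of where the b coeff. comes from, using made-up house data for the square feet / price example. The helper `least_squares` is written here for illustration. It fits the line of best fit and returns the slope (b) and intercept. The sig value is left to statistics software.

```python
def least_squares(x, y):
    """Slope and intercept of the line of best fit for y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical houses: size in square feet and price in dollars
square_feet = [1000, 1500, 2000, 2500, 3000]
price = [150_000, 200_000, 260_000, 300_000, 360_000]

slope, intercept = least_squares(square_feet, price)
print(f"each extra square foot changes the predicted price by ${slope:.0f}")
print(f"predicted price at 1800 sq ft: ${intercept + slope * 1800:,.0f}")
```

The slope plays the role of the b value in step 3: a 1 unit increase in x changes the predicted y by that amount.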
59
what type of testing do we do when we have more than one ind. variable in your regression?
multiple regression
60
steps for analyzing multiple regression tables
  1. look @ ANOVA table. is the sig value less than 0.05? if so, the regression is significant.
  2. go to coefficients table. find the sig value for each independent variable. is it less than 0.05? if so, that ind. variable has a significant effect on the dependent variable. if it is not below 0.05, that ind. variable does not have a significant effect on the dependent variable.
61
Unstandardized Coefficients
they tell you how much the dependent variable changes when one independent variable increases by 1 unit, while holding other variables constant.
62
when do we standardize our scales? give an example
When we have two variables that are on different scales and cannot necessarily be compared against each other (ex: we may have number of rooms AND square feet of the house which are not comparable)
63
why do we standardize coefficients ?
allows us to see which independent variable has a larger influence on the Y
64
The larger the standardized coefficient, the larger the ___.
impact
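
A rough sketch of why standardizing matters, using made-up data for the rooms / square feet example. It fits a two-predictor regression by hand (normal equations on mean-centered data) and then converts each unstandardized coefficient to a standardized one with b * sd(x) / sd(y). All numbers and names are assumptions for illustration.

```python
import statistics

# Hypothetical houses: rooms, square feet, price in $1000s
rooms = [2, 3, 3, 4, 5, 4]
sqft = [900, 1500, 1200, 2000, 2400, 1600]
price = [140, 220, 190, 290, 350, 250]

def centered(values):
    """Subtract the mean from every value."""
    m = statistics.mean(values)
    return [v - m for v in values]

def cross(a, b):
    """Sum of products of two centered lists."""
    return sum(p * q for p, q in zip(a, b))

r, s, y = centered(rooms), centered(sqft), centered(price)

# Unstandardized coefficients for two predictors (least squares)
det = cross(r, r) * cross(s, s) - cross(r, s) ** 2
b_rooms = (cross(s, s) * cross(r, y) - cross(r, s) * cross(s, y)) / det
b_sqft = (cross(r, r) * cross(s, y) - cross(r, s) * cross(r, y)) / det

# Standardized coefficient = unstandardized coefficient * sd(x) / sd(y)
sd_price = statistics.stdev(price)
beta_rooms = b_rooms * statistics.stdev(rooms) / sd_price
beta_sqft = b_sqft * statistics.stdev(sqft) / sd_price

print(f"unstandardized: rooms = {b_rooms:.2f}, sqft = {b_sqft:.2f}")
print(f"standardized:   rooms = {beta_rooms:.2f}, sqft = {beta_sqft:.2f}")
```

In this made-up data the unstandardized coefficient for rooms is much larger than the one for square feet, but the standardized coefficients show square feet has the larger impact, because the two predictors are measured on very different scales.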
65
to reach more specific conclusions, do you want to look at standardized or unstandardized coefficients?
unstandardized
66
when do we use Binary logistic regression
May use when your independent variable is continuous but your dependent variable is categorical with two categories (ex: yes or no)
67
can you give an example of when we would use binary logistic regression?
Say we were looking at how price affects people's tendency to purchase, and purchase is measured as a yes or no. Continuous variable: price, ranging between $100-$200. Categorical variable: tendency to purchase, as only yes or no
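
A minimal sketch of how a binary logistic regression turns a continuous price into a yes/no purchase probability. The intercept and slope below are hypothetical values picked for illustration, not estimates from any data. In practice they would come from statistics software.

```python
import math

def purchase_probability(price, intercept, slope):
    """Logistic function: maps intercept + slope * price to a 0-1 probability."""
    score = intercept + slope * price
    return 1 / (1 + math.exp(-score))

# Hypothetical coefficients (not estimated from data)
INTERCEPT = 7.5
SLOPE = -0.05  # negative: a higher price lowers the chance of "yes"

for price in (100, 150, 200):
    p = purchase_probability(price, INTERCEPT, SLOPE)
    print(f"price ${price}: probability of purchase = {p:.2f}")
```

The output is a probability of "yes", which is why this technique fits a categorical yes/no dependent variable.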