Lecture 7 Flashcards

1
Q

what is a factor variable?

A

a type of variable that has 2 or more levels. each level acts as a distinct category and has its own associated label.

eg:
VCE, uni, none

2
Q

can factor variables be treated with numerical operations? eg: divide by 10

A

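even though the levels of a factor variable may be written as numbers (1, 2, 3), they cannot be used in numeric operations (R returns ‘NA’, indicating a non-meaningful result)

unless you wrap the variable in ‘as.numeric’ –> this converts the factor to numeric format (the values returned are the underlying integer level codes, so check they mean what you expect)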

3
Q

what is dummy coding for?

A

transform a categorical variable with g categories into a meaningful set of g-1 dummy variables

dummy variables would either have values of 0 or 1
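
A minimal sketch of the idea in plain Python (not the R code from the lecture). The function name `dummy_code` and the six-student data set are made up for illustration; the levels reuse the VCE / uni / none example from card 1, with "none" assumed to be the reference category.

```python
def dummy_code(values, levels, reference):
    """Return one 0/1 column per non-reference level (g - 1 columns in total)."""
    dummy_levels = [lvl for lvl in levels if lvl != reference]
    return {
        lvl: [1 if v == lvl else 0 for v in values]
        for lvl in dummy_levels
    }

# Hypothetical data: prior maths experience for six students.
experience = ["none", "VCE", "uni", "VCE", "none", "uni"]
dummies = dummy_code(experience, levels=["none", "VCE", "uni"], reference="none")

for name, column in dummies.items():
    print(name, column)
```

With g = 3 levels this gives g - 1 = 2 dummy variables; students in the reference category score 0 on both.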

4
Q

“dummy variables would either have values of 0 or 1” true or false?

A

true

5
Q

if you have 10 categories, what is the maximum number of dummy variables you can have?

A

10-1 = 9

6
Q

what is the contrasts() command for?

A

it specifies (or displays) the coding scheme, eg dummy coding, applied to a particular factor variable

7
Q

why is the intercept in a dummy variable regression analysis meaningful, while with an arbitrary metric it’s not?

A

intercept = the predicted score on the DV when an individual scores 0 on ALL IVs in the linear equation

say RMHI is the group coded 0 on the dummy variable (the reference category). the intercept then corresponds to the predicted MEAN score of people in RMHI

THE INTERCEPT REPRESENTS THE MEAN OF THE REFERENCE CATEGORY

8
Q

what does regression coefficient in dummy coding analysis table correspond to?

A

the difference between the mean of the group coded ‘1’ on that dummy variable and the mean of the reference group (the group coded ‘0’)

9
Q

if:

  • in a linear regression analysis the intercept is 14 and the regression coefficient is 0.1
  • ARMP is the dummy variable (given ‘1’ in the dummy coding)
  • RMHI is the reference category (given ‘0’ in the dummy coding)

what can you infer (in terms of statistics) from this info?

A
  • the estimated mean for RMHI is 14 –> intercept
  • the estimated mean for ARMP is 14+0.1 = 14.1
  • the difference between mean RMHI and mean ARMP is 0.1 –> regression coefficient
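
The three points above can be checked with a small least-squares fit. This is only a sketch: the six scores are hypothetical, picked so that the group means come out at 14 (RMHI) and 14.1 (ARMP), and `simple_regression` is an illustrative helper, not a function from the lecture.

```python
from statistics import mean

def simple_regression(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical scores, chosen so the group means are 14 (RMHI) and 14.1 (ARMP).
rmhi = [13.0, 14.0, 15.0]
armp = [13.1, 14.1, 15.1]

dummy = [0] * len(rmhi) + [1] * len(armp)   # 0 = RMHI (reference), 1 = ARMP
scores = rmhi + armp

intercept, coefficient = simple_regression(dummy, scores)
print(round(intercept, 3), round(coefficient, 3))
```

The fitted intercept matches the RMHI mean and the fitted coefficient matches the ARMP minus RMHI mean difference (up to floating-point rounding).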
10
Q

what is regression coefficient?

A

the expected change in DV for 1 unit change in IV

11
Q

what does the regression coefficient represent for a dummy variable?

A

a dummy variable ONLY has a value of 0 OR 1

hence, the regression coefficient represents the expected change in the DV when the dummy variable goes from 0 to 1 (1 UNIT), i.e. the change from RMHI (reference category) to ARMP (the group coded 1)

12
Q

what are alternatives to dummy coding?

A

contr.SAS - reverse dummy coding: makes the last level the reference category instead of the 1st level
contr.helmert - uses negative and positive integer values (instead of just 0 and 1); the values in each contrast add up to 0. this approach compares each level to the average of its PREVIOUS LEVELS.

in the output, the contrasts are not labelled with level names; they are labelled numerically (eg: 1, 2, 3)
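
To see what Helmert-style values look like, here is a rough Python sketch that builds the matrix by hand. It is intended to mirror the layout R's contr.helmert produces (rows = levels, columns = contrasts), but `helmert_contrasts` is an illustrative helper written for this card, not part of R.

```python
def helmert_contrasts(n_levels):
    """Rows = levels, columns = contrasts, laid out like a Helmert matrix."""
    matrix = []
    for row in range(n_levels):
        values = []
        for col in range(1, n_levels):
            if row < col:
                values.append(-1)      # earlier levels, averaged together
            elif row == col:
                values.append(col)     # the level being compared to them
            else:
                values.append(0)       # later levels are left out
        matrix.append(values)
    return matrix

helmert_3 = helmert_contrasts(3)
for level_values in helmert_3:
    print(level_values)

# Each contrast (column) sums to 0 across the levels.
column_sums = [sum(row[c] for row in helmert_3) for c in range(2)]
print(column_sums)
```

For 3 levels the first contrast compares level 2 with level 1, and the second compares level 3 with the average of levels 1 and 2.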

13
Q

what does one-way analysis mean?

A

there is only one grouping (classification) variable

14
Q

what does between-subject mean

A

groups are independent

15
Q

what is an omnibus approach?

A
  • assumes that the MEANS of all groups are the SAME.
  • unfocussed, not an informative RQ, because if there is a difference we don't know where it lies or in which direction.
  • the omnibus approach can only identify an INCONSISTENCY between the data and the assumption that all means are the same

eg:

  • is there a diff in statistical self efficacy according to prior experience in maths among RMHI and ARMP students?
  • at most it tells us that at least one unidentified group mean differs from the remaining group means
16
Q

what is a focussed approach?

A

provides identifiable differences
can account for everything the omnibus approach captures

eg:

  • is there a diff among RMHI and ARMP between those with no experience in maths and those who have done either VCE or Uni?
  • is there a diff between students with no experience in maths and those with uni maths experience among RMHI and ARMP students?
17
Q

“is there a diff between students with no experience in maths and those with uni maths experience among RMHI and ARMP students?” omnibus or focussed?

A

focussed. the question specifies exactly where the difference may occur (no maths experience vs uni maths experience)

if it were omnibus, it would just ask whether there is a diff in self efficacy according to prior maths experience (no mention of which groups are compared)

18
Q

“the values within each contrast will add up to 0” true or false?

A

true

19
Q

requirements when specifying your own set of contrasts

A
  1. individual values are negative, zero, or positive
  2. the values in each contrast sum to 0
  3. the total number of contrasts is ONE LESS THAN the number of groups
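
A quick way to check a hand-written set against these requirements, sketched in Python. `check_contrast_set` and both example sets are hypothetical, and the sketch assumes integer weights so the sum-to-zero test is not thrown off by floating-point rounding.

```python
def check_contrast_set(contrasts, n_groups):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(contrasts) != n_groups - 1:
        problems.append("need %d contrasts, got %d" % (n_groups - 1, len(contrasts)))
    for number, contrast in enumerate(contrasts, start=1):
        if len(contrast) != n_groups:
            problems.append("contrast %d needs one value per group" % number)
        if sum(contrast) != 0:
            problems.append("contrast %d does not sum to 0" % number)
    return problems

good_set = [[1, -1, 0], [1, 1, -2]]   # 3 groups, 2 contrasts, each sums to 0
bad_set = [[1, 1, -1]]                # too few contrasts, and the sum is 1

print(check_contrast_set(good_set, n_groups=3))
print(check_contrast_set(bad_set, n_groups=3))
```
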
20
Q

if we have 5 groups, how many contrasts can we have?

A

4

21
Q

we have 5 groups, hence 4 contrasts. how many values would there be in each contrast?

A

5 –> the no of groups would be the no of values in each contrast

22
Q

“inferences would be consistent irrespective of how values in contrasts are presented. (aka doesnt matter if in decimal, integer or fractions form)” true or false.

A

true

23
Q

what is an orthogonal contrast?

A

orthogonal contrasts are uncorrelated: each one captures an independent aspect of the differences between groups.

most sets of contrasts are orthogonal (uncorrelated)

24
Q

what does it mean if a design is balanced and coefficients in a pair of contrasts are orthogonal?

A

it means the mean differences captured by each contrast do not overlap, so they contain no redundancy

25
Q

how do you know if contrasts are orthogonal?

A

by multiplying the corresponding coefficients of the two contrasts together and summing the products.

  • should sum up to 0
  • nonzero results means non orthogonal
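
The check can be written in a couple of lines of Python. The three contrasts below are made-up examples for a 3-group design, and `is_orthogonal` is just an illustrative name.

```python
def is_orthogonal(contrast_a, contrast_b):
    """True when the products of matching coefficients sum to zero."""
    return sum(a * b for a, b in zip(contrast_a, contrast_b)) == 0

c1 = [1, -1, 0]    # group 1 vs group 2
c2 = [1, 1, -2]    # groups 1 and 2 together vs group 3
c3 = [1, 0, -1]    # group 1 vs group 3

print(is_orthogonal(c1, c2))   # (1)(1) + (-1)(1) + (0)(-2) = 0
print(is_orthogonal(c1, c3))   # (1)(1) + (-1)(0) + (0)(-1) = 1
```

So c1 and c2 are orthogonal, while c1 and c3 overlap (both involve group 1 in the same direction).
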
26
Q

assumptions for 3 group mean differences analysis (with contrast)

A
  1. independence of observation
  2. normality of observed scores
  3. homogeneity in variances
27
Q

in standardised mean differences for 2 independent groups, what happens to the CI if our data is not normal?

A

if data is not normal, regardless of whether it’s balanced or not, the CI wont be robust!

28
Q

when can we rely on hedges g CI?

A

when our data is NORMAL and there is EQUAL VARIANCE

29
Q

when can we rely on bonett’s delta CI?

A

when data is NORMAL but there is UNEQUAL VARIANCE

30
Q

what CI can we rely on when our data is normal and has equal variance?

A

hedges g CI

31
Q

what CI can we rely on when our data is normal but has unequal variance?

A

bonett’s delta CI

32
Q

The sample means of three groups are as follows:

  • Grp1 = 10,
  • Grp2 = -4,
  • Grp3 = 6.

If the third group is defined as the reference category in a linear regression model using two dummy variables, which will be the value of the partial regression coefficient for the dummy variable assigned to Grp2?

A

The dummy variable assigned to Grp2 will have a value of either 0 or 1.

A value of 0 on this dummy variable (with the other dummy variable also at 0) corresponds to the mean on the DV for the reference category (defined in the question to be Grp3, whose mean equals +6).

The regression coefficient given by the dummy variable assigned to Grp 2 therefore must equal the difference between the respective means for Grp2 and Grp3; i.e., –4 – (+6) = –10.
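
Written out as a worked equation (the symbols D_1 and D_2 for the Grp1 and Grp2 dummy variables, and b_0, b_1, b_2 for the coefficients, are notation assumed here rather than taken from the card):

```latex
\begin{align*}
\hat{Y} &= b_0 + b_1 D_1 + b_2 D_2 \\
b_0 &= \bar{Y}_{3} = 6 \\
b_1 &= \bar{Y}_{1} - \bar{Y}_{3} = 10 - 6 = 4 \\
b_2 &= \bar{Y}_{2} - \bar{Y}_{3} = -4 - 6 = -10
\end{align*}
```

Plugging in D_1 = 0, D_2 = 1 recovers the Grp2 mean: 6 + (-10) = -4.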

33
Q

If a between-subjects factor contains five levels, how many non-redundant planned comparisons can be specified for that factor?

A

4

34
Q

If there are 7 distinct groups in one categorical variable, and 3 distinct groups in a second categorical variable, how many more unique differences can be examined for the former categorical variable compared to the latter?

A

(7-1) - (3-1) = 6 - 2 = 4

this qn is not asking the number of dummy variables for the variable with 7 distinct groups.

35
Q

what is mixed group analysis for

A

This term is used when a design for investigating group differences contains both within-subjects and between-subjects factors.

this term is a term used when investigating mean differences?

36
Q

what does it mean to have smaller differences in group means

A

smaller differences in group means will result in a smaller BETWEEN-GROUP sum of squares

37
Q

what does it mean to have larger group variances

A

larger group variances will result in a larger WITHIN-GROUP sum of squares
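
Both this card and the previous one can be illustrated with a short Python sketch. The three tiny data sets are hypothetical, and `sums_of_squares` is an illustrative helper that applies the usual one-way definitions (group size times squared deviation of the group mean from the grand mean, and squared deviations of scores from their own group mean).

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (between-group SS, within-group SS) for a list of score lists."""
    grand_mean = mean([score for group in groups for score in group])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    return ss_between, ss_within

far_apart = [[1, 2, 3], [7, 8, 9]]      # group means 2 and 8
close = [[1, 2, 3], [2, 3, 4]]          # group means 2 and 3
spread_out = [[0, 2, 4], [6, 8, 10]]    # means 2 and 8 again, more spread within groups

for label, data in [("far apart", far_apart), ("close", close), ("spread out", spread_out)]:
    print(label, sums_of_squares(data))
```

Moving the group means closer together shrinks the between-group sum of squares only; widening the scores inside each group grows the within-group sum of squares only.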

38
Q

why can the analysis of group differences be understood equally well in terms of analysis of variance and linear regression using dummy variables?

A

Because both approaches investigate the extent to which variation on a dependent variable can be accounted for by variation in group means.

39
Q

why can't the values {1, 2, 3, 4, 5} assigned to a categorical variable with five categories be used directly in a linear regression?

A
  • Because the regression coefficients will not be meaningful.
  • Because the sum of squares in the ANOVA Table will be incorrect.
  • Because the observed R-square value will differ, depending on which category is assigned to which value.
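
The last point is easy to demonstrate with a sketch. Everything here is hypothetical: five categories A to E with one score each, and two different ways of handing out the values 1 to 5. `r_squared` computes the squared correlation, which equals R-square when there is a single predictor.

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation, i.e. R-square for a one-predictor regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical data: one score per category, five categories A-E.
categories = ["A", "B", "C", "D", "E"]
scores = [10, 12, 14, 16, 18]

coding_1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
coding_2 = {"A": 3, "B": 1, "C": 5, "D": 2, "E": 4}   # same values, reshuffled

r2_first = r_squared([coding_1[c] for c in categories], scores)
r2_second = r_squared([coding_2[c] for c in categories], scores)
print(r2_first, r2_second)
```

The data are identical in both runs; only the arbitrary assignment of numbers to categories changes, yet R-square changes with it.
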
40
Q

If a null hypothesis for group mean differences among the five means in Q8 proposed that the first two groups differed from the last three groups, what would be the contrast weights?

  1. [+2, +2, -3, -3, -3]
  2. [+1, +1, -1, -1, -1]
  3. [+3, +3, -2, -2, -2]
  4. [+1, +1, -1.5, -1.5, -1.5]
A
  3. [+3, +3, -2, -2, -2]

This is correct because the sum of the positive weights (i.e., +6) equals the sum of the negative weights (-6) in absolute terms, and the values reflect the relative size of the two sets (one containing 2 groups, and the other containing 3 groups).
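
The rule can be sketched in Python: give each group in one set a weight equal to the size of the other set, with opposite signs. `two_set_contrast` and the five group means are hypothetical, used only to show that the weights sum to 0 and that rescaling them (card 22) turns the contrast into a plain difference of averages.

```python
from fractions import Fraction

def two_set_contrast(n_first, n_second):
    """Integer weights comparing the first n_first groups with the last n_second."""
    return [n_second] * n_first + [-n_first] * n_second

weights = two_set_contrast(2, 3)
print(weights, sum(weights))

# Rescale by 1 / (2 * 3) so each side's weights sum to 1 in absolute value.
scale = Fraction(1, 2 * 3)
rescaled = [w * scale for w in weights]

# Hypothetical group means: first two average 11, last three average 5.
group_means = [10, 12, 3, 5, 7]
contrast_value = sum(w * m for w, m in zip(rescaled, group_means))
print(contrast_value)
```

The rescaled weights are 1/2, 1/2, -1/3, -1/3, -1/3, so the contrast value is the average of the first two means minus the average of the last three.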