Confirmatory Latent Class Analysis Flashcards
Describe the example described in the lecture
Various weights are added at various distances on a balance scale, and children are asked which way the scale will tip. According to Siegler's theory, they will respond in one of five ways according to their stage of development:
• Rule I: choose the side with the largest weight
• Rule II: choose the side with the largest weight; if the weights are the same, choose the side with the largest distance
• Rule III: if both weights and distances differ, guess
• Rule IV: choose the side with the largest sum of weight and distance
• Rule V: choose the side with the largest product of weight and distance.
We want to build a latent class model to predict which rule children use based on their responses. Predicted responses for each stage were drawn up in order to test the classification. See docs for these predicted responses.
How is the process of CLCA different to a regular LCA?
You restrict parameters of the full LCA model
In what ways do you restrict parameters of the full LCA model?
Equality constraints
• E.g., conditional probabilities are equal within classes (specify certain probabilities to be identical, within a class or across classes depending on what you're interested in)
Fixed value constraints
• E.g., conditional probabilities are equal to .25 in one class (guessing in a 4 choice item)
- Instead of estimating this parameter you insert a theoretical number for this parameter
Linear constraints
• E.g., (probability of correct in class 1) = 1 - (probability of correct in class 2)
- Specifying that one probability depends in a linear way on another probability. This makes the most sense when it is 1 minus another probability.
If, with an equality constraint, you specify three parameters to be the same how many parameters are you estimating?
You're always specifying two or more parameters to be the same; if you specify three parameters to be the same, you're still estimating only one parameter.
If you specify these parameters what effect does this have on the model? (attractive quality)
The model becomes more parsimonious due to the reduction in parameters
How would you carry out an unrestricted LCA on the following dataset?
dat = as.matrix(read.table("conservation.txt"))
source("lca.r")
res = lca(dat,2)
res
What does the following table display?
        Item 1  Item 2  Item 3  Item 4
Class 1    b       b       b       b
Class 2    c       c       c       c
It displays equality constraints; all the b parameters are the same and all the c parameters are the same. So all the parameters are fixed within a class, equal over items.
How would you fit this equality-constrained model in R?
Parameters with the same integer are restricted to be equal:
restr = matrix(c(
  1,1,1,1,
  2,2,2,2), 2, 4, T)
restr
res_restr = lca(dat, 2, restr = restr)
res_restr # to get the results; conditional probabilities should be equal across items per class
How many parameters are in the unconstrained model and the model with the equality constraints?
No constraints: Npar = 9 (4 × 2 conditional probability parameters + (2 - 1) class probability parameters); BIC = 794.87, LL = -373.60
Equality constraints: Npar = 3 (1 × 2 conditional probability parameters + (2 - 1) class probability parameters); BIC = 772.79, LL = -378.45
The lower BIC indicates that the constraints improve the model.
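These BIC values can be reproduced from the log-likelihoods by hand. A minimal sketch in Python, assuming the usual formula BIC = -2·LL + Npar·ln(N); the sample size N is not stated on these cards, and N = 200 is an assumption that approximately reproduces the reported values:

```python
import math

def bic(ll, npar, n):
    """Bayesian Information Criterion: -2 * log-likelihood plus a
    complexity penalty of npar * ln(n)."""
    return -2 * ll + npar * math.log(n)

N = 200  # assumed sample size (not given on the cards)

bic_full = bic(-373.60, 9, N)  # unconstrained model
bic_eq   = bic(-378.45, 3, N)  # equality-constrained model

print(round(bic_full, 2))  # close to the reported 794.87
print(round(bic_eq, 2))    # close to the reported 772.79
```

Note that the constrained model wins on BIC despite its lower log-likelihood: the smaller penalty term (3 instead of 9 parameters) more than compensates for the worse fit.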
Give an example of how you could carry out fixed value constraints on this model conceptually (in terms of the table)
        Item 1  Item 2  Item 3  Item 4
Class 1    0       0       0       0
Class 2    a       b       c       d
You could set the conditional probability to 0 for the first class if you expect that those in class 1 (e.g., non-conservers) will fail on these items for sure. You could also set it to a guessing value (e.g., .5 for a 2-choice item, .25 for a 4-choice item).
How would you set up a fixed value constrained model in R?
To tell the programme that these parameters should be fixed to 0 we need another matrix:
restr = matrix(c(
  0,0,0,0, # 0 means the parameter is fixed at its initial value
  1,2,3,4), 2, 4, T) # unique integers: a free parameter for each item
restr
init = matrix(c(
  .5,0,0,0,0,
  .5,.8,.8,.8,.8), 2, 5, T)
init
# The init matrix contains the class sizes (first column) and parameter values for both classes
# Fixed parameters (restr = 0) keep their init value and are not estimated; the .8 values are arbitrary starting values: estimation begins at .8 and moves from there to the estimates. Starting values of .5 are usually safe.
res_fix = lca(dat, 2, restr = restr, init = init)
res_fix # to get the results; conditional probabilities in class 1 should be fixed at 0 and those in class 2 free to vary
How many parameters are in the constrained and unconstrained model? (fixed value constraint)
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Fixed value constraint:
Npar = 5
(BIC = 870.05
LL = -421.78)
Seems to indicate that the constraints make the model worse
Give an example of how you could carry out a linear constraint on this model conceptually (in terms of the table)
        Item 1  Item 2  Item 3  Item 4
Class 1   1-a     1-b     1-c     1-d
Class 2    a       b       c       d
In this case the conditional probabilities of class one are specified as 1 minus those of class two per item (the proportion of errors in class two).
How would you set up a linear constrained model in R?
restr = matrix(c(
  -1,-2,-3,-4, # negative numbers: constrained to 1 minus the matching positive parameter
  1,2,3,4), 2, 4, T) # unique values for each item
restr
res_lin = lca(dat, 2, restr = restr)
res_lin # to get the results; conditional probabilities of class 1 should equal 1 minus the conditional probabilities of class 2
Parameters labelled with a number and a minus in front of it (e.g., -2) are constrained to be equal to 1 minus the parameter with the same number without the minus (e.g., 2).
How many parameters are in the linearly constrained model and the unconstrained model?
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Linear constraints:
Npar = 5
(BIC = 778.42
LL = -375.96)
This suggests that the constraints improve the model fit.
How can you mix up these constraints?
For example, a combined linear and equality constraint could look like:
        Item 1  Item 2  Item 3  Item 4
Class 1   1-a     1-a     1-a     1-a
Class 2    a       a       a       a
What types of error rates do the linear constraint model and the linear & equality constraint model have?
Linear constraint model: item-specific error rate (the error rate varies per item)
Linear & equality constraint model: item-fixed error rate (the error rate is equal across items)
How would you set up a linear & equality constrained model in R?
The programme might flip the labels of the classes, so be careful.
restr = matrix(c(
  -1,-1,-1,-1, # class 1: constrained to 1 minus the single class 2 parameter
  1,1,1,1), 2, 4, T) # one parameter, equal across all items
restr
res_lin = lca(dat, 2, restr = restr)
res_lin
How many parameters do the unconstrained, linear constrained, and linear & equality constrained models have?
No constraints:
Npar = 9
Linear constraints:
Npar = 5
Linear and equality constraints:
Npar = 2
If the following statistics are observed, what can you conclude?
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Linear constraints:
Npar = 5
(BIC = 778.42
LL = -375.96)
Linear and equality constraints:
Npar = 2
(BIC = 767.51
LL = -378.46)
The BIC indicates that the linear & equality model is the best fit given the number of parameters. The likelihood ratio test specifically tests the addition of the extra constraints; here it is non-significant, so the more parsimonious linear & equality model is preferred.
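The likelihood ratio test mentioned here can be checked by hand. A sketch in Python, assuming the values above are log-likelihoods, so the LR statistic is twice the drop in LL when going from 5 to 2 parameters (df = 3); the closed-form chi-square survival function for 3 degrees of freedom avoids any external dependency:

```python
import math

def chi2_sf_3df(x):
    """Survival function (1 - CDF) of a chi-square distribution with
    3 degrees of freedom, in closed form."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

ll_linear = -375.96   # linear constraints model (5 parameters)
ll_lin_eq = -378.46   # linear & equality model (2 parameters)

# LR statistic: twice the drop in log-likelihood from the extra constraints
lr = -2 * (ll_lin_eq - ll_linear)   # = 5.0
df = 5 - 2                          # difference in number of parameters
p = chi2_sf_3df(lr)                 # p is about .17, above .05

print(round(lr, 2), round(p, 3))
```

Since p > .05, the extra equality constraints do not significantly worsen the fit, which is why the more parsimonious model is preferred.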
What is classification concerned with in LCA?
Given the parameter estimates of your LCA model, you can estimate class membership.
Give two possible notations for posterior class probabilities
Posterior class probabilities: the probability that someone with responses Xp1, ..., Xpn belongs to a class:
P(θp = 1 | Xp1, ..., Xpn) and P(θp = 2 | Xp1, ..., Xpn)
or, in Goodman's notation, π^(X|ABCD)_(t|ijkl): the probability of latent class t given responses i, j, k, l on items A, B, C, D.
aka the probability they belong to a class given a certain response pattern.
How do you calculate the posterior probability in classification?
According to Bayes rule:
P(B | A) = P(A & B) / P(A)
         = P(A | B) × P(B) / P(A)
Here B is θp and A is a given response pattern (e.g., Xp = [1 0 0 1 1]), i.e.,
P(θp = 1 | [10011]) = P(Xp = [10011] | θp = 1) × P(θp = 1) / P(X = [10011])
P(θp = 2 | [10011]) = P(Xp = [10011] | θp = 2) × P(θp = 2) / P(X = [10011])
You already know how to calculate both the numerator and the denominator: they depend on the conditional probabilities and the class probabilities!
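As a worked example of this Bayes-rule calculation, a minimal Python sketch with made-up numbers (the class sizes, the .8/.2 conditional probabilities, and the response pattern below are illustrative, not estimates from the conservation data):

```python
# Posterior class probabilities via Bayes' rule, for two latent classes
# and a binary response pattern. All probability values are made up.

priors = [0.5, 0.5]          # class probabilities P(theta_p = t)
cond = [[0.8] * 5,           # P(item correct | class 1), per item
        [0.2] * 5]           # P(item correct | class 2), per item
pattern = [1, 0, 0, 1, 1]    # observed response pattern X_p

def likelihood(probs, x):
    """P(X_p = x | class): product of item probabilities, using the
    local independence assumption of LCA (items independent within a class)."""
    lik = 1.0
    for p, xi in zip(probs, x):
        lik *= p if xi == 1 else 1 - p
    return lik

# Numerators of Bayes' rule: P(X | class) * P(class)
joint = [likelihood(c, pattern) * pr for c, pr in zip(cond, priors)]
total = sum(joint)                      # denominator: P(X)
posterior = [j / total for j in joint]  # sums to 1 over classes

print(posterior)  # class 1 is far more likely given this pattern
```

A person would then typically be assigned to the class with the largest posterior probability (modal assignment).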