Confirmatory Latent Class Analysis Flashcards
Describe the example described in the lecture
Various weights are added at various distances on a balance scale, and children are asked which way the scale will tip. According to Siegler's theory, they will respond in one of five ways according to their stage of development:
• Rule I: choose the side with the largest weight
• Rule II: choose the side with the largest weight; if the weights are the same, choose the side with the largest distance
• Rule III: if both weights and distances differ, guess
• Rule IV: choose the side with the largest sum of weight and distance
• Rule V: choose the side with the largest product of weight and distance.
We want to build a latent class model to predict which rule children use based on their responses. Predicted responses for each stage were drawn up in order to test the classification. See docs for these predicted responses.
How is the process of CLCA different to a regular LCA?
You restrict parameters of the full LCA model
In what ways do you restrict parameters of the full LCA model?
Equality constraints
• E.g., conditional probabilities are equal within classes (specify certain probabilities to be identical, within a class or across classes depending on what you're interested in)
Fixed value constraints
• E.g., conditional probabilities are equal to .25 in one class (guessing in a 4 choice item)
- Instead of estimating this parameter you insert a theoretical number for this parameter
Linear constraints
• E.g., (probability of correct in class 1) = 1 - (probability of correct in class 2)
- Specifying that one probability depends in a linear way on another probability. This makes the most sense when it is 1 minus another probability.
If, with an equality constraint, you specify three parameters to be the same how many parameters are you estimating?
You're always specifying two or more parameters to be the same; if you specify three parameters to be the same, you're still estimating only one parameter.
If you specify these parameters what effect does this have on the model? (attractive quality)
The model becomes more parsimonious due to the reduction in parameters
How would you carry out an unrestricted LCA on the following dataset?
dat = as.matrix(read.table("conservation.txt"))
source("lca.r")
res = lca(dat,2)
res
What does the following table display?
        Item 1  Item 2  Item 3  Item 4
Class 1    b       b       b       b
Class 2    c       c       c       c
It displays equality constraints; all the b parameters are the same and all the c parameters are the same. So all the parameters are fixed within a class, equal over items.
How would you fit this equality-constrained model in R?
Parameters with the same integer are restricted to be equal:
restr = matrix(c(
  1,1,1,1,
  2,2,2,2), 2, 4, T)
restr
res_restr = lca(dat, 2, restr = restr)
res_restr # to get the results; conditional probabilities should be equal across items per class
How many parameters are in the unconstrained model and the model with the equality constraints?
No constraints: Npar = 9 (4 × 2 conditional probability parameters + (2 - 1) class probability parameters); BIC = 794.87, LL = -373.60
Equality constraints: Npar = 3 (1 × 2 conditional probability parameters + (2 - 1) class probability parameters); BIC = 772.79, LL = -378.45
The lower BIC indicates that the constraints improve the model.
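These BIC values can be reproduced from the log-likelihoods by hand. A minimal sketch in Python, assuming the usual formula BIC = -2·LL + Npar·ln(N); the sample size N is not stated on these cards, and N = 200 is an assumption that approximately reproduces the reported values:

```python
import math

def bic(ll, npar, n):
    """Bayesian Information Criterion: -2 * log-likelihood plus a
    complexity penalty of npar * ln(n)."""
    return -2 * ll + npar * math.log(n)

N = 200  # assumed sample size (not given on the cards)

bic_full = bic(-373.60, 9, N)  # unconstrained model
bic_eq   = bic(-378.45, 3, N)  # equality-constrained model

print(round(bic_full, 2))  # close to the reported 794.87
print(round(bic_eq, 2))    # close to the reported 772.79
```

Note that the constrained model wins on BIC despite its lower log-likelihood: the smaller penalty term (3 instead of 9 parameters) more than compensates for the worse fit.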
Give an example of how you could carry out fixed value constraints on this model conceptually (in terms of the table)
        Item 1  Item 2  Item 3  Item 4
Class 1    0       0       0       0
Class 2    a       b       c       d
You could set the conditional probability to 0 for the first class if you expect that those in class 1 (e.g., non-conservers) will fail on these items for sure. You could also set it to a guessing value (e.g., .5 for a 2-choice item, .25 for a 4-choice item).
How would you set up a fixed value constrained model in R?
To tell the programme that these parameters should be fixed to 0 we need another matrix:
restr = matrix(c(
  0,0,0,0, # 0 means the parameter is fixed at its initial value
  1,2,3,4), 2, 4, T) # unique integers: a free parameter for each item
restr
init = matrix(c(
  .5,0,0,0,0,
  .5,.8,.8,.8,.8), 2, 5, T)
init
# The init matrix contains the class sizes (first column) and parameter values for both classes
# Fixed parameters (restr = 0) keep their init value and are not estimated; the .8 values are arbitrary starting values: estimation begins at .8 and moves from there to the estimates. Starting values of .5 are usually safe.
res_fix = lca(dat, 2, restr = restr, init = init)
res_fix # to get the results; conditional probabilities in class 1 should be fixed at 0 and those in class 2 free to vary
How many parameters are in the constrained and unconstrained model? (fixed value constraint)
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Fixed value constraint:
Npar = 5
(BIC = 870.05
LL = -421.78)
Seems to indicate that the constraints make the model worse
Give an example of how you could carry out a linear constraint on this model conceptually (in terms of the table)
        Item 1  Item 2  Item 3  Item 4
Class 1   1-a     1-b     1-c     1-d
Class 2    a       b       c       d
In this case the conditional probabilities of class one are specified as 1 minus those of class two per item (the proportion of errors in class two).
How would you set up a linear constrained model in R?
restr = matrix(c(
  -1,-2,-3,-4, # negative numbers: constrained to 1 minus the matching positive parameter
  1,2,3,4), 2, 4, T) # unique values for each item
restr
res_lin = lca(dat, 2, restr = restr)
res_lin # to get the results; conditional probabilities of class 1 should equal 1 minus the conditional probabilities of class 2
Parameters labelled with a number and a minus in front of it (e.g., -2) are constrained to be equal to 1 minus the parameter with the same number without the minus (e.g., 2).
How many parameters are in the linearly constrained model and the unconstrained model?
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Linear constraints:
Npar = 5
(BIC = 778.42
LL = -375.96)
This suggests that the constraints improve the model fit.
How can you mix up these constraints?
For example, a combined linear and equality constraint could look like:
        Item 1  Item 2  Item 3  Item 4
Class 1   1-a     1-a     1-a     1-a
Class 2    a       a       a       a
What types of error rates do the linear constraint model and the linear & equality constraint model have?
Linear constraint model: item-specific error rate (the error rate varies per item)
Linear & equality constraint model: item-fixed error rate (the error rate is equal across items)
How would you set up a linear & equality constrained model in R?
The programme might flip the labels of the classes, so be careful.
restr = matrix(c(
  -1,-1,-1,-1, # class 1: constrained to 1 minus the single class 2 parameter
  1,1,1,1), 2, 4, T) # one parameter, equal across all items
restr
res_lin = lca(dat, 2, restr = restr)
res_lin
How many parameters do the unconstrained, linear constrained, and linear & equality constrained models have?
No constraints:
Npar = 9
Linear constraints:
Npar = 5
Linear and equality constraints:
Npar = 2
If the following statistics are observed, what can you conclude?
No constraints:
Npar = 9
(BIC = 794.87
LL = -373.60)
Linear constraints:
Npar = 5
(BIC = 778.42
LL = -375.96)
Linear and equality constraints:
Npar = 2
(BIC = 767.51
LL = -378.46)
The BIC indicates that the linear & equality model is the best fit given the number of parameters. The likelihood ratio test specifically tests the addition of the extra constraints; here it is non-significant, so the more parsimonious linear & equality model is preferred.
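The likelihood ratio test mentioned here can be checked by hand. A sketch in Python, assuming the values above are log-likelihoods, so the LR statistic is twice the drop in LL when going from 5 to 2 parameters (df = 3); the closed-form chi-square survival function for 3 degrees of freedom avoids any external dependency:

```python
import math

def chi2_sf_3df(x):
    """Survival function (1 - CDF) of a chi-square distribution with
    3 degrees of freedom, in closed form."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

ll_linear = -375.96   # linear constraints model (5 parameters)
ll_lin_eq = -378.46   # linear & equality model (2 parameters)

# LR statistic: twice the drop in log-likelihood from the extra constraints
lr = -2 * (ll_lin_eq - ll_linear)   # = 5.0
df = 5 - 2                          # difference in number of parameters
p = chi2_sf_3df(lr)                 # p is about .17, above .05

print(round(lr, 2), round(p, 3))
```

Since p > .05, the extra equality constraints do not significantly worsen the fit, which is why the more parsimonious model is preferred.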
What is classification concerned with in LCA?
Given the parameter estimates of your LCA model, you can estimate class membership.
Give two possible notations for posterior class probabilities
Posterior class probabilities: the probability that someone with responses Xp1, ..., Xpn belongs to a class:
P(θp = 1 | Xp1, ..., Xpn) and P(θp = 2 | Xp1, ..., Xpn)
or, in Goodman's notation, π^(X|ABCD)_(t|ijkl): the probability of latent class t given responses i, j, k, l on items A, B, C, D.
aka the probability they belong to a class given a certain response pattern.
How do you calculate the posterior probability in classification?
According to Bayes rule:
P(B | A) = P(A & B) / P(A)
         = P(A | B) × P(B) / P(A)
Here B is θp and A is a given response pattern (e.g., Xp = [1 0 0 1 1]), i.e.,
P(θp = 1 | [10011]) = P(Xp = [10011] | θp = 1) × P(θp = 1) / P(X = [10011])
P(θp = 2 | [10011]) = P(Xp = [10011] | θp = 2) × P(θp = 2) / P(X = [10011])
You already know how to calculate both the numerator and the denominator: they depend on the conditional probabilities and the class probabilities!
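As a worked example of this Bayes-rule calculation, a minimal Python sketch with made-up numbers (the class sizes, the .8/.2 conditional probabilities, and the response pattern below are illustrative, not estimates from the conservation data):

```python
# Posterior class probabilities via Bayes' rule, for two latent classes
# and a binary response pattern. All probability values are made up.

priors = [0.5, 0.5]          # class probabilities P(theta_p = t)
cond = [[0.8] * 5,           # P(item correct | class 1), per item
        [0.2] * 5]           # P(item correct | class 2), per item
pattern = [1, 0, 0, 1, 1]    # observed response pattern X_p

def likelihood(probs, x):
    """P(X_p = x | class): product of item probabilities, using the
    local independence assumption of LCA (items independent within a class)."""
    lik = 1.0
    for p, xi in zip(probs, x):
        lik *= p if xi == 1 else 1 - p
    return lik

# Numerators of Bayes' rule: P(X | class) * P(class)
joint = [likelihood(c, pattern) * pr for c, pr in zip(cond, priors)]
total = sum(joint)                      # denominator: P(X)
posterior = [j / total for j in joint]  # sums to 1 over classes

print(posterior)  # class 1 is far more likely given this pattern
```

A person would then typically be assigned to the class with the largest posterior probability (modal assignment).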