# Stats & Research Design Questions Flashcards

Cluster sampling involves

A. randomly selecting individual subjects from a larger target population.

B. randomly selecting a naturally-occurring group of subjects from a larger target population.

C. randomly selecting several naturally occurring groups from a larger population, and then randomly selecting individuals from each group.

D. randomly selecting individuals from a larger target population and dividing subjects into groups on the basis of their status on a demographic variable.

B. randomly selecting a naturally-occurring group of subjects from a larger target population.

In cluster sampling, naturally occurring groups of subjects, rather than individual subjects, are randomly selected for participation in research. For instance, a researcher who wants to use elementary school students in an educational study could randomly choose a school from the schools in the state and use all the students in that school as participants. A variation, known as multistage cluster sampling, involves selecting a large cluster (group) and then selecting successively smaller clusters. For example, the researcher studying elementary school students could randomly select a school district, then randomly select a school from the chosen district, and then randomly select a classroom from the chosen school. In this case, both forms of cluster sampling would be more practical than the alternative of simple random sampling, which would involve randomly selecting individual elementary school students from across the state.
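The two procedures described above can be sketched in a few lines of Python. This is only an illustration: the districts, schools, classrooms, and student IDs are hypothetical, the helper names are made up for this card, and each cluster is chosen with equal probability, which is an assumption rather than a requirement of the method.

```python
import random

def one_stage_cluster_sample(clusters, rng):
    """Randomly pick one whole cluster and return (name, all of its members)."""
    name = rng.choice(sorted(clusters))
    return name, list(clusters[name])

def multistage_cluster_sample(districts, rng):
    """Pick a district, then a school in it, then a classroom in that school."""
    district = rng.choice(sorted(districts))
    school = rng.choice(sorted(districts[district]))
    classroom = rng.choice(sorted(districts[district][school]))
    students = list(districts[district][school][classroom])
    return district, school, classroom, students

# Hypothetical population: district -> school -> classroom -> students
districts = {
    "District 1": {
        "School A": {"Room 1": ["s1", "s2"], "Room 2": ["s3", "s4"]},
        "School B": {"Room 1": ["s5", "s6"]},
    },
    "District 2": {
        "School C": {"Room 1": ["s7", "s8"], "Room 2": ["s9"]},
    },
}

# Flatten to school -> students for the one-stage version
schools = {
    school: [s for room in rooms.values() for s in room]
    for d in districts.values()
    for school, rooms in d.items()
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
school_name, participants = one_stage_cluster_sample(schools, rng)
print(school_name, participants)
print(multistage_cluster_sample(districts, rng))
```

In both versions the random draw is made over groups, never over individual students, which is what separates cluster sampling from simple random sampling.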

The risk of sampling error is greatest when a:

A. sample size is small

B. test has low reliability

C. test has low validity

D. confounding variable exists

A. sample size is small

Sampling error is the extent to which a sample value deviates from the corresponding population value that it is supposed to represent. The smaller the sample, the greater the risk of sampling error. You should have been able to eliminate reliability (“B”) and validity (“C”), since those are characteristics of a test, which is not applied until after the sampling procedure. Sampling error, as its name implies, arises during the sampling or selection of subjects. A confounding variable (“D”) is a variable that is not of interest in a study but that exerts a systematic effect on the DV. It would therefore threaten the internal validity of a study, but it is not related to sampling error.
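A small simulation can make the link between sample size and sampling error concrete. The sketch below is an illustration under stated assumptions: the population is synthetic (normally distributed scores with mean 100 and SD 15), the sample sizes of 10 and 400 are arbitrary, and `mean_abs_sampling_error` is a helper name invented for this card.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, trials, rng):
    """Average absolute gap between the sample mean and the population mean
    across repeated simple random samples of size n."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # sampling without replacement
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(100, 15) for _ in range(10_000)]  # synthetic scores

small = mean_abs_sampling_error(population, n=10, trials=500, rng=rng)
large = mean_abs_sampling_error(population, n=400, trials=500, rng=rng)
print(f"n=10:  typical error {small:.2f}")
print(f"n=400: typical error {large:.2f}")
```

The typical gap between the sample mean and the population mean should come out clearly larger for the samples of 10 than for the samples of 400.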

In a positively skewed distribution, one would most likely find, ranked from lowest to highest in value, the:

A. median, mean, mode.

B. median, mode, mean.

C. mean, mode, median.

D. mode, median, mean.

D. mode, median, mean.

You have to picture the positively skewed curve to get this correct. Positive skewness means there are some outliers (extreme scores) far out on the positive side, so the tail extends to the right. Since the mean takes into account the magnitude of the scores, these outliers can be pictured as “pulling” the mean toward the positive side, or the right. So, in any ordering of measures of central tendency, the mean would be the highest value, and you can eliminate the two distractors that don’t list the mean as the highest value. To distinguish between the remaining answers, consider what the median is: the middle point of the distribution, with half the cases below it and half above, regardless of how extreme those cases are. The mode is the highest point on the curve. If you draw a line at the mode, you’d see that more than half the cases fall to the right of that line, because the long tail lies on that side. Hence the median, the 50% point, is to the right of the mode. That leaves mode, median, mean as the order from lowest to highest.
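The ordering can be checked on a small made-up data set. The scores below are hypothetical, chosen only so that most values sit low and a couple of outliers stretch the tail to the right.

```python
import statistics

# Hypothetical positively skewed scores: most are low, two are far to the right
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode = statistics.mode(scores)      # most frequent score: 2
median = statistics.median(scores)  # middle of the sorted scores: 3.0
mean = statistics.mean(scores)      # 46 / 10 = 4.6, pulled up by 9 and 15

# Share of cases that lie to the right of the mode
right_of_mode = sum(s > mode for s in scores) / len(scores)  # 6 of 10 = 0.6

print(mode, median, mean, right_of_mode)
```

Here the mode (2) is lowest, the median (3.0) sits in between, and the mean (4.6) is highest, and 60% of the cases lie to the right of the mode.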

Which of the following techniques would be most useful for identifying subgroups of patients with Major Depressive Disorder based on their unique pattern of symptoms?

A. multiple regression

B. LISREL

C. logistic regression

D. cluster analysis

D. cluster analysis

Cluster analysis is used to categorize individuals or objects into subgroups (clusters) based on their similarities, e.g., to identify different types of depression based on patterns of symptoms.
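As a rough illustration of grouping cases by similarity, the sketch below runs k-means on made-up symptom ratings. Several assumptions apply: k-means is only one of many cluster-analysis methods and the card does not name a specific one, the three symptom dimensions and the 0 to 10 ratings are hypothetical, and the starting centroids are picked by hand so the result is repeatable.

```python
import math

def kmeans(points, initial_centroids, iterations=10):
    """Minimal k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # leave a centroid in place if nothing was assigned to it
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical ratings per patient: (insomnia, appetite loss, agitation), 0-10
patients = [
    (8, 7, 2), (9, 8, 1), (7, 8, 2),  # high insomnia and appetite loss
    (2, 1, 8), (1, 2, 9), (2, 2, 7),  # high agitation
]

# Hand-picked starting centroids (an assumption made for repeatability)
labels, centroids = kmeans(patients, [patients[0], patients[3]])
print(labels)
```

The first three patients should land in one cluster and the last three in the other. No outcome variable is being predicted, which is what separates this kind of analysis from the regression options.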

(AATBS Sample Question)