Introduction to Psych Stats Flashcards

1
Q

Describe the difference between descriptive and inferential statistics.

A

Descriptive statistics focus on summarizing and organizing data through measures such as mean, median, mode, and standard deviation, providing a clear picture of the dataset. In contrast, inferential statistics use sample data to make generalizations or predictions about a larger population, employing techniques like t-tests and ANOVA to draw conclusions and assess relationships between variables.

2
Q

Explain the four scales of measurement in statistics.

A

The four scales of measurement are nominal, ordinal, interval, and ratio. Nominal scales categorize data without a specific order (e.g., gender, colors). Ordinal scales rank data in a meaningful order but without consistent intervals (e.g., satisfaction ratings). Interval scales have equal intervals between values but lack a true zero (e.g., temperature in Celsius). Ratio scales possess all the properties of interval scales, plus a true zero point, allowing for meaningful comparisons (e.g., weight, height).

3
Q

Define the characteristics of a ratio scale.

A

A ratio scale is a quantitative measurement scale that possesses two key characteristics: equal intervals and an absolute zero point. Equal intervals mean that the difference between values is consistent across the scale, allowing for meaningful comparisons. The absolute zero point indicates the absence of the quantity being measured, enabling the calculation of ratios, such as twice as much or half as much, which is not possible with other scales.

4
Q

Do you understand the roles of independent and dependent variables in research?

A

In research, the independent variable (IV) is the factor that the researcher manipulates or controls to observe its effect on another variable. The dependent variable (DV) is the outcome that is measured to assess the impact of the IV. Understanding the relationship between these variables is crucial for establishing cause-and-effect conclusions in experimental studies.

5
Q

Explain what a confounding variable is and its impact on research results.

A

A confounding variable is an extraneous factor that can influence the dependent variable, potentially leading to misleading conclusions. It introduces alternative explanations for the observed effects, making it difficult to determine whether the independent variable truly caused the changes in the dependent variable. Identifying and controlling for confounding variables is essential to ensure the validity and reliability of research findings.

6
Q

Describe the three measures of central tendency and their significance.

A

The three measures of central tendency are mean, median, and mode. The mean is the average of all data points, providing a general sense of the dataset. The median is the middle value when data is ordered, offering a measure less affected by outliers. The mode is the most frequently occurring value, highlighting common trends. Together, these measures provide a comprehensive understanding of the data’s central location.
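As a quick worked sketch (the score values below are invented purely for illustration; Python and its standard library are assumed), all three measures can be computed directly:

```python
# Mean, median, and mode of a small made-up set of scores.
from statistics import mean, median, mode

scores = [2, 3, 3, 5, 7, 8, 9]

print(mean(scores))    # arithmetic average: about 5.29
print(median(scores))  # middle value once the data are ordered: 5
print(mode(scores))    # most frequent value: 3
```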

7
Q

What is standard deviation and why is it important in statistics?

A

Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of data points relative to the mean. A low standard deviation indicates that the data points are close to the mean, while a high standard deviation signifies greater variability. It is crucial for understanding the spread of data, assessing the reliability of the mean, and making comparisons between different datasets.
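A minimal sketch, assuming Python's statistics module and invented data, of how spread shows up in the standard deviation:

```python
# Two made-up datasets with the same mean (10) but different spread.
from statistics import pstdev, stdev

tight  = [9, 10, 10, 11]        # values hug the mean
spread = [1, 5, 10, 15, 19]     # values scatter widely

print(pstdev(tight))    # small population SD (~0.71): points sit near the mean
print(pstdev(spread))   # larger SD (~6.51): much more variability
print(stdev(tight))     # sample SD divides by n - 1, so it comes out slightly larger
```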

8
Q

Define variance and its role in statistical analysis.

A

Variance is a statistical measure that represents the average of the squared deviations from the mean. It quantifies the degree of spread in a dataset, indicating how much individual data points differ from the mean. Variance is essential in statistical analysis as it forms the basis for calculating standard deviation and is used in various inferential statistics methods, helping researchers understand data variability and make informed conclusions.
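A small sketch of the same idea computed by hand on made-up numbers, checked against the standard library:

```python
# Variance as the mean squared deviation from the mean (population form).
from statistics import pvariance

data = [4, 8, 6, 2]
m = sum(data) / len(data)                          # mean = 5
var = sum((x - m) ** 2 for x in data) / len(data)  # (1 + 9 + 1 + 9) / 4

print(var)              # 5.0
print(pvariance(data))  # same value from the standard library
print(var ** 0.5)       # the standard deviation is the square root of variance
```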

9
Q

Explain the concept of a z-score and its significance in statistics.

A

A z-score is a standardized score that indicates how many standard deviations a particular value is from the mean of a dataset. It allows for the comparison of scores from different distributions by converting them to a common scale. Z-scores are significant in identifying outliers, understanding the relative position of a score within a distribution, and facilitating the application of statistical tests that assume normality.
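A minimal sketch of the formula z = (x − mean) / SD, using invented test numbers:

```python
# Standardizing raw scores: how many SDs each one sits from the mean.
def z_score(x, mean, sd):
    return (x - mean) / sd

print(z_score(130, 100, 15))  # 2.0  -> two SDs above the mean
print(z_score(85, 100, 15))   # -1.0 -> one SD below the mean
```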

10
Q

Describe the interquartile range and its importance in data analysis.

A

The interquartile range (IQR) is a measure of statistical dispersion that represents the range of the middle 50% of a dataset, calculated as the difference between the third quartile (Q3) and the first quartile (Q1). It is important because it provides a robust measure of variability that is less affected by outliers than the overall range. The IQR is commonly used in box plots and helps in understanding the spread and central tendency of the data.
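A short sketch with NumPy (assumed available; the data are invented, and note that packages differ slightly in how they interpolate quartiles):

```python
# IQR versus the full range on data containing one extreme outlier.
import numpy as np

data = np.array([1, 3, 5, 7, 9, 11, 13, 200])   # 200 is a deliberate outlier

q1, q3 = np.percentile(data, [25, 75])
print(q3 - q1)                    # IQR: spread of the middle 50%, barely moved by the outlier
print(data.max() - data.min())    # full range: badly inflated by the outlier
```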

11
Q

Explain the 68-95-99.7 rule in the context of normal distribution.

A

The 68-95-99.7 rule, also known as the empirical rule, describes how data is distributed in a normal distribution. According to this rule, approximately 68% of data points fall within one standard deviation of the mean, about 95% fall within two standard deviations, and around 99.7% fall within three standard deviations. This rule is crucial for understanding the spread of data and making predictions about probabilities in normally distributed datasets.
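A quick check of the rule against the standard normal CDF, assuming SciPy is available:

```python
# Area under the standard normal curve within 1, 2, and 3 SDs of the mean.
from scipy.stats import norm

for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.4f}")   # ~0.6827, ~0.9545, ~0.9973
```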

12
Q

Describe the Central Limit Theorem and its implications in statistics.

A

The Central Limit Theorem (CLT) states that as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the original population’s distribution. This theorem is fundamental in statistics because it allows researchers to make inferences about population parameters using sample statistics, enabling the application of various statistical methods and hypothesis testing, even when the underlying data is not normally distributed.
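A small simulation sketch (the sample size and repetition counts are arbitrary choices, NumPy assumed) showing sample means from a skewed population piling up in a roughly normal way:

```python
# Draw many samples from a skewed (exponential) population and inspect the means.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)   # clearly non-normal

sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]

print(np.mean(sample_means))  # close to the population mean (about 2.0)
print(np.std(sample_means))   # close to sigma / sqrt(50), the standard error
```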

13
Q

Describe a sampling distribution.

A

A sampling distribution is a statistical concept that represents the distribution of a particular statistic, such as the mean or proportion, calculated from multiple samples drawn from the same population. It illustrates how the statistic varies from sample to sample, providing insights into the reliability and variability of the estimate. The shape of the sampling distribution can often be approximated by a normal distribution, especially as the sample size increases, due to the Central Limit Theorem.

14
Q

Define a null hypothesis (H₀).

A

The null hypothesis, denoted as H₀, is a fundamental concept in statistical hypothesis testing. It posits that there is no significant effect or difference between groups or conditions being studied. Essentially, it serves as a default position that assumes any observed effect in the data is due to random chance rather than a true underlying effect. Based on the statistical analysis of their data, researchers then either reject or fail to reject the null hypothesis.

15
Q

Explain what a Type I error is.

A

A Type I error occurs in hypothesis testing when a true null hypothesis is incorrectly rejected, leading to a false positive conclusion. This means that the test suggests there is an effect or difference when, in reality, none exists. The probability of making a Type I error is denoted by the alpha level (α), commonly set at 0.05. This error can have significant implications, particularly in fields like medicine or social sciences, where false claims of effectiveness can lead to inappropriate actions.

16
Q

Describe a Type II error.

A

A Type II error happens when a false null hypothesis is not rejected, resulting in a false negative conclusion. In this scenario, the test fails to detect an actual effect or difference that exists in the population. The probability of making a Type II error is represented by beta (β). This type of error can be particularly problematic in research, as it may lead to the incorrect assumption that a treatment or intervention is ineffective when it actually has a significant impact.

17
Q

What does statistical power refer to?

A

Statistical power is the probability that a statistical test will correctly reject a false null hypothesis, thereby detecting a true effect when it exists. It is calculated as 1 minus the probability of a Type II error (1 - β). High statistical power is desirable in research, as it increases the likelihood of identifying significant results. Factors influencing power include sample size, effect size, and the significance level set for the test. A power of 0.80 is often considered acceptable, indicating an 80% chance of detecting an effect.
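A hedged sketch of an a priori power analysis using statsmodels (assumed to be installed), solving for the per-group sample size needed to reach 80% power for a medium effect in an independent-samples t-test:

```python
# How many participants per group for d = 0.5, alpha = .05, power = .80?
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))   # roughly 64 per group
```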

18
Q

Explain the significance of a p-value.

A

A p-value is a statistical measure that helps determine the strength of the evidence against the null hypothesis. It quantifies the probability of obtaining the observed results, or more extreme results, assuming that the null hypothesis is true. A low p-value (typically less than 0.05) suggests that the observed data is unlikely under the null hypothesis, leading researchers to consider rejecting H₀. However, p-values do not measure the size of an effect or the importance of a result, and they should be interpreted in the context of the study.

19
Q

Define a confidence interval.

A

A confidence interval is a range of values, derived from sample data, that is likely to contain the true population parameter with a specified level of confidence, usually 95% or 99%. It provides an estimate of uncertainty around a sample statistic, such as the mean or proportion. For example, a 95% confidence interval suggests that if the same sampling procedure were repeated multiple times, approximately 95% of the calculated intervals would contain the true parameter. Confidence intervals are crucial for understanding the precision and reliability of statistical estimates.
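A minimal sketch of a 95% confidence interval for a mean using the t distribution (SciPy assumed; the sample values are invented):

```python
# Point estimate plus a 95% CI for the mean of a small sample.
import numpy as np
from scipy import stats

sample = np.array([12, 15, 14, 10, 13, 17, 11, 14])
m = sample.mean()
se = stats.sem(sample)   # standard error of the mean

ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=m, scale=se)
print(m, (ci_low, ci_high))
```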

20
Q

Describe Cohen’s d and its purpose.

A

Cohen’s d is a measure of effect size that quantifies the standardized difference between two means. It is calculated by taking the difference between the means of two groups and dividing it by the pooled standard deviation. This metric helps researchers understand the magnitude of an effect, providing context beyond mere statistical significance. Cohen’s d values can be interpreted as small (0.2), medium (0.5), or large (0.8), aiding in the assessment of practical significance in research findings.
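A hand-rolled sketch of the formula (the group scores are invented; NumPy assumed):

```python
# Cohen's d: standardized mean difference using a pooled standard deviation.
import numpy as np

def cohens_d(group1, group2):
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)

treatment = [23, 25, 28, 30, 26]
control   = [20, 22, 24, 25, 21]
print(cohens_d(treatment, control))   # ~1.66, well past the 0.8 "large" benchmark
```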

21
Q

What does r² represent in statistical analysis?

A

R-squared (r²) is a statistical measure that indicates the proportion of variance in the dependent variable that can be explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 0 means that the independent variable does not explain any of the variability of the dependent variable, and 1 means it explains all the variability. R-squared is useful for assessing the goodness of fit of a model, helping researchers understand how well their model captures the underlying data patterns.

22
Q

Explain the purpose of effect size in research.

A

Effect size is a quantitative measure that assesses the magnitude of a phenomenon or the strength of a relationship in research findings. Unlike p-values, which only indicate statistical significance, effect size provides insight into the practical significance of results. It helps researchers understand the real-world implications of their findings, allowing for better comparisons across studies. Common measures of effect size include Cohen’s d, Pearson’s r, and odds ratios, each serving to contextualize the importance of the observed effects.

23
Q

Describe when an independent t-test is used.

A

An independent t-test is a statistical method used to compare the means of two unrelated groups to determine if there is a significant difference between them. This test is appropriate when the samples are independent, meaning that the participants in one group are not related to those in the other group. It assumes that the data is normally distributed and that the variances of the two groups are equal. The independent t-test is commonly used in experimental and observational studies to evaluate the effects of interventions or treatments.
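A minimal sketch with SciPy (assumed available; the two groups of scores are invented):

```python
# Independent-samples t-test comparing two unrelated groups.
from scipy import stats

group_a = [24, 27, 21, 25, 30, 26]
group_b = [19, 22, 20, 18, 24, 21]

t_stat, p_value = stats.ttest_ind(group_a, group_b)   # assumes equal variances
print(t_stat, p_value)

# Welch's version relaxes the equal-variance assumption:
print(stats.ttest_ind(group_a, group_b, equal_var=False))
```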

24
Q

Explain when a paired t-test is appropriate.

A

A paired t-test is used when comparing the means of two related groups, typically measuring the same subjects under two different conditions or at two different times. This test accounts for the natural pairing of observations, which helps control for individual variability. It is commonly applied in pre-test/post-test designs, where researchers assess the impact of an intervention by measuring outcomes before and after the treatment. The paired t-test assumes that the differences between paired observations are normally distributed.
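A short sketch of the paired version in SciPy, with made-up pre/post scores from the same participants:

```python
# Paired t-test on pre/post measurements; positions pair up across the lists.
from scipy import stats

pre  = [14, 18, 11, 16, 15, 12]
post = [17, 21, 13, 20, 16, 15]

t_stat, p_value = stats.ttest_rel(pre, post)
print(t_stat, p_value)
```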

25
Describe the purpose of ANOVA in statistical analysis.
ANOVA, or Analysis of Variance, is a statistical technique used to test for significant differences among the means of three or more groups. It helps determine whether at least one group mean is different from the others, which is essential in experimental designs where multiple treatments or conditions are compared. ANOVA assesses the variance within groups and between groups, providing a comprehensive view of the data. If significant differences are found, post-hoc tests can be conducted to identify which specific groups differ from each other.
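A minimal one-way ANOVA sketch with SciPy (the group scores are invented):

```python
# One-way ANOVA across three independent groups.
from scipy import stats

placebo   = [50, 54, 48, 52, 51]
low_dose  = [55, 58, 54, 57, 56]
high_dose = [60, 62, 59, 61, 63]

f_stat, p_value = stats.f_oneway(placebo, low_dose, high_dose)
print(f_stat, p_value)   # a small p suggests at least one group mean differs
```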
26
Explain what a repeated measures ANOVA is.
A repeated measures ANOVA is a statistical method used to analyze data from within-subjects designs, where the same subjects are measured multiple times under different conditions or over time. This approach accounts for the correlation between repeated observations, enhancing the sensitivity of the analysis. It is particularly useful in longitudinal studies or experiments where participants undergo multiple treatments. By examining the variance in scores across different conditions, repeated measures ANOVA helps researchers understand how factors influence outcomes over time.
27
Describe the purpose of post-hoc tests in statistical analysis.
Post-hoc tests are utilized after conducting an ANOVA (Analysis of Variance) to identify which specific group means are significantly different from each other. When an ANOVA indicates that there are differences among group means, post-hoc tests help to pinpoint the exact pairs of groups that differ, thus providing a clearer understanding of the data and the relationships between the groups.
28
Explain the function of a chi-square test in research.
A chi-square test is employed to assess the relationships between categorical variables. It evaluates whether the observed frequencies in a contingency table differ significantly from the expected frequencies under the assumption of independence. This test is particularly useful in determining if there is an association between two categorical variables, such as gender and preference, allowing researchers to draw conclusions about the data.
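A sketch of a chi-square test of independence on a made-up 2×2 contingency table (SciPy assumed):

```python
# Rows might be gender, columns a yes/no preference; the counts are invented.
import numpy as np
from scipy import stats

observed = np.array([[30, 10],
                     [20, 25]])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(chi2, p_value, dof)
print(expected)   # counts expected if the two variables were independent
```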
29
Define Pearson’s r and its significance in statistics.
Pearson’s r is a statistical measure that quantifies the strength and direction of a linear relationship between two continuous variables. It ranges from -1 to 1, where values close to 1 indicate a strong positive correlation, values close to -1 indicate a strong negative correlation, and values around 0 suggest no linear relationship. This measure is crucial for understanding how changes in one variable may predict changes in another.
30
How does Spearman’s rho differ from Pearson’s r?
Spearman’s rho is a non-parametric measure of correlation that assesses the strength and direction of the relationship between two variables when the data is ordinal or not normally distributed. Unlike Pearson’s r, which requires interval data and assumes a linear relationship, Spearman’s rho ranks the data and evaluates how well the relationship between the variables can be described using a monotonic function, making it more versatile for non-linear data.
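A small sketch contrasting the two coefficients on a made-up monotonic but non-linear relationship (SciPy assumed):

```python
# y grows as x cubed: monotonic, but not a straight line.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [1, 8, 27, 64, 125, 216]

r, _   = stats.pearsonr(x, y)    # strong but below 1: the relation is not linear
rho, _ = stats.spearmanr(x, y)   # exactly 1: the ranks agree perfectly

print(r, rho)
```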
31
Explain the meaning of r² in the context of regression analysis.
In regression analysis, r², or the coefficient of determination, represents the proportion of variance in the dependent variable that can be explained by the independent variable(s). It ranges from 0 to 1, where a value of 0 indicates that the independent variable does not explain any variance in the dependent variable, while a value of 1 indicates perfect explanation. This metric is essential for assessing the effectiveness of a regression model.
32
Describe the concept of simple linear regression.
Simple linear regression is a statistical method used to predict the value of a dependent variable (DV) based on the value of a single independent variable (IV). It establishes a linear relationship between the two variables by fitting a straight line to the data points, allowing for predictions and insights into how changes in the IV affect the DV. This technique is foundational in statistics for understanding relationships in data.
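A minimal sketch with SciPy's linregress (the hours and scores are invented):

```python
# Predicting exam scores from hours studied with one straight line.
from scipy import stats

hours  = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 73, 78, 84]

fit = stats.linregress(hours, scores)
print(fit.slope, fit.intercept)        # fitted line: score = slope * hours + intercept
print(fit.rvalue ** 2)                 # r-squared: variance in scores explained by hours
print(fit.intercept + fit.slope * 5)   # predicted score for 5 hours of study
```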
33
What is multiple regression and its application?
Multiple regression is a statistical technique that predicts the value of a dependent variable (DV) using two or more independent variables (IVs). This method allows researchers to assess the impact of multiple factors simultaneously, providing a more comprehensive understanding of the relationships within the data. It is widely used in various fields, including social sciences, economics, and health research, to model complex phenomena.
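A hedged sketch using statsmodels (assumed installed), predicting a made-up outcome from two made-up predictors:

```python
# Ordinary least squares with two independent variables plus an intercept.
import numpy as np
import statsmodels.api as sm

sleep_hours = np.array([6, 7, 5, 8, 6, 7, 9, 5])
study_hours = np.array([2, 3, 1, 4, 2, 5, 4, 1])
exam_score  = np.array([60, 68, 55, 75, 62, 80, 82, 52])

X = sm.add_constant(np.column_stack([sleep_hours, study_hours]))
model = sm.OLS(exam_score, X).fit()

print(model.params)     # intercept plus one coefficient per predictor
print(model.rsquared)   # proportion of variance in exam_score explained
```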
34
Explain the significance of beta weights (β) in regression analysis.
Beta weights (β) in regression analysis indicate the strength and direction of the relationship between each independent variable and the dependent variable, expressed in standardized units. A higher absolute value of β signifies a stronger influence of that predictor on the DV. These weights help researchers understand the relative importance of each IV in predicting the DV, guiding decision-making and interpretation of results.
35
Identify and describe three key assumptions of linear regression.
Three key assumptions of linear regression are linearity, normality, and homoscedasticity. Linearity assumes that the relationship between the independent and dependent variables is linear. Normality requires that the residuals (errors) of the model are normally distributed. Homoscedasticity means that the variance of the residuals is constant across all levels of the independent variable. Violating these assumptions can lead to inaccurate results.
36
Define internal consistency and its measurement.
Internal consistency refers to the degree to which test items measure the same underlying construct. It is commonly assessed using Cronbach’s alpha, a statistic that ranges from 0 to 1, where higher values indicate greater consistency among items. A high internal consistency suggests that the items are reliably measuring the same concept, which is crucial for the validity of psychological tests and surveys.
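A hand-rolled sketch of the usual formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total scores), on invented item responses (NumPy assumed):

```python
# Internal consistency of a 4-item scale; rows are respondents, columns are items.
import numpy as np

items = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
])

k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(alpha)   # values nearer 1 indicate higher internal consistency
```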
37
Describe the concept of test-retest reliability.
Test-retest reliability is a measure of the stability of test scores over time. It assesses whether a test yields consistent results when administered to the same group of individuals on two different occasions. High test-retest reliability indicates that the test produces stable and reliable scores, which is essential for ensuring that the measurement is not influenced by external factors or random error, thus enhancing the credibility of the findings.
38
Explain inter-rater reliability and its importance.
Inter-rater reliability refers to the degree of agreement or consistency between different observers or raters when assessing the same phenomenon. It is crucial in research where subjective judgments are involved, such as in behavioral assessments or qualitative studies. High inter-rater reliability ensures that the results are not dependent on who is conducting the assessment, thereby enhancing the validity and reliability of the data collected.
39
Define content validity and its relevance in testing.
Content validity refers to the extent to which a test measures all aspects of a given construct. It ensures that the test items adequately represent the domain of interest and are relevant to the construct being measured. Establishing content validity is essential for the credibility of psychological tests and assessments, as it helps to ensure that the test accurately reflects the theoretical concept it aims to measure, thus supporting valid interpretations of the results.
40
Describe construct validity and its significance in research.
Construct validity is the degree to which a test accurately reflects the theoretical concept it is intended to measure. It encompasses both convergent and discriminant validity, ensuring that the test correlates with measures of related constructs and does not correlate with unrelated ones. Establishing construct validity is vital for the credibility of research findings, as it confirms that the test is truly measuring the intended construct rather than extraneous factors.
41
Describe convergent validity.
Convergent validity refers to the degree to which two measures that are supposed to be measuring the same construct correlate with each other. It is an essential aspect of construct validity, indicating that a test is accurately capturing the intended concept. For example, if a new intelligence test correlates highly with established intelligence measures, it demonstrates convergent validity, suggesting that both tests are assessing similar cognitive abilities.
42
Explain discriminant validity.
Discriminant validity is a measure of how well a test distinguishes between different constructs. It is crucial for establishing construct validity, ensuring that a test does not correlate with unrelated variables. For instance, if a test designed to measure anxiety shows low correlation with a test measuring unrelated constructs like physical strength, it demonstrates good discriminant validity, confirming that the tests are assessing different psychological dimensions.
43
Define criterion validity.
Criterion validity assesses how well one measure predicts an outcome based on another measure, often referred to as the criterion. This type of validity can be concurrent, where both measures are taken at the same time, or predictive, where one measure forecasts future performance on another. For example, if a new depression scale accurately predicts patients' responses on a well-established clinical assessment, it exhibits strong criterion validity, indicating its effectiveness in real-world applications.
44
Describe exploratory factor analysis (EFA).
Exploratory factor analysis (EFA) is a statistical technique used to identify the underlying relationships between variables in a dataset. It helps researchers uncover latent constructs that may not be directly observable. By analyzing patterns of correlations among variables, EFA can reveal how many factors exist and which variables load onto each factor. This method is particularly useful in the early stages of research when the structure of the data is not well understood, guiding further analysis and hypothesis development.
45
Explain confirmatory factor analysis (CFA).
Confirmatory factor analysis (CFA) is a statistical method used to test whether a hypothesized factor structure fits the observed data. Unlike exploratory factor analysis, which seeks to discover potential structures, CFA is used to confirm or reject a predefined model based on theoretical expectations. Researchers specify the number of factors and the relationships between observed variables and factors, allowing for rigorous testing of the model's validity. CFA is essential in validating measurement instruments and ensuring they accurately reflect the constructs they are intended to measure.
46
Describe when nonparametric tests are used.
Nonparametric tests are employed when the assumptions required for parametric tests, such as normality and homogeneity of variance, are violated. These tests do not assume a specific distribution for the data, making them suitable for ordinal data or when sample sizes are small. They are particularly useful in situations where the data do not meet the stringent requirements of parametric tests, allowing researchers to analyze data without compromising the integrity of their findings. Examples include the Mann-Whitney U test and the Kruskal-Wallis test.
47
Define the Mann-Whitney U test.
The Mann-Whitney U test is a nonparametric statistical test that serves as an alternative to the independent t-test. It is used to compare differences between two independent groups when the data do not meet the assumptions of normality. The test ranks all observations from both groups and assesses whether the ranks differ significantly between the two groups. This method is particularly useful for ordinal data or when sample sizes are unequal, providing a robust way to evaluate group differences without relying on parametric assumptions.
48
Explain the Wilcoxon signed-rank test.
The Wilcoxon signed-rank test is a nonparametric alternative to the paired t-test, used to compare two related samples or matched observations. This test assesses whether the median difference between pairs of observations is significantly different from zero. It ranks the absolute differences between pairs, considering the direction of the differences, and evaluates the sum of the ranks. This method is particularly useful when the data do not meet the assumptions of normality, making it a reliable choice for analyzing paired data in various research contexts.
49
Describe the Kruskal-Wallis test.
The Kruskal-Wallis test is a nonparametric statistical method used to compare three or more independent groups. It serves as an alternative to one-way ANOVA when the assumptions of normality and homogeneity of variance are not met. The test ranks all observations across groups and evaluates whether the ranks differ significantly among the groups. By determining if at least one group has a different distribution, the Kruskal-Wallis test provides insights into group differences without relying on parametric assumptions, making it suitable for ordinal or non-normally distributed data.
50
Explain the Friedman test.
The Friedman test is a nonparametric statistical test used to detect differences in treatments across multiple test attempts. It serves as an alternative to repeated measures ANOVA when the assumptions of normality are violated. The test ranks the data for each subject across different conditions and assesses whether the ranks differ significantly. By evaluating the differences in ranks, the Friedman test helps researchers determine if at least one treatment condition leads to a different outcome, making it valuable for analyzing repeated measures data in various fields.
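A quick sketch of SciPy's counterparts to the four nonparametric tests described above; all of the score lists are invented for illustration:

```python
# Nonparametric analogues: Mann-Whitney U, Wilcoxon signed-rank,
# Kruskal-Wallis, and Friedman, in that order.
from scipy import stats

group_a, group_b = [3, 5, 4, 6, 2], [7, 8, 6, 9, 7]
pre, post        = [10, 12, 9, 14, 11], [13, 15, 10, 17, 12]
g1, g2, g3       = [1, 2, 2, 3], [4, 5, 4, 6], [7, 8, 9, 7]
c1, c2, c3       = [5, 6, 4, 7], [7, 8, 6, 9], [9, 10, 8, 11]

print(stats.mannwhitneyu(group_a, group_b))   # two independent groups
print(stats.wilcoxon(pre, post))              # two related samples
print(stats.kruskal(g1, g2, g3))              # three or more independent groups
print(stats.friedmanchisquare(c1, c2, c3))    # three or more related conditions
```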
51
Describe the APA format for reporting a t-test.
"When reporting the results of a t-test in APA format, the standard structure includes the t statistic, degrees of freedom (df), and the p-value. The format is typically presented as *t*(df) = value, *p* = .xxx. This concise reporting allows readers to quickly understand the statistical significance of the results. For example, a report might state, "*t*(28) = 2.45
52
Explain the importance of reporting effect sizes.
Reporting effect sizes is crucial in research as it provides a measure of the practical significance of findings beyond mere statistical significance indicated by p-values. Effect sizes quantify the magnitude of an effect, allowing researchers to understand the real-world implications of their results. For instance, a small p-value may indicate statistical significance, but without an effect size, it is difficult to gauge how meaningful that difference is in practical terms. Effect sizes facilitate comparisons across studies and contribute to a more comprehensive understanding of the data.
53
Describe what should be reported before inferential statistics.
Before conducting inferential statistics, it is essential to report descriptive statistics, which summarize the basic features of the data. This includes measures such as means, standard deviations, and ranges, providing a clear overview of the dataset's characteristics. Descriptive statistics help contextualize the inferential analyses by offering insights into the distribution and variability of the data. By presenting these statistics first, researchers lay a foundation for understanding the results of inferential tests, enhancing the clarity and interpretability of the findings.