Flashcards in Research Deck (82)
What are the benefits of using a one-way ANOVA?
An advantage of the one-way ANOVA (or any ANOVA) is that it helps control the "experimentwise error rate" (i.e., the probability of making a Type I error). If alpha is set at .05 for the study, for instance, the probability of making a Type I error is held at 5%. In contrast, if individual t-tests were conducted instead, each at the .05 level, the overall probability of making a Type I error would be much higher.
What will allow you to control the effects of an extraneous variable?
The analysis of covariance is useful for this purpose.
Will adding 12 points increase the mean and other measures of central tendency by 12 points? Will this have any effect on the standard deviation or other measures of variability?
Yes - the mean and other measures of central tendency each increase by 12 points. There is no effect on the SD or any other measure of variability.
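A quick pure-Python check of this card, using hypothetical test scores:

```python
import statistics

# Hypothetical test scores; add the 12-point constant to every score
scores = [70, 75, 80, 85, 90]
shifted = [s + 12 for s in scores]

# Central tendency shifts up by exactly 12 points...
mean_shift = statistics.mean(shifted) - statistics.mean(scores)

# ...but the deviations from the mean are unchanged, so the SD is too
same_spread = statistics.stdev(shifted) == statistics.stdev(scores)
```

Adding a constant moves every score (and the mean) by the same amount, so the deviations around the mean, and hence every measure of variability, stay the same.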
1. The _____ product moment correlation coefficient is the appropriate correlation coefficient when both variables are measured on a continuous scale.
2. The __________ is used to correlate two variables that are measured in terms of ranks.
3. The _______ is used to determine the correlation between two dichotomous variables.
4. The ___________ is appropriate when one variable is continuous and the other is a true dichotomy.
1. Pearson r
2. Spearman rho (rho = ranks), e.g., correlation between height and shoe size
3. Phi coefficient (e.g., living or dead)
4. Point biserial correlation coefficient (e.g., time and dead/alive)
Decreasing the level of significance from .05 to .01 makes it more difficult to reject the null hypothesis and, therefore, also _______ power.
Decreasing the susceptibility of the dependent variable measure to measurement (random) error would ________ power by ensuring that the measure is able to accurately detect the effects of the independent variable.
When appropriately used, a one-tailed test is ______ powerful than a two-tailed test since it places the entire rejection region in only one tail of the sampling distribution rather than splitting it up between the two tails.
What are some bivariate techniques? And why are they used?
What bivariate technique summarizes the degree of association between variables with a single number?
How do you determine the amount of shared variability or coefficient of determination?
When should a bivariate coefficient be squared?
When is a correlation NOT squared, and how should it be interpreted instead?
What are some correlation coefficients, and what types of variables does each require?
Which coefficient is used when the relationship between variables is non-linear?
Bivariate techniques describe or summarize the degree of association between two variables; they include the correlation coefficients listed below.
The correlation coefficient is squared to determine the amount of shared variability (the coefficient of determination): in other words, the squared correlation coefficient indicates the proportion of variability in Y that is explained or accounted for by the variability in X.
A bivariate correlation coefficient should be squared to obtain a measure of shared variability only when it indicates the degree of association between two different variables.
A correlation coefficient is not squared when it is a reliability coefficient, which is a correlation of a measure with itself.
-A reliability coefficient is interpreted directly as a measure of "true score variability."
-Pearson r (interval or ratio; interval or ratio)
-Spearman rho (rank-order; rank-order)
-Contingency (nominal; nominal)
-Point biserial (true dichotomy; interval or ratio)
-Biserial (artificial dichotomy; interval or ratio)
A factorial research design is any design that includes ________ "factors" (independent variables)?
two or more
Pearson r and other correlation coefficients range in value from ____. The magnitude of the coefficient indicates the relationship's strength; the sign (+ or -) indicates the relationship's __________. With a _____ correlation, the value of Y increases as the value of X increases. With a _______ correlation, the value of Y decreases as the value of X __________.
-1.0 to + 1.0
What are the three assumptions that must be met in order to produce an accurate correlation?
-Linearity: the relationship between X and Y can be described by a straight line
-Unrestricted range: data are collected from people who are heterogeneous with regard to the characteristics measured by X and Y
-Homoscedasticity: the variability of Y scores is about the same at all values of X
________ is used to predict Y from a single X, and an assumption is that the relationship between X and Y is ______ (i.e., the relationship can be described by a straight line, the regression line or "line of best fit"). What technique is used to locate the regression line in a scatterplot?
Least squares criterion, which locates the line so that the amount of error in prediction is minimized.
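A minimal pure-Python sketch of the least-squares criterion; the x/y values below are hypothetical:

```python
# Hypothetical paired observations
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# The least-squares slope and intercept minimize the sum of
# squared vertical distances between the points and the line
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Predicted Y for any X falls on the regression line
def predict(x):
    return slope * x + intercept
```

No other line through the scatterplot produces a smaller sum of squared prediction errors than this one.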
Statistical power refers to the ability to reject a false null hypothesis and is affected by the size of the ______.
In other words, when a statistical test enables an investigator to reject a false null hypothesis, it is said to have statistical power.
What are the methods to increase statistical power?
-Decreasing the susceptibility of the dependent variable measure to measurement (random) error would increase power by ensuring that the measure is able to accurately detect the effects of the independent variable. (minimize error)
-Increase alpha (e.g., set alpha at .05 rather than .01).
-When appropriately used, a one-tailed test is more powerful than a two-tailed test since it places the entire rejection region in only one tail of the sampling distribution rather than splitting it up between the two tails.
-Maximizing differences between treatment groups does help increase power.
-use a parametric test
-increase sample size
The value of the Pearson product-moment correlation coefficient ranges from -1.0 to +1.0, and the corresponding proportion of variance (which is referred to as the coefficient of determination) is computed by ______ the correlation coefficient.
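A pure-Python sketch of computing Pearson r and squaring it to get the coefficient of determination; the scores are hypothetical:

```python
import math

# Hypothetical paired scores on two continuous variables
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [1.0, 3.0, 5.0, 9.0, 12.0]

mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)

# Pearson r always falls between -1.0 and +1.0
r = sxy / math.sqrt(sxx * syy)

# Coefficient of determination: proportion of Y's variability explained by X
r_squared = r ** 2
```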
In addition to the Pearson r, what are some other coefficients?
What determines statistical significance?
-Spearman rho (rank order / rank order)
-Phi (true dichotomy / true dichotomy)
-Tetrachoric (artificial dichotomy / artificial dichotomy)
-Point biserial (true dichotomy / interval or ratio)
-Biserial (artificial dichotomy / interval or ratio)
-Eta (interval or ratio / interval or ratio; used for nonlinear relationships)
-Compare the coefficient to the appropriate critical value, which is determined by alpha and sample size.
-With a small sample, a larger coefficient is needed to reach significance.
How do you interpret a Correlation Coefficient?
-Directly, in terms of degree of association (from -1.0 to +1.0). Correlations indicate association, not causation.
-When a correlation represents the degree of association between two different variables, it can be squared to represent a coefficient of determination which provides a measure of shared variability.
What does a regression (Regression Analysis, Multiple Regression) do?
Name some types of multiple regressions.
When is an analysis of variance used? What is the advantage of using an ANOVA?
When is an ANOVA better than a t-test?
Name some other ANOVAs
What analysis of variance is used for a within-subjects design, when different levels of the IV (or combinations of the levels of two or more IVs) are sequentially administered (over time) to each subject?
How is an F-ratio calculated using a one-way ANOVA?
Multiple regression is a multivariate technique used when two or more continuous or discrete predictors will be used to predict status on a single continuous criterion.
ANOVA: used to compare two or more means.
Advantage: it compares group means while holding the probability of making a Type I error at the level of significance set by the investigator -- it helps to control the experimentwise error rate.
The t-test and ANOVA are comparable when comparing two means, but the ANOVA is the statistic of choice when three or more means are compared. Further, the ANOVA is more complex and analyzes the variability around the means.
-Factorial ANOVA: two or more IVs
-Randomized block ANOVA: treats an extraneous variable as an IV
-Analysis of covariance (ANCOVA): combines analysis of variance and regression analysis; statistically removes the effect of an extraneous variable (the covariate) from the DV
-Repeated measures ANOVA: within-subjects design; each level (or combination of levels) of the IV is administered to each subject
-Mixed (split-plot) ANOVA: mixed design; at least one IV is between-groups and at least one is within-subjects
-Trend analysis: evaluates the shape or form of the relationship (statistically significant linear or nonlinear trend)
-Multivariate analysis of variance (MANOVA): 1+ IVs and 2+ DVs
-Repeated measures ANOVA
When using a one-way ANOVA to determine whether an independent variable has had a significant effect on a dependent variable, an F-ratio is calculated by dividing the mean square between (MSB) by the mean square within (MSW).
Divide MSB by MSW (manage social behavior by a Masters of Social work)
--MSB provides an estimate of variability due to treatment plus error, while MSW provides an estimate of variability due to error only. When the independent variable has had an effect, MSB will be larger than MSW and the F-ratio will be larger than 1.0; the larger the F-ratio, the more likely that the effect of the independent variable is statistically significant.
( the bigger the behavior MSB the larger the F-ratio and the more statistically significant the "issue")
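The MSB/MSW arithmetic can be sketched in pure Python; the three treatment groups below are hypothetical:

```python
# Hypothetical scores for three treatment groups
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

all_scores = [s for g in groups for s in g]
grand_mean = sum(all_scores) / len(all_scores)
k = len(groups)                 # number of groups
n = len(all_scores)             # total number of subjects

def mean(g):
    return sum(g) / len(g)

# MSB estimates variability due to treatment plus error
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# MSW estimates variability due to error only
ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
ms_within = ss_within / (n - k)

# F well above 1.0 suggests the independent variable had an effect
f_ratio = ms_between / ms_within
```

Here the group means differ sharply while within-group spread is small, so MSB dwarfs MSW and the F-ratio is large.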
Cohen's d and eta squared are commonly used for what?
Which is used to measure the difference between two groups (experimental and control) in terms of SDs?
And which indicates the percent of variance in the outcome variable that is accounted for by variance in the treatment?
Both are measures of effect size. Cohen's d measures the difference between two group means in SD units; eta squared indicates the percent of variance in the outcome variable accounted for by variance in the treatment.
How should significant main and interaction effects be interpreted when using a factorial ANOVA to assess the effects of two independent variables on a dependent variable?
What type of design is used when the effects of different levels of an IV are assessed by administering each level to a different group of subjects and comparing the status or performance of the groups on the DV?
What design administers all levels of the IV sequentially to all subjects?
Which design combines the between-groups and within-subjects approaches by including at least one between-groups IV and at least one within-subjects IV, often measuring the DV across trials/time, where trial/time is an additional, within-subjects IV?
-Between-groups design
-Within-subjects (repeated measures) design
-Mixed (split-plot) design
What is main and interaction effect?
Main effects refer to the effects of one independent variable on the dependent variable when considered alone, while interaction effects refer to the effects of one independent variable at different levels of another independent variable.
When the interaction is significant, the effects of one independent variable differ at different levels of another independent variable. Thus, it is not possible to conclude that an independent variable has consistent main effects, and the main effects must be interpreted with caution.
Main effect: the effect of one IV on the DV, disregarding the effects of all other IVs
Interaction: the effects of two or more IVs considered together; occurs when the effects of one IV differ at different levels of another IV
Measures of Central Tendency in Skewed Distribution
Positive skew: Mode < Median < Mean (the mean is the highest)
Negative skew: Mean < Median < Mode (the mode is greater than the median, which is greater than the mean)
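A small pure-Python illustration of the ordering in a positively skewed distribution, using hypothetical scores:

```python
import statistics

# Hypothetical positively skewed scores: the high outlier pulls the mean up
scores = [1, 2, 2, 2, 3, 3, 4, 10]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score
mean = statistics.mean(scores)      # pulled toward the long right tail

# With a positive skew: mode < median < mean
```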
Name the measures of variability or spread?
How is the variance calculated?
How is the Standard Deviation Calculated?
Variance: the average squared deviation from the mean (the SD squared)
Standard deviation: the square root of the variance
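In pure Python, with a hypothetical score list:

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]
m = statistics.mean(scores)

# Variance: the average squared deviation from the mean (population formula)
variance = sum((s - m) ** 2 for s in scores) / len(scores)

# Standard deviation: the square root of the variance
sd = variance ** 0.5
```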
What design describes a study in which each participant receives only one level of Variable A but all levels of Variable B?
Split-plot (mixed) design
What do multivariate techniques do?
These techniques are categorized as dependence methods and interdependence methods. What is the difference?
What techniques predict status on a variable?
What multivariate techniques test a causal model or theory?
What multivariate techniques are used for the purpose of data reduction, and why is this important?
They investigate the relationships among three or more variables. Multivariate data analysis is a set of statistical models that examine patterns in multidimensional data by considering several variables at once. It is an expansion of bivariate data analysis, which considers only two variables in its models. Because multivariate models consider more variables, they can examine more complex phenomena and find data patterns that more accurately represent the real world.
-Dependence methods have distinct independent and dependent variables (predictor and criterion); they relate to cause-effect situations and try to see whether one set of variables can describe or predict the values of another.
-Interdependence methods do not distinguish independent from dependent variables; they include several data-reduction techniques and aim to understand the underlying structural intercorrelations and patterns of the data.
-Prediction: multiple regression (2+ discrete or continuous IVs (predictors) and 1 continuous DV (criterion))
-Canonical correlation (2+ IVs and 2+ DVs (criteria))
-Discriminant Function Analysis (2+ continuous predictors and 1 Nominal criterion) A discriminant analysis (also known as discriminant function analysis) involves using scores on two or more predictors to predict an individual's membership in a criterion group - i.e., it is used when the criterion is measured on a nominal scale.
-Logistic regression: an extension of discriminant analysis, but it assumes the relationships are curvilinear
Test a theory: Causal modeling
Consider as an example the regression model -- a method to analyze correlations in data. The non-multivariate case of regression is the analysis between two variables, and it is called a __________ regression. It could be used, for instance, to see how the height of a swimmer correlates with their speed. By doing this type of regression, the analyst could find that taller swimmers tend to swim faster. Although this is correct, we know that height is not the only thing influencing speed, so the bivariate model hardly explains the complete phenomenon of swimming.
In contrast, a _________ regression -- also called multiple regression -- could take into account many more variables: weight, age, carbohydrate intake, protein intake, number of training hours, number of resting hours, and many others. In theory, the more variables included, the more accurately the regression can represent the phenomenon of swimming, to the point where it could pinpoint the speed of a new swimmer with little error.
________ data variables are always numeric and represent information that can be measured on some scale. Examples include age (20 years), temperature (25 ºC), and profit (US$ 2000). The number specifies the magnitude of the value on a given scale.
_______ variables categorize the data but do not specify its magnitude. Examples include an operating system (Windows, Linux, macOS) and house size (small, medium, large). The list of options that a non-metric variable can assume is called its levels or categories. Even when the levels have an inherent order (e.g., a large house is bigger than a small house), it is still a non-metric variable because there is no magnitude attached (the variable doesn't say how much bigger the house is). Note that non-metric variables can also be numeric when they are not attached to any scale, such as a variable holding the ID numbers of objects.
Most multivariate techniques perform computations that need numbers as inputs, so how can a technique work with non-metric data?
The answer is that a non-metric variable can be converted into dichotomous metric variables. In this conversion, each level becomes a new metric variable that can only take the values 0 (false) or 1 (true). For instance, consider a non-metric variable that classifies the color of a product with the levels black, white, and gray. The variable can be replaced with two new ones: isColorBlack and isColorWhite. If a product is black, they take the values 1 and 0 respectively; if it is white, the values 0 and 1. There is no need for a variable for gray products because they can take the values 0 and 0: if a product is neither black nor white, it can only be gray.
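The color example can be sketched directly; the function name below is illustrative, and the variable names come from the example above:

```python
# Convert the non-metric Color variable (black / white / gray) into
# two dichotomous 0/1 metric variables; gray is the baseline (0, 0)
def dummy_code(color):
    is_color_black = 1 if color == "black" else 0
    is_color_white = 1 if color == "white" else 0
    return is_color_black, is_color_white

# black -> (1, 0), white -> (0, 1), gray -> (0, 0)
codes = {c: dummy_code(c) for c in ("black", "white", "gray")}
```

With three levels, two indicator variables are enough: the omitted level is identified by both indicators being 0.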
In the dependence techniques of multivariate analysis, the analyst feeds a model with input data, specifying which variables are independent and which are dependent. The ___________ variables are the ones the model will try to predict or explain (e.g., swimmer speed). The _______ variables (e.g., swimmer height) are the ones the analyst studies to see how much they affect the dependent ones.
The goal of all dependence techniques is to establish a cause-effect relationship. The most notable differences between them are the number of independent variables they support and the nature of the variables involved.
What multivariate technique is used to predict the sales performance of different stores based on their attributes (e.g., number of vendors, number of hours open)? Such an analysis would lead to a deeper understanding of what makes each store sell more, which could drive administrative changes in the most important attributes toward values that give higher profit.
Multiple regression is an option when the analyst stipulates only one dependent variable, which is metric. The result of applying a multiple regression is the degree of impact that each independent variable has on the dependent one. That result also leads to an estimation function, which accepts values for the independent variables and returns the expected value of the dependent one.
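A minimal pure-Python sketch of multiple regression via the normal equations, using hypothetical store data (vendors and hours open as predictors of sales; the data are constructed so that sales = 5 + 2*vendors + 1*hours):

```python
# Hypothetical store data: (number of vendors, hours open) -> sales
predictors = [(3, 8), (5, 10), (4, 12), (6, 9)]
sales = [19.0, 25.0, 25.0, 26.0]

# Design matrix with an intercept column; solve (X^T X) b = X^T y
X = [[1.0, v, h] for v, h in predictors]
A = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
b = [sum(r[i] * y for r, y in zip(X, sales)) for i in range(3)]

# Gauss-Jordan elimination on the 3x3 normal equations
for i in range(3):
    pivot = A[i][i]
    A[i] = [v / pivot for v in A[i]]
    b[i] /= pivot
    for j in range(3):
        if j != i:
            factor = A[j][i]
            A[j] = [a - factor * c for a, c in zip(A[j], A[i])]
            b[j] -= factor * b[i]

intercept, b_vendors, b_hours = b

# The estimation function: expected sales for new predictor values
def predict(vendors, hours):
    return intercept + b_vendors * vendors + b_hours * hours
```

The fitted coefficients show each predictor's impact on the criterion, and `predict` is the resulting estimation function.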
The classic example is classification. After processing the data, the model can classify future entries that don't have labels. For instance, a model could analyze characteristics of music fragments (the independent variables), where each piece is assigned to a musical genre (the dependent variable). If the analyst builds a successful model, it can classify the genre of fragments it has never seen before. What type of multivariate technique is being used?
(Dependent variable: one non-metric variable; independent variables: metric)
-Multiple discriminant analysis is very similar to machine learning classifiers. It is an option when there is only one dependent variable, which is non-metric -- also called the "class" or "label". The goal is to understand the characteristics of the data that pertain to each class.
-- A discriminant analysis (also known as discriminant function analysis) involves using scores on two or more predictors to predict an individual's membership in a criterion group - i.e., it is used when the criterion is measured on a nominal scale.
A team of aerodynamics engineers is designing a new aircraft and wants to measure whether several combinations of engines and wings affect the magnitude of the forces on airplanes (e.g., thrust, drag, lift, weight).
In a simulation environment, the engineers choose three types of engines (E1, E2, E3) and three types of wings (W1, W2, W3) -- both the engine type and the wing type are independent variables. They develop airplanes for all of the engine-wing combinations and launch them in many virtual spaces to collect as much force data as possible (the dependent variables). What type of multivariate technique would work best in this study? How can the researchers fine-tune these results?
MANOVA: requires one or more independent variables and two or more dependent measures
The application of MANOVA to the collected data could reveal that the combination E1-W2 is significantly worse, while E3-W1 is significantly better. The engineers can see how each engine, each wing, and each combination impacts each of the forces. It is not an easy technique to conduct or interpret, but it is a rewarding and powerful one.
-The multivariate analysis of covariance (MANCOVA) can fine-tune the results and reinforce the study's validity by removing the effects of possible unobserved variables (for example, whether it was raining in the simulations). Thus, even if these factors affect the dependent variables, MANCOVA reduces their impact to isolate the effect of the treatments as much as possible.
What is the difference between internal validity, face validity, construct validity, and external validity?
Internal validity focuses on the causal relationship between independent and dependent variables.
Face validity focuses on whether a test looks like it measures what it is intended to measure.
Construct validity is established when a test measures the intended hypothetical trait.
External validity focuses on the generalizability of one study to other conditions, individuals, etc.
An investigator uses a factorial ANOVA to assess the effects of two independent variables on a dependent variable and obtains significant main and interaction effects. When interpreting the results of her study, should the investigator interpret the main effects with caution because the interaction is significant, or interpret the interaction with caution because the main effects are significant?
Interpret the main effects with caution, since the interaction is significant.
When the interaction is significant, this means that the effects of one independent variable differ for different levels of another independent variable. Thus, it is not possible to conclude that the independent variable has consistent main effects. For example, a study might find that, overall, Teaching Method #1 is superior to Teaching Method #2 (i.e., there is a main effect of teaching method). However, there might also be an interaction between teaching method and level of self-esteem - for example, Teaching Method #1 might be more effective for students with high and moderate self-esteem, while Teaching Method #2 is more effective for students with low self-esteem. In this situation, the main effect of teaching method would have to be interpreted with caution.