stats 10 Flashcards
Binary variables are usually coded as
0 or 1
Dummy variable trap
perfect multicollinearity that results from the inclusion of dummy variables representing each possible value of a categorical variable
Perfect multicollinearity
when there is an exact linear relationship between any two or more of a regression model’s independent variables.
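The two cards above can be checked numerically. The following is a minimal sketch, assuming a made-up three-level variable called region (the name and values are illustrative only): with an intercept plus one dummy per level, the dummy columns sum to the intercept column, so the design matrix loses rank. Dropping one dummy restores full column rank.

```python
import numpy as np

# Hypothetical categorical variable with three levels (illustrative data only).
region = np.array(["north", "south", "west", "south", "north", "west"])
levels = np.unique(region)  # sorted unique levels

# One 0/1 dummy column per level.
dummies = (region[:, None] == levels[None, :]).astype(float)
intercept = np.ones((len(region), 1))

# Trap: intercept plus ALL dummies. The dummies sum to the intercept column,
# which is an exact linear relationship among the columns.
X_trap = np.hstack([intercept, dummies])

# Fix: omit one dummy; the omitted level becomes the reference category.
X_ok = np.hstack([intercept, dummies[:, 1:]])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
print(X_trap.shape[1], rank_trap)  # 4 columns but rank 3
print(X_ok.shape[1], rank_ok)      # 3 columns and rank 3
```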
The coefficient for the binary X variable indicates the difference in the Y variable between the respective category and the
reference category (the one omitted in the dummy coding)
This coefficient provides insights into
how the response varies across different groups
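As a hedged illustration of the coefficient cards above: in a regression with an intercept and a single 0/1 variable, the fitted coefficient equals the difference between the two group means, and the intercept equals the mean of the reference category. The data below are simulated, and the values 10 and 3 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: group 0 is the reference category, group 1 the other category.
group = np.repeat([0.0, 1.0], 50)
y = 10.0 + 3.0 * group + rng.normal(0.0, 1.0, size=group.size)

# Ordinary least squares with an intercept and the binary variable.
X = np.column_stack([np.ones_like(group), group])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Difference in the mean of y between the two categories.
mean_diff = y[group == 1].mean() - y[group == 0].mean()
print(beta[1], mean_diff)  # the two numbers agree up to floating-point error
```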
Reference category
in a regression model, the value of a categorical independent variable for which we do not include a dummy variable
Categorical independent variables can be used in
interactions
Interactive models
multiple regression models that contain at least one independent variable that researchers create by multiplying together two or more independent variables
Use an interaction model in multiple regression if
you suspect that the effect of one independent variable on the dependent variable varies depending on the level of another independent variable
A significant interaction effect between income and voter status indicates that the increase in donations with income is greater for voters than for nonvoters, suggesting that voter status _____ the relationship between income and donations
moderates
Moderation
the alteration of the relationship between two variables by a third variable, indicating that the effect of one variable on an outcome changes depending on the level or category of the modifying variable
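The income and voter-status card can be sketched with simulated data. This is an illustration under stated assumptions, not a real dataset: the income range, the slopes 0.2 and 0.3, and the variable names are all made up. The interaction term is the product of the two independent variables, and its coefficient is the difference between the income slope for voters and for nonvoters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

income = rng.uniform(20.0, 100.0, size=n)  # hypothetical income values
voter = np.tile([0.0, 1.0], n // 2)        # 0 = nonvoter (reference), 1 = voter

# Simulated donations: income slope is 0.2 for nonvoters and 0.2 + 0.3 for voters.
noise = rng.normal(0.0, 1.0, size=n)
donations = 5.0 + 0.2 * income + 1.0 * voter + 0.3 * income * voter + noise

# Design matrix: intercept, income, voter, and the income x voter interaction.
X = np.column_stack([np.ones(n), income, voter, income * voter])
b0, b_income, b_voter, b_inter = np.linalg.lstsq(X, donations, rcond=None)[0]

slope_nonvoters = b_income            # effect of income when voter = 0
slope_voters = b_income + b_inter     # effect of income when voter = 1
print(round(slope_nonvoters, 2), round(slope_voters, 2))
```

A positive interaction coefficient here means the income slope is steeper for voters, which is what "voter status moderates the relationship" describes.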
Interaction effects can be modeled between _____ of two categorical variables, two numeric variables, or one of each.
any combination
Interaction Between a Categorical and a Numeric Variable
The effect of a numeric variable on the dependent variable is modified by a categorical variable
Interaction Between Two Categorical Variables
In this case, the interaction term assesses how the effect of one categorical variable on the dependent variable changes based on the levels of another categorical variable
Interaction Between Two Numeric Variables
Here, the interaction term assesses how the relationship between one numeric variable and the dependent variable changes at different levels of another numeric variable
When an exposure and an outcome independently cause a third variable, that variable is termed a _____
collider
Inappropriately controlling for a collider variable, by study design or statistical analysis
results in collider bias
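Collider bias can be demonstrated with a small simulation. The sketch below assumes standard normal variables and a collider equal to exposure plus outcome plus noise; these choices are illustrative only. The exposure and outcome are independent by construction, yet adding the collider to the regression produces a clearly negative exposure coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

exposure = rng.normal(size=n)
outcome = rng.normal(size=n)                         # independent of exposure
collider = exposure + outcome + rng.normal(size=n)   # caused by both

def slopes(y, *cols):
    """OLS slopes (intercept dropped) of y on the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

# Without the collider: exposure coefficient is close to zero.
b_unadjusted = slopes(outcome, exposure)[0]

# Controlling for the collider: a spurious negative association appears.
b_adjusted = slopes(outcome, exposure, collider)[0]
print(round(b_unadjusted, 3), round(b_adjusted, 3))
```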
Influential case
a case in a regression model which has either a combination of large leverage and a large squared residual or a large DFBETA score
A case can be influential if it has large
leverage.
Leverage
in a regression model, the degree to which an individual case is unusual in terms of its value for a single independent variable, or its particular combination of values for two or more independent variables
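Leverage is commonly computed as the diagonal of the hat matrix H = X (X'X)^-1 X'. The sketch below uses made-up x values in which the last case is far from the others. Leverage depends only on the independent variables, not on the dependent variable.

```python
import numpy as np

# Hypothetical values of one independent variable; the last case is unusual.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
X = np.column_stack([np.ones_like(x), x])  # intercept + x

# Hat matrix; its diagonal holds the leverage of each case.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

highest = int(np.argmax(leverage))
print(leverage.round(3), highest)
# The leverages sum to the number of columns in X (here 2).
```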
A case can be influential if it has a large
squared residual value.
A large residual value indicates that the
observed data point deviates markedly from the predicted outcome
A case can be influential if it has both
large leverage and a large squared residual value
DFBETA is a diagnostic measure used in regression analysis to
assess the influence of individual data points on the estimated coefficients of the model. It quantifies the change in each regression coefficient when a specific observation is removed from the dataset.
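The definition above can be sketched as a leave-one-out loop. This computes the raw (unscaled) change in each coefficient; statistical packages often report a version scaled by a standard error, which is not shown here. The data are simulated, and the last case is deliberately given both large leverage and a large residual.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)

# Ten ordinary cases near the line y = 2 + 0.5x, plus one unusual case at x = 20.
x = np.append(np.arange(1.0, 11.0), 20.0)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.2, size=x.size)
y[-1] = 2.0  # far below the line, so large leverage AND large residual

X = np.column_stack([np.ones_like(x), x])
beta_full = ols(X, y)

# Raw DFBETA: full-sample coefficients minus coefficients with case i removed.
dfbeta = np.empty((x.size, 2))
for i in range(x.size):
    keep = np.arange(x.size) != i
    dfbeta[i] = beta_full - ols(X[keep], y[keep])

# Case whose removal changes the slope the most.
most_influential = int(np.argmax(np.abs(dfbeta[:, 1])))
print(most_influential, dfbeta[most_influential].round(3))
```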