Session 4 Flashcards
(9 cards)
What are the two approaches that you can use to estimate the payoff from pursuing an MBA?
- Cross-Sectional Analysis (one data point per person at a single point in time; control for other variables to narrow down the effect of an MBA)
- Panel Data Analysis (the same person observed repeatedly, one data point for each period in time)
What are the two purposes of building analytical models? How will your focus on interpreting the results change based on your purpose?
To discover relationships - Focus on R-squared values and P-values to determine how much variance is explained and whether each relationship is statistically significant.
Prediction and Forecasting - Focus on the predicted values of the dependent variable and compare them to historical data to project results into the future.
Understand how to interpret the results of cross-sectional analysis shown on Page 10 of the handout.
Coefficient - estimated effect on Y of a one-unit change in X, holding the other variables constant
Standard Error - a measure of the uncertainty in the coefficient estimate
t value - ratio of the coefficient estimate to its standard error
P value - probability of obtaining an estimate at least this far from zero if the true coefficient were zero
R-squared - portion of the variance in Y (the dependent variable) explained by the model
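A minimal sketch of where these numbers come from, using a one-variable regression computed by hand. The data points are hypothetical (they are not from the handout), and the P value is left out because it requires a t-distribution lookup that plain Python does not provide.

```python
import math

# Hypothetical toy data: years of experience (x) and wage in $1000s (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [52.0, 55.0, 61.0, 64.0, 68.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Coefficient (slope) and constant of the least-squares line y = a + b*x.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx
a = y_bar - b * x_bar

# Residual and total sums of squares.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((yi - y_bar) ** 2 for yi in y)

# Standard error of the slope, its t value, and R-squared.
se_b = math.sqrt(ss_res / (n - 2) / sxx)
t_b = b / se_b
r_squared = 1 - ss_res / ss_tot

print(f"coef={b:.3f} se={se_b:.3f} t={t_b:.2f} R2={r_squared:.3f}")
```

A large t value (coefficient many standard errors away from zero) goes with a small P value, which is why the two are read together.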
How do you interpret indicator (dummy) variables like the MBA variable on Page 10?
If MBA = 0, the term drops out and adds nothing to the predicted wage. If MBA = 1, the coefficient displayed is the amount added to the predicted wage (on top of the constant) for having an MBA, with all other variables held equal.
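A small sketch of the dummy-variable arithmetic. The constant and coefficient are hypothetical values chosen for illustration, not the numbers on Page 10.

```python
# Hypothetical regression output (wage in $1000s); not from the handout.
CONSTANT = 50.0   # predicted wage when MBA = 0
MBA_COEF = 12.0   # coefficient on the MBA indicator

def wage_with_dummy(mba):
    """Predicted wage for mba = 0 (no MBA) or mba = 1 (has MBA)."""
    return CONSTANT + MBA_COEF * mba

no_mba = wage_with_dummy(0)    # the MBA term contributes nothing
has_mba = wage_with_dummy(1)   # the coefficient is added once

print(no_mba, has_mba, has_mba - no_mba)
```

The gap between the two predictions is exactly the coefficient, which is why the coefficient is read as the wage difference between the two groups.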
How do you interpret “interaction terms”? See the example on Page 11.
Interaction terms are needed when the effect of one variable depends on the level of another, so the two have a combined impact beyond their separate effects. Check the P value of the interaction term for significance.
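To see what an interaction term does, here is a sketch with an MBA indicator interacted with years of experience. The choice of variables and all coefficients are assumptions for illustration; the Page 11 example may use different ones.

```python
# Hypothetical coefficients (wage in $1000s); not from the handout.
CONSTANT = 40.0
B_MBA = 5.0           # effect of an MBA at zero years of experience
B_EXPER = 2.0         # effect of one more year of experience without an MBA
B_MBA_X_EXPER = 1.5   # interaction: extra return per year when MBA = 1

def wage_with_interaction(mba, exper):
    """Predicted wage from a model that includes the MBA * experience term."""
    return (CONSTANT
            + B_MBA * mba
            + B_EXPER * exper
            + B_MBA_X_EXPER * mba * exper)

def mba_effect(exper):
    """Wage difference between MBA = 1 and MBA = 0 at a given experience level."""
    return wage_with_interaction(1, exper) - wage_with_interaction(0, exper)

print(mba_effect(0), mba_effect(10))
```

With an interaction term the MBA effect is no longer a single number: it equals B_MBA + B_MBA_X_EXPER * exper, so it changes with experience.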
How do you analyze ordinal independent variables? See examples on Page 12 and 13.
Ordinal independent variables are entered as categorical variables, like male/female or FTMBA/PTMBA. The numeric code assigned to a category has no quantitative meaning; it functions only as a category label.
It might be more useful to define indicators like “AnyMBA” and “FTMBA”, so that the FTMBA coefficient measures the extra effect of a full-time MBA over any MBA.
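A sketch of the AnyMBA / FTMBA coding idea. The program labels, the helper names, and the coefficients are all hypothetical.

```python
def encode_program(program):
    """Map a program label to the pair of indicators (AnyMBA, FTMBA)."""
    any_mba = 1 if program in ("FTMBA", "PTMBA") else 0
    ft_mba = 1 if program == "FTMBA" else 0
    return any_mba, ft_mba

# Hypothetical coefficients (wage in $1000s); not from the handout.
CONSTANT = 50.0
B_ANY_MBA = 8.0   # effect of holding any MBA (this is the part-time effect)
B_FT_MBA = 4.0    # additional effect of the MBA being full-time

def wage_by_program(program):
    any_mba, ft_mba = encode_program(program)
    return CONSTANT + B_ANY_MBA * any_mba + B_FT_MBA * ft_mba

for label in ("none", "PTMBA", "FTMBA"):
    print(label, encode_program(label), wage_by_program(label))
```

Under this coding the P value on FTMBA directly tests whether full-time and part-time MBAs differ, instead of comparing each one to having no MBA.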
What are the advantages of panel data methods we discussed in class? Understand how to interpret the results shown on Page 15.
Using panel data is a good way to control for factors that cannot be measured, like motivation, personality, etc., as long as they do not change over time.
You are really trying to isolate the effect of a specific variable.
All non-time-varying variables are absorbed into the constant for each person.
A con of this method is that the effects of variables that do not change over time (e.g. Male vs. Female) cannot be estimated separately.
Know what “within” and “between” mean.
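In the usual panel-data vocabulary, "within" variation is how one person changes over time and "between" variation is how people's averages differ from each other. The sketch below shows a within (fixed effects) estimate on a hypothetical two-person panel; the data and variable names are made up for illustration.

```python
# Hypothetical panel: person -> list of (mba, wage in $1000s), one pair per period.
panel = {
    "A": [(0, 50.0), (0, 50.0), (1, 60.0), (1, 60.0)],
    "B": [(0, 80.0), (0, 80.0), (1, 90.0), (1, 90.0)],
}

def mean(values):
    return sum(values) / len(values)

# "Between": each person's average wage over time.
person_mean_wage = {p: mean([w for _, w in rows]) for p, rows in panel.items()}

# "Within": subtract each person's own means, which removes anything
# about the person that does not change over time.
dx, dy = [], []
for rows in panel.values():
    mba_bar = mean([m for m, _ in rows])
    wage_bar = mean([w for _, w in rows])
    for m, w in rows:
        dx.append(m - mba_bar)
        dy.append(w - wage_bar)

# Least-squares slope on the demeaned data (no constant is needed).
within_slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

print(person_mean_wage, within_slope)
```

Person B earns 30 more than person A in every period, but that gap never enters the within estimate; only each person's own before/after change does.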
What are the different types of models and when are they used?
OLS Regression - dependent variable is continuous and normally distributed (e.g. wages)
Logit Regression - dependent variable is 1 or 0 (e.g. election outcome)
Poisson Regression - dependent variable is a count or an integer (e.g. # of days absent)
Ordinal Regression - dependent variable is ordinal, i.e. it can take a finite set of ordered values (e.g. size of drink order)
Cox Proportional Hazard Model - dependent variable is the time to an event (e.g. patient survival)
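As one illustration of why the model type depends on the dependent variable, here is a sketch of how a logit model turns a linear score into a probability between 0 and 1. The intercept, slope, and the "polling lead" variable are hypothetical.

```python
import math

def logistic(z):
    """Map any real-valued score to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logit coefficients for an election outcome (win = 1, lose = 0).
INTERCEPT = -1.0
SLOPE = 0.5   # per point of polling lead (an assumed predictor)

def win_probability(polling_lead):
    return logistic(INTERCEPT + SLOPE * polling_lead)

for lead in (-5, 2, 10):
    print(lead, round(win_probability(lead), 3))
```

A straight OLS line could predict values below 0 or above 1 for a 1/0 outcome, which is the practical reason a logit is used there instead.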
What are the 5 key assumptions in the models we discussed?
- Sample is representative (selection bias can occur; models are never perfect)
- Variables are accurately measured
- No significant variables are omitted
- Equation form (e.g. linear) matches the underlying relationship
- Error (lack of fit) is random
Most of the sophisticated methods try to address these data limitations, BUT you can correct these problems by collecting better data.