Stata Flashcards
(18 cards)
how can you get detailed information about a variable?
write: codebook [variable]
in the codebook output, which word gives the increment between values (often 1)?
units (range gives the minimum and maximum values)
what should we look at in the codebook output?
-type: what kind of variable it is (categorical or numeric)
-unique values: number of unique, non-missing values
-missing .: number of missing values out of all observations (missing is shown as .)
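A minimal sketch of the codebook card, assuming Stata's built-in auto dataset (the dataset and the variable rep78 are used here only for illustration and are not part of the original cards):

```stata
* Load the example dataset that ships with Stata (assumption: any dataset works)
sysuse auto, clear

* Detailed description of one variable: type, range, units,
* unique values, and missing values
codebook rep78
```

In the output, check the type, unique values, and missing lines before doing anything else with the variable.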
how to make a box plot, a bar chart, a line chart, a histogram, and a density plot
write:
1. graph box [variable] (to add a condition, append: if…)
-> for interval variables
2. graph bar, over([variable])
-> for nominal, ordinal and interval variables
3. twoway line [y variable] [x variable]
-> ONLY for numeric, continuous variables
4. histogram [variable]
-> for interval variables
5. kdensity [variable]
-> for interval variables
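The five chart types can be tried out as follows. This is only a sketch, assuming the built-in datasets uslifeexp and auto (the variables le, year, mpg, rep78, and foreign belong to those datasets, not to the cards):

```stata
* Line chart: uslifeexp ships with Stata (life expectancy by year)
sysuse uslifeexp, clear
twoway line le year

* The remaining charts use the built-in auto dataset
sysuse auto, clear

* Box plot, restricted to domestic cars with an if condition
graph box mpg if foreign == 0

* Bar chart of category shares (percent is the default in recent Stata versions)
graph bar, over(rep78)

* Histogram and kernel density plot
histogram mpg
kdensity mpg
```

Each graph command replaces the previous graph window, so run them one at a time when studying the shapes.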
how to see skewness on a graph
RELATIVE to the MEAN
Symmetrical (skew = 0) → balanced on both sides
Positive skew (right-skewed) → tail is longer on the right (more spread values)
Negative skew (left-skewed) → tail is longer on the left
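To check what the graph suggests, the skewness statistic can also be read as a number. A small sketch, assuming the built-in auto dataset and its price variable (both are illustrative choices):

```stata
sysuse auto, clear

* Look at the shape first: the tail of price stretches to the right
histogram price

* summarize with the detail option reports skewness among other statistics
summarize price, detail

* The value is also stored in r(skewness); a positive value means right skew
display r(skewness)
```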
how to make a cross-tabulation
write: tab [DV] [IV], column
-> so the DV goes in the rows and the IV in the columns; the column option adds column percentages
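A quick sketch of a cross-tabulation, assuming the built-in auto dataset, with rep78 playing the role of the DV and foreign the role of the IV (these roles are chosen only for illustration):

```stata
sysuse auto, clear

* DV (rep78) in the rows, IV (foreign) in the columns,
* with column percentages under each frequency
tabulate rep78 foreign, column
```

Compare the percentages across a row to see how the DV differs between the IV groups.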
how to make a comparison table
write: tabulate [IV], summarize([DV])
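For example, assuming the built-in auto dataset with foreign as the IV and mpg as the DV (an illustrative pairing, not from the cards):

```stata
sysuse auto, clear

* Mean, standard deviation, and frequency of mpg for each category of foreign
tabulate foreign, summarize(mpg)
```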
to visualize cross-tabulation relationships, we often use bar charts. the general syntax is:
graph bar [DV], over([IV])
to visualize mean comparisons, we often use box plots. the general syntax is:
graph box [DV], over([IV])
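Both graph commands can be sketched with the built-in auto dataset, assuming mpg as the DV and foreign as the IV (illustrative choices only):

```stata
sysuse auto, clear

* Bar chart: by default the bar height is the mean of mpg in each group
graph bar mpg, over(foreign)

* Box plot: distribution of mpg in each group, side by side
graph box mpg, over(foreign)
```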
explain and write the commands: sort and list
*sort:
-Organizes your dataset by the values of one or more variables
-sort [variable]
-ex: sort age orders from youngest to oldest; sort gender age sorts first by gender, then by age within each gender
*list:
-Displays selected variables and observations in the Results window.
-list [variable1] [variable2]
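A short sketch of both commands together, assuming the built-in auto dataset (foreign and mpg stand in for gender and age from the example above):

```stata
sysuse auto, clear

* Sort first by foreign, then by mpg within each group
sort foreign mpg

* Show three variables for the first five observations only
list make foreign mpg in 1/5
```

The in 1/5 qualifier is optional; without it, list prints every observation.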
what are X, Y and Z
X: IV(s)
Y: DV
Z: CV(s), control variable(s)
how to recode when you want different categories (useful to make a yes/no variable)
how to keep missing values as true missing values when recoding?
yes=1 and no=0
recode [variable] (1/3=0 "not a large threat") (4=1 "large threat") (missing=.), generate([variable_new])
…(missing=.) keeps missing values missing
don't forget the comma before generate(), and give a new name inside the parentheses so that you don't modify the original variable; use straight quotes " " for the labels
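A runnable sketch of the same pattern, assuming the built-in auto dataset; the cut points, the labels, and the new variable name rep78_good are made up for illustration:

```stata
sysuse auto, clear

* Collapse the 1-5 repair record into a 0/1 variable,
* keep missing values missing, and store the result in a new variable
recode rep78 (1/3 = 0 "average or worse") (4/5 = 1 "good") (missing = .), generate(rep78_good)

* Check the result, including the missing category
tabulate rep78_good, missing
```

The original rep78 is left untouched, so the recode can be checked against it with tabulate rep78 rep78_good, missing.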
how to get the confidence interval (CI)
proportion [variable]
attention: read the 2 bounds (lower and upper) along the row of each category
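For instance, assuming the built-in auto dataset and its foreign variable (an illustrative choice):

```stata
sysuse auto, clear

* One row per category: proportion, standard error,
* and the lower and upper bounds of the 95% confidence interval
proportion foreign
```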
what do nofreq and nokey do?
they make the cross-tab easier to read
nofreq: gets rid of the frequencies in the table (only categories and percentages remain)
nokey: hides the key box that explains the cell contents
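A small sketch of a cleaner cross-tab, assuming the built-in auto dataset with rep78 and foreign as illustrative variables:

```stata
sysuse auto, clear

* Column percentages only: no frequencies and no key box above the table
tabulate rep78 foreign, column nofreq nokey
```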
how to make a mean comparison?
tabulate independent_variable, summarize(dependent_variable)