Exam 1 Flashcards
(28 cards)
Types of variables and their definitions
Continuous: variable value can be measured along a numerical continuum (height, weight)
Categorical: variable values form categories, either unordered (0 for male and 1 for female) or ordered (0 for low, 1 for medium, and 2 for high).
Define observation
A set of data values for the same subject (name, age, height, weight, gender of one subject).
Define a SAS data set:
a collection of observations
Define a SAS statement
a command instructing SAS to perform a certain action
Define a SAS program
a set of SAS statements designed to perform a specific task
List and define the four windows in SAS
Program (editor) window: This is where the program is written, edited and submitted.
Log window: It records the submitted program, including warnings and error messages.
Output window: LISTING output from the submitted program. This is the default in SAS 9.2 and earlier versions.
Results Viewer: HTML output from the submitted program. This is the default in SAS 9.3.
How to type in data?
DATA temp;
INPUT ___ ___ ___ etc.;
DATALINES; (or CARDS;)
. . . .
;
RUN;
How to indicate that a variable is a character in SAS
In the INPUT statement, type a $ after the name of the variable (for example: INPUT name $;)
How to print the results
proc print data=___ (obs=___); run;
note: only type the (obs=___) part if you are trying to print the first ___ observations
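As a minimal sketch of the last three cards, the following types in a small data set, marks one variable as character, and prints the first two observations. The data set name temp, the variable names, and the values are made up for illustration. Note that the OBS= option goes in parentheses right after the data set name.

```sas
/* Hypothetical example data: names and values are made up */
DATA temp;
  INPUT name $ age height;   /* the $ marks name as a character variable */
  DATALINES;
Ann 21 64
Bob 23 70
Cam 22 68
;
RUN;

/* Print only the first 2 observations of temp */
PROC PRINT DATA=temp (OBS=2);
RUN;
```

Dropping (OBS=2) prints every observation in the data set.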
List the naming conventions
SAS program: use extension ‘.sas’
SAS log: use extension ‘.log’
SAS output: use extension ‘.out’
Results Viewer: use extension ‘.mht’
What are the 7 descriptive statistic commands (from section 1) in SAS?
proc contents
proc means
proc univariate
proc freq
proc corr
proc plot
proc gchart
How is the descriptive statistic command “proc contents” written and what does it do?
proc contents data=___ ; run;
(Reports the number of observations, variables, and indexes, the observation length, the number of deleted observations, and whether the data set is compressed or sorted. Also use this to find out how the data is coded: is each variable numeric or character? The variable name and the #s must be the same code.)
How is the descriptive statistic command “proc means” written and what does it do?
proc means data=___; run;
Gives the N, mean, Std Dev, Minimum, and Maximum for each numeric variable
How is the descriptive statistic command “proc univariate” written and what does it do?
proc univariate data=___;
var ___;
run;
(the var ___ specifies which variable you are interested in. The command gives us the statistics for that specific variable.)
How is the descriptive statistic command “proc freq” written and what does it do?
proc freq data=___;
tables ____;
run;
(gives a frequency table of the selected variable. The columns in the table are: the values of the variable, frequency, percent, cumulative frequency, and cumulative percent).
How is the descriptive statistic command “proc corr” written and what does it do?
proc corr data=___;
var ___ ___ ___ etc.;
run;
(this procedure lets you check for correlation between the selected variables. Gives the N, mean, std dev, sum, min, and max for each of the variables. Also reports the Pearson correlation coefficients between all the selected variables).
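The five procedures covered so far can be tried together on one small data set. This is only a sketch: the data set temp and its variables (name, gender, age, height) are hypothetical and the values are made up.

```sas
/* Hypothetical example data: names and values are made up */
DATA temp;
  INPUT name $ gender $ age height;
  DATALINES;
Ann F 21 64
Bob M 23 70
Cam M 22 68
Dee F 24 66
;
RUN;

/* Structure of the data set: variable names, types, lengths */
PROC CONTENTS DATA=temp;
RUN;

/* N, mean, std dev, min, max for the numeric variables */
PROC MEANS DATA=temp;
RUN;

/* Detailed statistics for one variable */
PROC UNIVARIATE DATA=temp;
  VAR height;
RUN;

/* Frequency table for a categorical variable */
PROC FREQ DATA=temp;
  TABLES gender;
RUN;

/* Pearson correlations between the listed variables */
PROC CORR DATA=temp;
  VAR age height;
RUN;
```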
How is the descriptive statistic command “proc plot” written and what does it do?
proc plot data=___;
plot variable Y * variable X='*';
plot variable Y * variable X=___;
title "________";
run;
It gives us an X by Y scatter plot.
How is the descriptive statistic command “proc gchart” written and what does it do?
proc gchart data=auto;
vbar price / levels=___ (some number);
title "bar chart for price";
run;
or
proc gchart data=auto;
pie make / sumvar=repair type=mean;
title "pie chart of mean of repair by make";
run;
It gives us charts of the variables selected.
Describe what the proc reg command looks like and what it does
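proc reg data=____;
model ___ = ____ ____ ____ …;
title "regression analysis of ……";
run;
The output gives the regression results for the model specified: the dependent variable goes to the left of the equals sign and the independent variables to the right.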
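As a sketch of a filled-in proc reg call, the example below reuses the five cars from the data-reading cards in this deck and regresses price on mpg and weight. The choice of price as the dependent variable and of mpg and weight as predictors is an assumption made only for illustration.

```sas
/* Car data taken from the data-reading cards in this deck */
DATA cars1;
  INPUT make $ model $ mpg weight price;
  DATALINES;
GMC Concord 22 2930 4099
GMC Pacer 17 3350 4749
GMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
;
RUN;

/* Dependent variable on the left of =, predictors on the right */
PROC REG DATA=cars1;
  MODEL price = mpg weight;
  TITLE "Regression analysis of price on mpg and weight";
RUN;
QUIT;   /* PROC REG stays active after RUN, so QUIT ends it */
```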
Define and explain internal data sources
Internal Data Sources: Read data from within a SAS program using the “datalines” or “cards” statement. The general format is:
data data_set_name;
infile datalines options;
input variable_name_list;
datalines;
. . . .
;
run;
How to read data from internal data sources as free formatted data (space delimited)
Notes: the infile statement is dropped in this example because space is the default delimiter
DATA cars1;
INPUT make $ model $ mpg weight price;
DATALINES;
GMC Concord 22 2930 4099
GMC Pacer 17 3350 4749
GMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
;
RUN;
How to read data from internal data sources as free formatted data (comma delimited)
DATA cars2;
INFILE datalines delimiter=',';
INPUT make $ model $ mpg weight price;
DATALINES;
GMC, Concord, 22, 2930, 4099
GMC, Pacer, 17, 3350, 4749
GMC, Spirit, 22, 2640, 3799
Buick, Century, 20, 3250, 4816
Buick, Electra, 15, 4080, 7827
;
RUN;
How to read data from internal data sources as free formatted data (tab delimited)
DATA cars3;
INPUT make $ model $ mpg weight price;
DATALINES;
GMC Concord 22 2930 4099
GMC Pacer 17 3350 4749
GMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
;
RUN;
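The listing above shows spaces where the tabs would be, so as written it is read as space-delimited data. A minimal sketch of a tab-delimited version, assuming the values in the data lines are separated by real tab characters, adds an INFILE statement with the delimiter '09'x (the hexadecimal code for a tab). The INFILE statement is not on the original card and is an assumption about what was intended. Depending on editor settings, tabs typed in the SAS editor may be converted to spaces, in which case the space-delimited form above already works.

```sas
/* Assumes each value below is separated by a single tab character */
DATA cars3;
  INFILE datalines delimiter='09'x;   /* '09'x is the tab character */
  INPUT make $ model $ mpg weight price;
  DATALINES;
GMC	Concord	22	2930	4099
GMC	Pacer	17	3350	4749
GMC	Spirit	22	2640	3799
Buick	Century	20	3250	4816
Buick	Electra	15	4080	7827
;
RUN;
```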
How to read data from internal data sources as fixed formatted data
Usually, because there are no delimiters (such as spaces, commas, or tabs) to separate fixed formatted data (hence no need for infile statement in the following example), column definitions are required for every variable. This also requires the data to be in the same columns for each subject.
DATA cars4;
INPUT make $ 1-5 model $ 6-12 mpg 13-14 weight 15-18 price 19-22;
CARDS;
GMC  Concord2229304099
GMC  Pacer  1733504749
GMC  Spirit 2226403799
BuickCentury2032504816
BuickElectra1540807827
;
RUN;