SAS Flashcards

1
Q

what do we call statements that begin/end a step?

A

step boundaries e.g. run;

2
Q

what are global statements?

A

impact the entire SAS session and set options/values that will remain in effect until they are changed or the session ends e.g. title, options, libname, footnote

3
Q

how can you comment?

A
  * comment; to comment out a single statement (everything from the asterisk to the next semicolon)
  /* comment */ for any length of text
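
A minimal sketch of both comment styles, assuming the sample table sashelp.class that ships with SAS is available:

```sas
* This statement-style comment ends at the semicolon;
proc print data=sashelp.class;   /* a block comment can sit at the end of a line */
run;

/* A block comment can also
   span several lines */
```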
4
Q

what are the 3 required attributes of a structured SAS column?

A

name (1-32 characters, must start with a letter or underscore), type (character or numeric), length (numeric: 8 bytes by default, which stores about 16 significant digits; character: 1 to 32,767 bytes, set when the column is created)

5
Q

how do you create/use a macrovariable?

A

use the %LET statement to define a macro variable. Within code, use & to refer to it.
e.g. %LET path = xyz;
…&path
SAS automatically replaces &path with xyz
macro variable references inside a quoted string resolve only when the string is in double quotation marks
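
A short sketch of defining and referencing a macro variable. The name minAge is a hypothetical example, and sashelp.class is assumed to be available:

```sas
%let minAge = 13;                         /* hypothetical macro variable */

title "Students aged &minAge or older";   /* double quotes: &minAge resolves */
proc print data=sashelp.class;
    where Age >= &minAge;                 /* resolves to: where Age >= 13; */
run;
title;                                    /* clear the title */

%put minAge resolves to &minAge;          /* writes the resolved value to the log */
```

With single quotes ('Students aged &minAge or older') the title would show the text &minAge literally.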

6
Q

when defining a library, what is the method?

A

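LIBNAME libref engine "path";
The libref must be 8 characters or fewer, start with a letter or underscore, and contain only letters, digits and underscores. The engine (e.g. BASE, XLSX) tells SAS what type of data it is reading.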
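
For illustration, a sketch of two library definitions. The librefs and both paths are hypothetical and need replacing with real locations:

```sas
/* Hypothetical folder and workbook paths */
libname mylib base "/home/user/data";            /* folder of SAS tables */
libname xl xlsx "/home/user/data/sales.xlsx";    /* Excel workbook, one table per sheet */

proc contents data=mylib._all_ nods;             /* list the tables in the library */
run;

libname xl clear;                                /* release the workbook when done */
```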

7
Q

Which procedure produces a report about the descriptor portion of a table?

A

PROC CONTENTS

8
Q

what are rows and columns also called?

A

observations and variables

9
Q

Which statement can be used to subset the rows read in a PROC step?

A

WHERE

10
Q

Which procedure lists the distinct values for one or more columns?

A

PROC FREQ

11
Q

If you want to format just numeric or character types of columns, what keyword would you use?

A

format _NUMERIC_ fmt;
format _CHARACTER_ fmt;

for all columns use _ALL_
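
As a sketch, assuming sashelp.class is available, one FORMAT statement can cover every numeric and every character column:

```sas
proc print data=sashelp.class;
    /* one decimal place for all numeric columns, upper case for all character columns */
    format _numeric_ 8.1 _character_ $upcase10.;
run;
```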

12
Q

describe the PROC MEANS step

A

generates simple summary statistics (N, mean, standard deviation, minimum, maximum) for every numeric column in the input data by default; a VAR statement limits the analysis to the listed columns

13
Q

in the PROC MEANS step, what do CLASS, WAYS, OUTPUT and OUT= do?

A

CLASS - specifies columns to group by before calculating statistics
WAYS - specifies the number of ways to make unique combinations of class variables
OUTPUT - provides the option to create an output table with specific output statistics
OUT= - names output table to be created
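
The four pieces can be combined as in this sketch. It assumes the sample table sashelp.cars; the output table and statistic column names are hypothetical:

```sas
proc means data=sashelp.cars noprint;
    class Origin Type;               /* group by these columns */
    ways 1 2;                        /* each column on its own, then both combined */
    var MSRP;                        /* analyse only this numeric column */
    output out=work.msrp_summary     /* hypothetical output table */
           mean=AvgMSRP max=MaxMSRP; /* statistics to store, with new column names */
run;
```

The output table also contains _TYPE_ and _FREQ_ columns, which identify the class combination and the number of rows behind each summary row.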

14
Q

describe the PROC FREQ step

A

creates a frequency table for each variable in the input table by default. Can specify variables to be analysed using TABLES statement

15
Q

how do you import structured/unstructured data?

A

structured e.g. xlsx use LIBNAME or PROC IMPORT
unstructured e.g. CSV use PROC IMPORT

16
Q

in the PROC SORT step, what does NODUPKEY, DUPOUT do?

A

NODUPKEY keeps the first row for each unique value of the columns listed in the BY statement
DUPOUT= creates an output table containing the duplicate rows that were removed
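
A minimal sketch, assuming sashelp.class is available. Both output table names are hypothetical:

```sas
proc sort data=sashelp.class
          out=work.class_first      /* first row found for each Age */
          nodupkey
          dupout=work.class_dups;   /* the rows removed as duplicates */
    by Age;
run;
```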

17
Q

In general, what are the 2 phases of the DATA step?

A

COMPILATION: creates PDV, establishes data attributes and rules for execution
EXECUTION: reads, manipulates and writes data

18
Q

what are the 2 default variables created in the DATA step?

A

_N_ counts the number of iterations through the DATA step when processing
_ERROR_ is initialised at 0 and is set to 1 when any error is detected in an iteration

19
Q

when to use DROP/KEEP vs DROP=/KEEP=?

A

DROP/KEEP are statements added within the DATA step
DROP=/KEEP= are data set options added to a table on the DATA statement or SET statement
- when dropped/kept on the SET statement, the excluded column is never read in and is unavailable for processing in the DATA step
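
The difference can be sketched as follows, assuming sashelp.class (Height in inches, Weight in pounds). The output table and the BMI column are hypothetical:

```sas
data work.class_bmi (drop=Height Weight);        /* used below, dropped only on output */
    set sashelp.class (keep=Name Height Weight); /* only these columns enter the PDV */
    BMI = (Weight / Height**2) * 703;
    /* Age was not kept on SET, so it cannot be referenced in this step */
run;
```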

20
Q

DO UNTIL vs DO WHILE

A

DO UNTIL: condition checked at bottom of DO loop => always executes at least once
DO WHILE: condition checked at the top of DO loop => doesn’t iterate at all if condition is initially false
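
A small sketch with hypothetical variable names, where both conditions are already decided before the loops start:

```sas
data _null_;
    x = 10;

    n_until = 0;
    do until (x >= 10);    /* checked at the bottom: body runs once */
        n_until + 1;
    end;

    n_while = 0;
    do while (x < 10);     /* checked at the top: body never runs */
        n_while + 1;
    end;

    putlog n_until= n_while=;   /* n_until=1 n_while=0 */
run;
```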

21
Q

describe the MERGE statement

A

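All tables in the MERGE statement must be sorted by the columns listed in the BY statement. MERGE then combines rows where the BY values match. The IN= data set option creates a temporary flag (1 if the table contributed to the current row, otherwise 0) that can be used in a subsetting IF statement, e.g. to keep matches only.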
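
A self-contained sketch of a match-merge. The two small tables and all names are hypothetical, and both are already in ID order (otherwise run PROC SORT first):

```sas
data work.customers;
    input ID Name $;
    datalines;
1 Ann
2 Bob
3 Cy
;
run;

data work.orders;
    input ID Amount;
    datalines;
1 50
3 20
4 75
;
run;

data work.matched;
    merge work.customers (in=inCust)
          work.orders    (in=inOrd);
    by ID;
    if inCust and inOrd;   /* keep only IDs found in both tables: 1 and 3 */
run;
```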

22
Q

what function do you use to convert
a) numeric to character
b) character to numeric

A

a) PUT(numeric-var, format)
b) INPUT(char-var, informat)
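
Both conversions in one sketch, with hypothetical column names:

```sas
data work.convert;
    num = 1234.5;
    char_from_num = put(num, 8.1);               /* numeric -> character */

    char = "12/25/2023";
    date_from_char = input(char, mmddyy10.);     /* character -> numeric SAS date */
    format date_from_char date9.;
run;
```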

23
Q

how can you create a SAS date value?

A

MDY(month, day, year) function
TODAY() returns current date
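
A quick sketch with hypothetical variable names. The date constant on the second line is an additional way to write a fixed date that is not listed on the card:

```sas
data _null_;
    d1 = mdy(12, 25, 2023);      /* built from month, day, year */
    d2 = '25DEC2023'd;           /* SAS date constant for the same day */
    run_date = today();          /* current date */
    putlog d1= date9. d2= date9. run_date= date9.;
run;
```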

24
Q

how can you increase a date by certain interval e.g. to last day of the following month

A

INTNX(interval, start-from, increment, alignment)
e.g. INTNX('month', date-col, 1, 'end');
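
Worked through with a hypothetical start date:

```sas
data _null_;
    start = '15JAN2024'd;
    next_end = intnx('month', start, 1, 'end');   /* last day of the following month */
    putlog next_end= date9.;                      /* next_end=29FEB2024 */
run;
```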

25
Q

how can you count the number of e.g. weeks between two dates?

A

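INTCK(interval, start-date, end-date, method)
discrete (the default) counts the interval boundaries crossed, e.g. for weeks, each time a new week begins on a Sunday; continuous counts complete intervals measured from the start date, e.g. every 7 days is one week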
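
The two methods can disagree, as this sketch with hypothetical dates shows. The range runs from a Friday to the following Monday, so it crosses one Sunday but spans only 3 days:

```sas
data _null_;
    date1 = '05JAN2024'd;   /* Friday */
    date2 = '08JAN2024'd;   /* Monday */
    weeks_d = intck('week', date1, date2, 'd');   /* discrete: one Sunday boundary crossed */
    weeks_c = intck('week', date1, date2, 'c');   /* continuous: fewer than 7 days */
    putlog weeks_d= weeks_c=;                     /* weeks_d=1 weeks_c=0 */
run;
```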

26
Q

describe formats

A

formats are used to change the way values are displayed in data and reports - they do not actually change the underlying data values!

27
Q

what do WHERE statements work on ?

A

only work with columns existing in the input dataset, do not work with calculated columns

28
Q

What does SAS do when it encounters a syntax error?

A

first attempts to correct the error by attempting to interpret what you mean. Then SAS continues processing your program based on its assumptions. If SAS cannot correct the error, it prints an error message to the log.

29
Q

what colour is an error, note and warning message in the log?

A

error = red, note = blue, warning = green

30
Q

what does the VALIDVARNAME=V7 option allow you to do?

A

- variable names can be up to 32 characters.
- first character must be a letter or underscore. Subsequent characters can be letters of the Latin alphabet, numerals, or underscores.
- trailing blanks are ignored.
- cannot contain blanks or special characters except for the underscore.
- can contain mixed-case letters. SAS stores and writes the variable name in the same case that is used in the first reference to the variable. However, when SAS processes a variable name, SAS internally converts it to uppercase. Therefore, you cannot use the same variable name with a different combination of uppercase and lowercase letters to represent different variables. For example, cat, Cat, and CAT all represent the same variable.
- do not give variables the names of special SAS automatic variables (such as _N_ and _ERROR_) or variable list names (such as _NUMERIC_, _CHARACTER_, and _ALL_).

31
Q

what does the VALIDVARNAME=ANY option allow you to do?

A
  • can begin with any character, including blanks
  • leading blanks preserved, trailing blanks ignored
32
Q

what's the difference between formats and informats?

A

informats tell SAS how to read data, formats how to write/print data

33
Q

when is the DO loop index in the output one more than the number of times the loop executed?

A

when there is no explicit OUTPUT inside the loop. With an explicit OUTPUT in each iteration of e.g. DO i = 1 TO 4, we see 4 rows (i = 1 to 4). Without it, SAS writes a single row at the end of the DATA step, after the index has been incremented past the stop value to 5, which is what signalled the loop to stop. The other columns in that row hold the same values they had after iteration 4.
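
Side by side, with hypothetical table and column names:

```sas
data work.explicit;
    do i = 1 to 4;
        total + 10;
        output;        /* one row per iteration: i = 1, 2, 3, 4 */
    end;
run;

data work.implicit;
    do i = 1 to 4;
        total + 10;
    end;               /* the loop ends once i has been incremented to 5 */
run;                   /* implicit output: one row with i = 5, total = 40 */
```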

34
Q

what are the rules for creating a format name ?

A
  • limited to 32 characters.
  • Character format names must start with a dollar sign followed by a letter or underscore.
  • Numeric format names must start with a letter or underscore.
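
A sketch of one character and one numeric custom format, assuming sashelp.class. The format names and labels are hypothetical:

```sas
proc format;
    value $sexfmt 'F' = 'Female'        /* character format: name starts with $ */
                  'M' = 'Male';
    value agegrp  low-12  = 'Child'     /* numeric format: name starts with a letter */
                  13-high = 'Teen';
run;

proc print data=sashelp.class;
    format Sex $sexfmt. Age agegrp.;    /* a period follows the name when it is used */
run;
```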
35
Q

Define NOOBS

A

Suppress the column in the output that identifies each observation by number

36
Q

how can we write to the log using PUTLOG?

A

PUTLOG _ALL_; writes all columns/values in PDV to log
PUTLOG column=; writes selected columns
PUTLOG "message"; writes a text string to log
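
The three forms together, in a sketch that reads the first two rows of sashelp.class (assumed available):

```sas
data _null_;
    set sashelp.class (obs=2);
    putlog "NOTE: reading row " _n_;   /* text string, followed by a value */
    putlog Name= Age=;                 /* selected columns as name=value pairs */
    putlog _all_;                      /* everything in the PDV, including _N_ and _ERROR_ */
run;
```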

37
Q

In DATA step processing, what are _N_ and _ERROR_?

A

The value of _N_ represents the number of times the DATA step has iterated. _N_ is initially set to 1. Each time the DATA step loops past the DATA statement, the variable _N_ increments by 1.
_ERROR_ is 0 by default but is set to 1 whenever an error is encountered, such as an input data error, a conversion error, or a math error, as in division by 0 or a floating point overflow.

38
Q

What are the steps of the Compilation phase?

A
  1. Checks for syntax errors
  2. Creates the program data vector (PDV)
  3. Establishes the rules for processing data in the PDV e.g. flags which columns to be dropped/kept
  4. Creates the descriptor portion of the output table
39
Q

What are the steps of the Execution phase?

A
  1. Initialise the PDV
  2. Read a row from the input table into the PDV
  3. Sequentially processes statements and updates values in the PDV
  4. At the end of the step, writes the contents of the PDV to the output table
  5. Returns to top of DATA step
40
Q

when can you use IF vs WHERE subsetting?

A

Subsetting IF statements can only appear in DATA steps. In SAS, WHERE statements can be used in both DATA and PROC steps.
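
A sketch contrasting the two, assuming sashelp.class. The output table and the HeightCm column are hypothetical:

```sas
data work.tall;
    set sashelp.class;
    where Age >= 13;             /* WHERE: column exists in the input table */
    HeightCm = Height * 2.54;
    if HeightCm > 160;           /* subsetting IF: works on a column computed in this step */
run;

proc print data=work.tall;
    where Sex = 'F';             /* WHERE also works in PROC steps */
run;
```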

41
Q

when using MONTH() function, what does it return?

A

The MONTH function extracts the numeric month value from the specified date value. The MONTH function returns only the integer 1-12 without a leading zero.