2.2 & 2.3 Flashcards

1
Q

Which of the following is true regarding the values of computed columns during the execution phase of a DATA step?

a) values of computed columns are recalculated and previous values overwritten for each row in the input data set
b) by default all computed columns are reset to missing when the PDV is reinitialized
c) you cannot use a DATA step to create an accumulating column

A

b) by default all computed columns are reset to missing when the PDV is reinitialized

2
Q
Which of the following DATA steps successfully creates an accumulating column for YTDRain?
a) data houston2017;
       set pg2.weather_houston;
       retain YTDRain 0;
       YTDRain=YTDRain+DailyRain;
   run;
b) data houston2017;
       set pg2.weather_houston;
       retain YTDRain;
       YTDRain=YTDRain+DailyRain;
   run;
c) data houston2017;
       set pg2.weather_houston;       
       YTDRain=YTDRain+DailyRain;
       retain YTDRain 0;
   run;
d) None of the above.  You cannot create an accumulating column in a DATA step.
A
a) data houston2017;
       set pg2.weather_houston;
       retain YTDRain 0;
       YTDRain=YTDRain+DailyRain;
   run;

The RETAIN statement is a compile-time statement that sets a rule for one or more columns to keep their value each time the PDV is reinitialized, rather than being reset to missing. It also provides the option of establishing an initial value in the PDV before the first iteration of the DATA step.

RETAIN column <initial-value>;

3
Q

Which of the following is NOT true regarding the SUM statement?

a) The SUM statement syntax is column+expression, where the accumulating column is to the left of the + sign
b) The SUM statement automatically sets the initial value of the accumulating column to 0.
c) The RETAIN statement is required in order for the SUM statement to work properly.
d) The SUM statement adds the value of the column or constant to the right of the plus sign to the accumulating column for each row.
e) The SUM statement ignores missing values.

A

c) The RETAIN statement is required in order for the SUM statement to work properly.
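
As a rough sketch of why RETAIN is not needed, the two steps below should build the same accumulating column. It assumes the pg2 library from the earlier cards is assigned; the output table names are made up for illustration.

```sas
/* SUM statement: YTDRain is retained automatically, starts at 0,
   and missing DailyRain values are ignored. */
data houston_sum;
    set pg2.weather_houston;
    YTDRain + DailyRain;
run;

/* Longhand version: explicit RETAIN plus the SUM function,
   which also skips missing DailyRain values. */
data houston_retain;
    set pg2.weather_houston;
    retain YTDRain 0;
    YTDRain = sum(YTDRain, DailyRain);
run;
```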

4
Q
What SUM statement would you add to this program to create the column named DayNum, which increments by 1 for each row in the input data set?
data zurich2017;
   set pg2.weather_zurich;
   YTDRain_mm+Rain_mm;
   ???
run;
a) SUM(DayNum, 1);
b) DayNum + 1;
c) retain DayNum 1;
d) DayNum+Rain_mm;
A

b) DayNum + 1;

5
Q

What statement is needed in the DATA step in order to process data in groups?

a) RETAIN
b) ORDER BY
c) SORT
d) BY

A

d) BY

6
Q

Which of the following is true when processing data in groups in a DATA step?

a) It is not necessary to sort the data by the desired groups
b) Two special columns, FIRST.by-column and LAST.by-column, are added to the PDV.
c) The FIRST. and LAST. variables are permanent and will be added to the output table by default.
d) The FIRST. variable is 1 for the first row within a group, and . for all other rows.

A

b) Two special columns, FIRST.by-column and LAST.by-column, are added to the PDV.

During the execution phase, the FIRST. and LAST. variables are assigned a value of 0 or 1. The FIRST. variable is 1 for the first row within a group, and 0 for all other rows. Similarly, the LAST. variable is 1 for the last row within a group, and 0 for all other rows.

These temporary variables contain important information that you can use before they are dropped when a row is written to the output table.
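
One way to see the flags is to copy them into permanent columns. This is only a sketch: it uses sashelp.class (a sample table shipped with SAS) rather than the course data, and the names class_sorted, FirstInGroup, and LastInGroup are made up for illustration.

```sas
/* BY-group processing expects the data sorted by the BY column. */
proc sort data=sashelp.class out=class_sorted;
    by Sex;
run;

data class_flags;
    set class_sorted;
    by Sex;
    /* FIRST.Sex and LAST.Sex are dropped at output,
       so copy them to regular columns to inspect them. */
    FirstInGroup = first.Sex;
    LastInGroup  = last.Sex;
run;
```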

7
Q

True/False - You can use the FIRST. and LAST. variables, along with the BY and WHERE statements, to subset rows during the execution phase of the DATA step.

A

False - The WHERE statement is a compile-time statement that establishes rules about which rows are read into the PDV. Therefore, the WHERE expression must be based on columns that exist in the input table referenced in the SET statement. The FIRST. and LAST. variables are not in the input table. Use a subsetting IF statement instead.
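
A minimal sketch of subsetting on LAST. with a subsetting IF, again assuming sashelp.class as stand-in data; the names class_sorted, group_counts, and Count are hypothetical.

```sas
proc sort data=sashelp.class out=class_sorted;
    by Sex;
run;

data group_counts;
    set class_sorted;
    by Sex;
    if first.Sex then Count = 0;   /* reset the counter at the start of each group */
    Count + 1;                     /* SUM statement counts rows within the group */
    if last.Sex;                   /* subsetting IF runs at execution time, so LAST.Sex is available */
    keep Sex Count;
run;
```

Writing where last.Sex=1; in place of the subsetting IF would be expected to fail, because LAST.Sex is not a column in the input table.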

8
Q

True/False - If multiple columns are listed on the BY statement in the DATA step, then each column has its own FIRST./LAST. variables in the PDV.

A

True

9
Q

Summarizing data within groups can be performed in the DATA step or in procedures such as PROC MEANS. What are some examples of when you might choose to use either the DATA step or PROC MEANS?

a) The DATA step enables you to do other calculations or manipulations at the same time summarizations occur.
b) PROC MEANS is more complex to code, but offers more statistics.
c) The DATA step is better for very large data sets.
d) Both are equivalent and their use depends on personal preference.

A

a) The DATA step enables you to do other calculations or manipulations at the same time summarizations occur.

b) PROC MEANS might be simpler to code, and it makes it easy to request various statistics.

10
Q

Which of the following statements about SAS functions are true? List all that apply.

a) SAS functions are named, predefined processes which can be used to produce a value
b) A function must include at least 1 argument as input
c) Based on the arguments, the function performs its specified computation or manipulation and returns a value.
d) In the SAS documentation, functions and call routines are grouped by category.

A

a, c, d

b - The function can accept none, one, or several arguments as input.

11
Q

True/False - Column lists can help you reduce the number of columns you have to specify in function arguments or in other SAS statements.

A

True

12
Q
Suppose you have a data set with numeric columns: Quiz1, Quiz2, Quiz3, Quiz4, and Quiz5. You want to write a DATA step that will calculate the average of these columns and format all numeric columns in the data set as 3.1. Which DATA step below will accomplish this? List all that apply.
a) data quiz_summary;
       set pg2.class_quiz;
       AvgQuiz = mean(Q:);
       format Q: 3.1;
   run;
b) data quiz_summary;
       set pg2.class_quiz;
       AvgQuiz = mean(of Q:);
       format of Quiz1-AvgQuiz 3.1;
   run;
c) data quiz_summary;
       set pg2.class_quiz;
       AvgQuiz = mean(of Q:);
       format Quiz1--AvgQuiz 3.1;
   run;
d) data quiz_summary;
       set pg2.class_quiz;
       AvgQuiz = mean(of Q:);
       format _numeric_ 3.1;
   run;
A

c, d

b) You don’t need to use the OF keyword in the FORMAT statement. The OF keyword is required when you use column lists as arguments in a function or call routine.

13
Q
Which of the following are keywords that can be used to specify groups of columns and eliminate the need to write them all out?
a) _NUMERIC_
b) _CHARACTER_
c) _ALL_
d) _NONE_
e) a, b and c
A

e

14
Q

Which of the following is true regarding the CALL SORTN routine?

a) The routine takes the columns provided as arguments and reorders them according to the numeric values in the rows.
b) The routine takes the columns provided as arguments, and reorders the numeric values for each row from low to high.
c) Using CALL before the SORTN is optional.
d) The routine assigns the lowest value to a new column.

A

b)

15
Q

What SAS function could you use to assign a random number to each record in a new variable?

a) RAND
b) RANDOM
c) RANGE
d) LARGEST

A

a) RAND

16
Q

True/False - The LARGEST function will identify the value from the provided arguments with the highest value.

A

False - The LARGEST function returns the k-th largest value, where k is the first argument provided to the function.

LARGEST(k, value-1 <, value-2, ...>)
value - specifies a numeric constant, variable, or expression to be processed.
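
A small sketch of how k changes the result, using made-up quiz scores and the hypothetical names Top1 and Top2:

```sas
data _null_;
    Quiz1 = 1; Quiz2 = 6; Quiz3 = 7; Quiz4 = 4; Quiz5 = 5;
    Top1 = largest(1, of Quiz1-Quiz5);   /* highest score */
    Top2 = largest(2, of Quiz1-Quiz5);   /* second-highest score */
    put Top1= Top2=;                     /* writes Top1=7 Top2=6 to the log */
run;
```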

17
Q

Given the data set and data step below, what will the LARGEST function return?

Quiz1 Quiz2 Quiz3 Quiz4 Quiz5
1 2 3 4 5
1 6 7 4 5

data quiz_analysis;
set pg2.class_quiz;
Quiz1st = largest(1, of Quiz1-Quiz5);
run;

a) 7
b) 5 for the first observation, 7 for the second observation
c) 7 for both observations
d) The code will error because the second argument is not specified correctly

A

b) 5 for the first observation, 7 for the second observation. With k=1, the LARGEST function returns the maximum score from the columns Quiz1 through Quiz5 for each row.

18
Q

Which of the following statements will find the average value of Quiz1, Quiz2, and Quiz3 and round the result to the nearest tenth?

a) round(mean(Quiz1, Quiz2, Quiz3), .1);
b) round(mean(Quiz1, Quiz2, Quiz3));
c) mean(round(Quiz1, Quiz2, Quiz3), .1);

A

a)

19
Q

What will this function return:
CEIL(1.99999)

a) 2
b) 1
c) 1.9
d) 1.99999

A

a) 2

20
Q
What will this function return:
FLOOR(1.99999)
a) 2
b) 1
c) 1.9
d) 1.99999
A

b) 1

22
Q
What will this function return:
INT(1.9999)
a) 2
b) 1
c) 1.9
d) 1.9999
A

b) 1

23
Q

True/False - A datetime value in SAS is stored as the number of seconds from midnight on January 1, 1960.

A

True

24
Q

What arguments can the INTCK function take?

a) ('interval', start-date, end-date <, 'method'>)
b) (interval, start-date, end-date, 'method')
c) (start-date, end-date, 'interval', 'method')
d) none of the above

A

a) ('interval', start-date, end-date <, 'method'>)

25
Q

What is the standard interval bounding when using the INTCK function?

a) Week begins on Sunday and ends on Saturday
b) Week begins on the start-date argument and ends 7 days later
c) Week begins on a Monday and ends on a Sunday

A

a) Week begins on Sunday and ends on Saturday

26
Q

When using the INTCK function what argument do you need to add if you want a continuous count from the start date?

a) 'continuous'
b) 'b'
c) 'c'
d) no argument is needed, SAS performs a continuous count by default

A

c) 'c'

27
Q

Given the data below, what value would be assigned to Months2Pay for the expression listed?

ServiceDate PayDate Months2Pay
10JUL2018 05SEP2018 ?

Months2Pay=intck('month', ServiceDate, PayDate);

a) 1
b) 2
c) 3
d) the code will error because the fourth argument is not specified in the intck function

A

b) 2 - Using the default discrete method, Months2Pay is 2. Two month boundaries were crossed (the end of July and the end of August).

28
Q

Given the data below, what value would be assigned to Months2Pay for the expression listed?

ServiceDate PayDate Months2Pay
10JUL2018 05SEP2018 ?

Months2Pay=intck('month', ServiceDate, PayDate, 'c');

a) 1
b) 2
c) 3
d) the code will error because the fourth argument is not specified in the intck function

A

a) 1 - Using the continuous method, Months2Pay is 1. One month boundary was crossed at August 10. The next boundary does not occur until September 10.
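
The two counting methods can be compared side by side. This is a sketch using the dates from the card; the column names Discrete and Continuous are made up.

```sas
data _null_;
    ServiceDate = '10JUL2018'd;
    PayDate     = '05SEP2018'd;
    Discrete   = intck('month', ServiceDate, PayDate);        /* counts calendar-month boundaries */
    Continuous = intck('month', ServiceDate, PayDate, 'c');   /* counts full months from the start date */
    put Discrete= Continuous=;                                /* writes Discrete=2 Continuous=1 */
run;
```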

29
Q

True/False - The INTCK function can be used to adjust or shift date values.

A

False - The INTNX function can be used to adjust or shift date values.

30
Q

If you need to create a new variable based on an existing variable Date, where the new variable is the first day of the month for each value of Date, which syntax would accomplish this?

a) intnx('month', Date, 0)
b) intnx('month', Date, 0, 'start')
c) intck('month', Date, 0)
d) intck('month', Date, 0, 'start')

A

a) intnx('month', Date, 0)

31
Q

What value would be returned from this function?

intnx('year', '29feb2000'd, 2, 'same')

a) 29feb2002
b) 28feb2002
c) error

A

b) 28feb2002
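
A short sketch of INTNX alignment, assuming an arbitrary date of 15MAR2018; the column names are hypothetical.

```sas
data _null_;
    Date = '15MAR2018'd;
    MonthStart = intnx('month', Date, 0);          /* default alignment is the beginning of the interval */
    MonthEnd   = intnx('month', Date, 0, 'end');   /* last day of the same month */
    LeapShift  = intnx('year', '29FEB2000'd, 2, 'same');   /* 2002 has no Feb 29 */
    /* writes MonthStart=01MAR2018 MonthEnd=31MAR2018 LeapShift=28FEB2002 */
    put MonthStart= date9. MonthEnd= date9. LeapShift= date9.;
run;
```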

32
Q

Which of the functions below returns a character string with all multiple blanks in the source string converted to single blanks?

a) STRIP(string)
b) COMPRESS(string <, characters>)
c) COMPBL(string)
d) SCAN(string, n <, delimiters>)

A

c) COMPBL

33
Q

Which of the functions below returns a character string with specified characters removed from the source string?

a) STRIP(string)
b) COMPRESS(string <, characters>)
c) COMPBL(string)
d) SCAN(string, n <, delimiters>)

A

b) COMPRESS

34
Q

Which of the functions below returns a character string with leading and trailing blanks removed?

a) STRIP(string)
b) COMPRESS(string <, characters>)
c) COMPBL(string)
d) SCAN(string, n <, delimiters>)

A

a) STRIP

35
Q

Which of the functions below can be used to extract a particular word in sequence from a string?

a) STRIP(string)
b) COMPRESS(string <, characters>)
c) COMPBL(string)
d) SCAN(string, n <, delimiters>)

A

d) SCAN

36
Q

Given the data below, what will the FIND function return?

Station
Raleigh Durham Airport 16

AirportLoc = find(Station, 'Airport');

a) 16
b) 14
c) Raleigh Durham Airport
d) 1
e) 0

A

a) 16

The FIND function does a case-sensitive search in the values of the input column and returns the number that indicates the start position of the substring within the string.

37
Q

What optional modifiers can be used with the FIND function in the third argument?

a) 'C' or 'E'
b) 'I' or 'T'
c) 'I' or 'F'
d) 'UPCASE', 'TRIM'

A

b) 'I' makes the search case insensitive; 'T' trims trailing blanks from the string and substring.

38
Q

Which of the following functions returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a completely blank string?

a) LENGTH(string)
b) ANYDIGIT(string)
c) ANYALPHA(string)
d) ANYPUNCT(string)

A

a) LENGTH(string)

39
Q

What would the following function return?

ANYDIGIT('Mand8r')

a) 8
b) 5
c) 4

A

b) 5

ANYDIGIT(string) returns the first position at which a digit is found in the string.

40
Q

What would the following function return?

ANYALPHA('2000Beefcake')

a) 5
b) Beefcake
c) B
d) 2000

A

a) 5

ANYALPHA(string) returns the first position at which an alpha character is found in the string.

41
Q

True/False - The ANYPUNCT(string) will return a value of 1 if the input string contains any punctuation marks, or 0 if it does not.

A

False - ANYPUNCT(string) returns the first position at which a punctuation character is found in the string, or 0 if none is found.

42
Q

What three arguments are needed to use the TRANWRD function to find and replace a character string?

a) TRANWRD(find, replace, target)
b) TRANWRD(replace, source, target)
c) TRANWRD(source, target, replacement)

A

c) TRANWRD(source, target, replacement)

The first argument is generally a character column. The second argument is the target, which is the string you want to find. The third argument is the string that replaces the target.

43
Q

Given the data below, what would the CAT function return?

Name Pet
Amanda Onyx

CAT(Name, Pet)

a) AmandaOnyx
b) Amanda Onyx
c) Amanda Onyx

A

a) AmandaOnyx

The CAT function concatenates strings together and does not remove leading or trailing blanks.

44
Q

Given the data below, what would the CATS function return?

Name Pet
Amanda Onyx

CATS(Name, Pet)

a) AmandaOnyx
b) Amanda Onyx
c) Amanda Onyx

A

a) AmandaOnyx

The CATS function concatenates strings together and removes leading and trailing blanks from each string.

45
Q

Given the data below, what would the CATX function return?

Name Pet
Amanda Onyx

CATX(' ', Name, Pet)

a) AmandaOnyx
b) Amanda Onyx

A

b) Amanda Onyx

The CATX function concatenates strings together, removes leading and trailing blanks from each string, and inserts the delimiter between each string.
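
The three functions can be compared on the same values. In this sketch the column lengths ($10) and the result names are assumptions chosen so that trailing blanks are present.

```sas
data _null_;
    length Name Pet $ 10;   /* values shorter than 10 are padded with trailing blanks */
    Name = 'Amanda';
    Pet  = 'Onyx';
    CatResult  = cat(Name, Pet);         /* keeps the padding after Amanda */
    CatsResult = cats(Name, Pet);        /* AmandaOnyx */
    CatxResult = catx(' ', Name, Pet);   /* Amanda Onyx */
    put CatResult= / CatsResult= / CatxResult=;
run;
```

With padded columns like these, CAT leaves the blanks between the two values, so it only returns AmandaOnyx when Name has no trailing blanks.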

46
Q

True/False - The INPUT function converts a character value to a numeric value by using an informat to indicate how the character string should be read.

A

True

47
Q

True/False - The PUT function converts a character value to a numeric value by using an informat to indicate how the character string should be read.

A

False - the PUT function converts numeric values to character values by using a format to indicate how the values should be written.

48
Q

What will be the value returned by the input function below?

NewVolume2=input('5,976,252', comma12.2);

a) 5976252
b) 59762.52
c) 5,976,252
d) 5,976,252.00

A

b) 59762.52

The return value includes a decimal place two positions from the right because there was no decimal point in the original value.
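
A quick sketch of the effect of the decimal specification in the informat; the column names Volume, WithD, and WithoutD are made up.

```sas
data _null_;
    Volume = '5,976,252';
    WithD    = input(Volume, comma12.2);   /* no decimal point in the text, so 2 implied decimals */
    WithoutD = input(Volume, comma12.);    /* reads the value as a whole number */
    put WithD= WithoutD=;                  /* writes WithD=59762.52 WithoutD=5976252 */
run;
```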

49
Q

Which of the following is NOT true regarding the ANYDTDTE informat?

a) It can read a variety of different text string values as a date.
b) It can take longer to process than other informats.
c) It uses the DATESTYLE= option to interpret ambiguous values
d) The default sequence is MDY

A

d) The default value for the DATESTYLE= option is LOCALE, so the sequence is based on the LOCALE= system option. If LOCALE= is set to English, then the DATESTYLE sequence is MDY, and 6/1 is read as June 1st.

50
Q

Which statement renames the existing column Product in sashelp.shoes as Type?

a) set sashelp.shoes rename=(Type=Product);
b) set sashelp.shoes (rename=(Type=Product));
c) set sashelp.shoes (rename(Product=Type));
d) set sashelp.shoes (rename=(Product=Type));

A

d) set sashelp.shoes (rename=(Product=Type));

51
Q

True/False - Functions and Call Routines both return a value that must be used in an assignment statement or expression.

A

False - A function returns a value that must be used in an assignment statement or expression, but a CALL routine alters existing column values or performs other system functions.

52
Q

Which function calculates the average of the columns Week1, Week2, Week3, and Week4?

a) mean(Week1, Week4)
b) mean(Week1-Week4)
c) mean(of Week1, Week4)
d) mean(of Week1-Week4)

A

d) mean(of Week1-Week4)

Numeric column lists are specified with a hyphen between the first and last columns in the range. The keyword OF must be used if a column list is used as an argument in a function.
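
A sketch of what happens with and without OF, using made-up weekly values and the hypothetical names ListMean and DiffMean:

```sas
data _null_;
    Week1 = 10; Week2 = 20; Week3 = 30; Week4 = 40;
    ListMean = mean(of Week1-Week4);   /* column list: average of all four weeks */
    DiffMean = mean(Week1-Week4);      /* no OF: the hyphen is subtraction, so this is mean(-30) */
    put ListMean= DiffMean=;           /* writes ListMean=25 DiffMean=-30 */
run;
```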

53
Q

Which expression rounds each value of Sales to the nearest hundredth (or two decimal places)?

a) round(Sales)
b) round(Sales, 2)
c) round(Sales, .01)
d) round(Sales, dollar10.2)

A

c) round(Sales, .01)

Use the second argument in the ROUND function to specify the rounding unit.

54
Q

Which function could be used to remove the non-numeric symbols in Phone?

Phone
202-555-0910
202.555.0110
(202)555-0133

a) COMPRESS
b) COMPBL
c) SCAN
d) FIND

A

a) COMPRESS

The second argument of the COMPRESS function can be used to specify all symbols to remove from the values of Phone:

COMPRESS(Phone, '-.()')
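
A sketch applying COMPRESS to the three phone values on the card. The 'kd' modifier (keep digits) is shown as an alternative to listing each symbol; the names ByList and DigitsOnly are made up.

```sas
data _null_;
    length Phone $ 13;
    do Phone = '202-555-0910', '202.555.0110', '(202)555-0133';
        ByList     = compress(Phone, '-.()');    /* remove the listed symbols */
        DigitsOnly = compress(Phone, , 'kd');    /* k = keep, d = digits */
        put Phone= ByList= DigitsOnly=;
    end;
run;
```

For these three values both approaches should leave only the ten digits.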

55
Q

Which statement reads CityCountry and correctly assigns a value to Country?

CityCountry Country
Athens, Greece Greece
New Delhi, India India
Auckland, New Zealand New Zealand

a) cat(City, ", ", Country)
b) cats(", ", City, Country)
d) catx(", ", City, Country)

A

d) catx(", ", City, Country)

56
Q

How many rows are written to output based on the following statement?

if find(Location, "Oahu", "i") > 0 then output;

Location
Honolulu, Oahu
Kaanapali, Maui
Hilo, Hawaii
kailua, oahu
Laie, OAHU

a) 0
b) 1
c) 3
d) 5

A

c) 3

The "I" modifier as the third argument in the FIND function makes the search case insensitive.

57
Q

Which of the following functions can convert the values of the numeric variable Level to character values?

a) put(Level, 3.)
b) put(3., Level)
c) input(3., Level)
d) input(Level, 3.)

A

a) put(Level, 3.)

The PUT function explicitly converts numeric values to character values. You specify the keyword PUT followed by the variable name and then the format. The variable name and format are enclosed in parentheses and separated by a comma.

58
Q

Which of the following functions converts the character values of Base to numeric values?

a) put(comma10.2, Base)
b) put(Base, comma10.2)
c) input(Base, comma10.2)
d) input(comma10.2, Base)

A

c) input(Base, comma10.2)

59
Q

Which step is not required when converting a character column named Date to a numeric column with the same name?

a) Rename the Date column to a new name, such as CharDate.
b) Use the INPUT function to read the renamed CharDate character column and create a numeric column named Date.
c) Specify an appropriate informat in the INPUT function.
d) Format the new numeric Date column.

A

d) Format the new numeric Date column.

Formatting the new column is not required but is recommended.
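
The steps on this card could look roughly like the sketch below. The tables raw_dates and clean_dates and the sample value are made up for illustration, and the DATE9. informat assumes the character values look like 01JAN2018.

```sas
/* Hypothetical input: Date is a character column. */
data raw_dates;
    Date = '01JAN2018';
run;

data clean_dates;
    set raw_dates(rename=(Date=CharDate));   /* step a: rename the character column */
    Date = input(CharDate, date9.);          /* steps b and c: INPUT with an informat */
    format Date date9.;                      /* step d: optional, makes the value readable */
    drop CharDate;
run;
```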

60
Q

Which of the below is the correct syntax for the SUBSTR function?

a) SUBSTR(char, position <, length>)
b) SUBSTR(position, char)
c) SUBSTRNG(position, char, length)
d) SUBSTR(char, position, length)

A

a) SUBSTR(char, position <, length>)