2.1 Flashcards

1
Q

Which of the following can be done using a SAS data step?

a) read and write tables
b) filter rows
c) compute columns
d) conditionally process
e) subset columns
f) all of the above

A

f) all of the above

2
Q

True/False - In the compilation phase of a SAS DATA step, SAS prepares the code and establishes data attributes and the rules for execution.

A

True - In the execution phase, SAS follows these rules to read, manipulate, and write data.

3
Q

What is the first thing SAS does in the compilation phase of the DATA step?

a) Creates the PDV
b) Creates the output table metadata
c) Runs through your program to check for syntax errors
d) Reads the first row of data into the PDV

A

c) Runs through your program to check for syntax errors

4
Q

True/False - In the compilation phase of the SAS DATA step, SAS builds a critical area of memory called the Program Data Vector (PDV).

A

True

5
Q

Which of the following is not part of the SAS compilation phase?

a) PDV is used to hold and manipulate data
b) SAS establishes rules for the PDV based on your code, such as which columns will be dropped, or which rows from the input table will be read
c) SAS creates the descriptor portion, or table metadata
d) All of the above are part of the compilation phase

A

a) PDV is used to hold and manipulate data. This happens in the execution phase, not the compilation phase.

6
Q

Given the DATA step below, what will be the length of the Ocean column?

data storm_complete;
   set pg2.storm_summary_small;
   length Ocean $ 8;
   if substr(Basin, 2, 1) = "I" then Ocean="Indian";
   else if substr(Basin, 2, 1) = "A" then Ocean="Atlantic";
   else Ocean="Pacific";
run;

a) 8
b) 6
c) 32
d) the code will error

A

a) 8 - In this code, the LENGTH statement defines the character column Ocean with a length of 8. If the LENGTH statement were placed after the IF-THEN statements, SAS would have used the first assignment statement, Ocean="Indian", to define Ocean with a length of 6.
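
The same rule can be tried on a table that ships with SAS. The following is a minimal sketch, assuming the sashelp.class sample table is available; the output table work.age_group and the column Group are hypothetical names chosen for illustration.

```sas
/* Sketch only: work.age_group and Group are illustrative names. */
data work.age_group;
   set sashelp.class;
   length Group $ 8;                 /* length is fixed here, before any assignment */
   if Age < 13 then Group="Child";
   else Group="Teenager";            /* 8 characters, so the value is stored in full */
run;
```

If the LENGTH statement were removed, the first assignment, Group="Child", would define Group with a length of 5, and "Teenager" would be truncated to "Teena".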

7
Q

True/False - Any column in a DROP statement of a DATA step will not be added and read into the PDV.

A

False - The DROP statement does not remove a column from the PDV. Instead, SAS marks the column with a drop flag so that it’s dropped later in execution.
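
Because a dropped column stays in the PDV, it can still be used in calculations. The following is a minimal sketch, assuming the sashelp.class sample table (which includes Height and Weight) is available; work.class_ratio and Ratio are hypothetical names.

```sas
/* Sketch only: work.class_ratio and Ratio are illustrative names. */
data work.class_ratio;
   set sashelp.class;
   drop Height Weight;        /* flags the columns; they remain in the PDV */
   Ratio=Height/Weight;       /* still computed from the flagged columns */
run;
```

The output table should contain Ratio with non-missing values, but not Height or Weight.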

8
Q

Which of the following statements are ‘compile-time statements’ and are NOT executed for each row in a table during the execution phase of a SAS DATA step?

a) WHERE
b) LENGTH
c) FORMAT
d) DROP

A

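All four: a) WHERE, b) LENGTH, c) FORMAT, and d) DROP are compile-time statements. SAS establishes their rules during compilation and does not execute them as statements for each row.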

9
Q

True/False - You can use an explicit OUTPUT statement in the DATA step to force SAS to write the contents of the PDV to the output table at specific points in the program. There will still be an implicit OUTPUT at the end of the DATA step.

A

False - If you use an explicit OUTPUT statement anywhere in a DATA step, you have taken control of the output and there is no implicit OUTPUT at the conclusion of the DATA step. The implicit RETURN at the end of the DATA step still returns processing to the top of the DATA step.
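
A minimal sketch of taking control of the output, assuming the sashelp.class sample table is available; work.class_units and Unit are hypothetical names. Each OUTPUT statement writes the current contents of the PDV, so every input row produces two output rows.

```sas
/* Sketch only: work.class_units and Unit are illustrative names. */
data work.class_units;
   set sashelp.class;
   Unit="in";
   output;                    /* first row: Height as read from the input table */
   Unit="cm";
   Height=Height*2.54;
   output;                    /* second row: converted Height */
run;                          /* no implicit OUTPUT here, only the implicit RETURN */
```

Removing the second OUTPUT statement would leave only the "in" rows in the output table, because nothing after the first OUTPUT would be written.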

10
Q

Which DATA statement syntax drops the Returns column from the sales_high table and the Inventory column from the sales_low table?

a) data sales_high (drop Returns)
        sales_low (drop Inventory);
b) data sales_high (drop=Returns)
        sales_low (drop=Inventory);
c) data sales_high sales_low;
        drop Returns Inventory;
d) data sales_high / drop=Returns
        sales_low / drop=Inventory;

A

b) data sales_high (drop=Returns)
        sales_low (drop=Inventory);

11
Q

Which of the statements below is true?

a) When you use a DROP= or KEEP= data set option on a table in the SET statement, the excluded columns are not read into the PDV, so they are not available for processing.
b) When you use a DROP= or KEEP= data set option on a table in the DATA statement, the excluded columns are not read into the PDV, so they are not available for processing.

A

a) When you use a DROP= or KEEP= data set option on a table in the SET statement, the excluded columns are not read into the PDV, so they are not available for processing.

When you use a DROP or KEEP statement or a DROP= or KEEP= data set option in the DATA statement, columns are included in the PDV and CAN be used for processing. They are flagged to be dropped when an implicit or explicit OUTPUT is reached.
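
The difference between the two placements can be sketched as follows, assuming the sashelp.shoes sample table (which includes Sales, Inventory, and Returns) is available; work.shoes_net and NetSales are hypothetical names.

```sas
/* Sketch only: work.shoes_net and NetSales are illustrative names. */
data work.shoes_net (drop=Returns);      /* Returns is in the PDV, dropped at output */
   set sashelp.shoes (drop=Inventory);   /* Inventory is never read into the PDV */
   NetSales=Sales-Returns;               /* valid: Returns is available for processing */
run;
```

Referencing Inventory in this step would not read the input column. SAS would instead create a new, uninitialized Inventory column with missing values.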

12
Q

Which statement is false concerning the compilation phase of the DATA step?

a) Initial values are assigned to the columns.
b) The program data vector (PDV) is created.
c) The DATA step is checked for syntax errors.
d) The descriptor portion of the output table is created.

A

a) Initial values are assigned to the columns.

Initial values are assigned to columns at the beginning of the execution phase.

13
Q

Which statement is NOT a compile-time-only statement?

a) KEEP
b) LENGTH
c) SET
d) WHERE

A

c) SET

At execution time, the SET statement is processed to read data into the PDV. The compile-time statements of KEEP, LENGTH, and WHERE are not processed at execution time. The rules of these statements are processed in the compilation phase so that their impact will be observed in the output table.

14
Q

Which statement is true concerning the execution phase of the DATA step?

a) Data is processed in the program data vector (PDV)
b) An implied OUTPUT occurs at the top of the DATA step
c) An implied REINITIALIZE occurs at the bottom of the DATA step
d) Columns read from the input table are set to missing when SAS returns to the top of the DATA step.

A

a) Data is processed in the program data vector (PDV)

During execution, data manipulation occurs in the PDV. An implied OUTPUT and an implied RETURN (not REINITIALIZE) occur at the bottom of the DATA step. When SAS returns to the top of the DATA step, columns read from the input table are retained and computed columns are set to missing.

15
Q

True/False - The DATA step debugger in SAS Enterprise Guide can be used with DATA and PROC steps.

A

False - The DATA step debugger in SAS Enterprise Guide works only with DATA steps.

16
Q

Which PUTLOG statements create the following results in the SAS log?

Name=Alfred Height=69 Weight=112.5 Ratio=0.61 _ERROR_=0 _N_=1
Ratio=0.61

a) putlog all; putlog Ratio;
b) putlog all; putlog Ratio=;
c) putlog _all_; putlog Ratio;
d) putlog _all_; putlog Ratio=;

A

d) putlog _all_; putlog Ratio=;

_ALL_ is a keyword that writes all of the contents of the PDV, including the automatic columns _ERROR_ and _N_. Ratio= writes the column name, an equal sign, and the value of Ratio. Ratio alone writes only the value.
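
The three forms of PUTLOG can be compared side by side. This is a minimal sketch, assuming the sashelp.class sample table is available and that Ratio is computed as Height divided by Weight, rounded to two decimals (the original question does not show how Ratio is derived).

```sas
/* Sketch only: the Ratio formula is an assumption for illustration. */
data _null_;
   set sashelp.class (obs=1 keep=Name Height Weight);
   Ratio=round(Height/Weight, 0.01);
   putlog _all_;       /* every PDV column as name=value, plus _ERROR_ and _N_ */
   putlog Ratio=;      /* name, equal sign, and value */
   putlog Ratio;       /* value only */
run;
```

Each PUTLOG statement starts a new line in the log, so this step should write three lines for the single row it reads.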

17
Q

How many rows and columns are in the output table ShippingZones given the following information?

The input table Shipping contains 5 rows and 3 columns (Product, BoxSize, and Rate).

data ShippingZones;
   set Shipping;
   Zone=1;
   output;
   Zone=2;
   Rate=(Rate*1.5);
run;

a) 5 rows and 3 columns
b) 5 rows and 4 columns
c) 10 rows and 3 columns
d) 10 rows and 4 columns

A

b) 5 rows and 4 columns

The explicit OUTPUT statement writes the Zone=1 rows to the output table. There is no explicit OUTPUT statement after Zone=2, so those rows are never written. Because the step contains an explicit OUTPUT, there is no implicit OUTPUT at the bottom of the DATA step. The four columns are Product, BoxSize, Rate, and Zone.

18
Q

The sashelp.cars table contains 428 rows: 123 with Origin equal to Europe and 305 with Origin equal to other values.

data Europe Other;
   set sashelp.cars;
   if Origin='Europe' then
       output Europe;
   output Other;
run;

How many rows will be in the Other table?

a) 0 rows
b) 123 rows
c) 305 rows
d) 428 rows

A

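d) 428 rows

The OUTPUT Other statement is not part of the IF-THEN statement, so it executes for every row that is read, including the 123 rows where Origin is Europe.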
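
For contrast, here is a sketch of how the step could be written if Other were meant to receive only the non-Europe rows. It assumes the sashelp.cars sample table is available and changes only the second OUTPUT statement.

```sas
/* Sketch only: a variation on the step in the question. */
data Europe Other;
   set sashelp.cars;
   if Origin='Europe' then
      output Europe;
   else output Other;         /* runs only when the IF condition is false */
run;
```

Based on the counts given in the question, this version would write 123 rows to Europe and 305 rows to Other.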

19
Q

Which statement is false?

a) The KEEP statement names the columns to include from the input table.
b) The DROP statement names the columns to exclude from the output table.
c) The KEEP= option in the DATA statement names the columns to include in the output table.
d) The DROP= option in the SET statement names the columns to exclude from being read into the PDV.

A

a) The KEEP statement names the columns to include from the input table.

The KEEP statement controls which columns are in the output table.

20
Q

Which columns are in the final table work.boots?

data work.boots (drop=Product);
   set sashelp.shoes (keep=Product Subsidiary Sales Inventory);
   where Product='Boot';
   drop Sales Inventory;
   Total=sum(Sales, Inventory);
run;

a) Subsidiary
b) Subsidiary and Total
c) Product and Subsidiary
d) Product, Subsidiary, Sales, and Inventory

A

b) Subsidiary and Total

The column Subsidiary from the input table and the calculated column Total are in the final table. Product, Sales, and Inventory are dropped.

21
Q

What is the result of running the following DATA step?
data work.boots;
   set sashelp.shoes (keep=Product Subsidiary);
   where Product='Boot';
   NewSales=Sales*1.25;
run;

a) The step produces work.boots with three columns.
b) The step produces work.boots with four columns.
c) The step produces an error due to invalid syntax for the KEEP= option.
d) The step produces an error because the Sales column is not read in from the sashelp.shoes table.

A

b) The step produces work.boots with four columns.

The table work.boots is created with the columns of Product, Subsidiary, NewSales, and Sales. The values of NewSales and Sales are missing. Sales is uninitialized because the value was not read in from the input table.
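
One possible fix, sketched under the assumption that the intent was to compute NewSales from the input Sales values, is to add Sales to the KEEP= option in the SET statement so that it is read into the PDV.

```sas
/* Sketch only: assumes NewSales should be based on the Sales values in sashelp.shoes. */
data work.boots;
   set sashelp.shoes (keep=Product Subsidiary Sales);
   where Product='Boot';
   NewSales=Sales*1.25;       /* Sales is now read from the input table */
run;
```

The output table still has four columns (Product, Subsidiary, Sales, and NewSales), but Sales and NewSales should now hold values instead of being missing.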