Q: What is the difference between UNION and UNIONALL Operator in SAS?

Information From Dataset: A Information From Dataset: B The output of the UNION Operator UNION Operator is used to club the specific variable information from both the tables. Apart from updating the information, it also removes the duplicate data and sorts the combined information. The output of UNIONALL Operator UNIONALL Operator works by appending the information from dataset B after the information from dataset A. It doesn’t remove the duplicates or sorts the resultant data. Note: If the variables are existing in both the tables, but the order of the same is different. SAS throws the below error – ERROR: Column 1 from the first contributor of UNION is not the same type as its counterpart from the second. ERROR: Column 2 from the first contributor of UNION is not the same type as its counterpart from the second.

Q: Calculate the rank according to the marks in ascending order and if any marks value is repeated the rank should be the same for both. Dataset Rno Name Marks 1 A 50 2 B 49 3 C 51 4 D 52 5 E 52 Output Rno Name Marks Rank 2 B 49 1 1 A 50 2 3 C 51 3 4 D 52 4 5 E 52 4

Data test; input id sal; cards ; 1 300 1 200 3 900 2 300 2 400 3 700 2 100 3 800 ; run ; proc sort data =test out =test1; by id sal; run ; proc sql ; create table test2(drop=number) as select *, monotonic() as number from test1 group by id having number=max(number); quit ;

Q: Fetch 3 variables: height, weight, maxval(max of height or weight) with format=3. Dataset Height Weight 69 112.5 70 84 89.5 78 65.3 97.5 57.3 88 Output Height Weight Maxval 69 112.5 113 70 84 84 89.5 78 90 65.3 97.5 98 57.3 88 88

data test; input name$ weight; cards ; a 25 b 30 c 35 d 40 ; run ; proc sql ; select name, weight from test union all select 'Total' , sum(weight) from test; quit ;

Q: What will be the output of the code below?

data Test; input @1name $7. @9 sex $1. @10 marks; cards; Anshukr M123 MohitaS F32 ; run; proc print data=test; run; Answer Obs name sex marks 1 Anshukr M 123 2 MohitaS F 32

Question 1

Find the number of patients with daig code (D45) and calculate the age of each patient between 01st Jan 2018 to 31 March 2019 from the table given below.

Table Name – Clinical

OBS	IDs	YOB	Date	diag1	daig2	daig3	daig4	daig5
1	100	2-Mar-86	2-Mar-18	D45	0	0	D45	0
2	100	2-Mar-86	5-Mar-18	0	D45	0	0	D45
3	100	2-Mar-86	8-Mar-18	0	C50	0	C50	D45
4	101	5-Jan-80	8-Jan-18	0	0	D45	D45	0
5	101	5-Jan-80	2-Feb-18	D45	0	0	D45	0
6	101	5-Jan-80	27-Feb-18	0	D45	0	0	D45
7	101	5-Jan-80	24-Mar-18	D45	0	D45	0	0
8	101	5-Jan-80	18-Apr-18	D45	0	0	D45	0
9	102	4-Jan-85	18-Jan-17	0	D45	0	0	D45
10	102	4-Jan-85	25-Jan-18	D45	0	0	D45	0
11	102	4-Jan-85	1-Feb-18	0	D45	0	0	D45
12	102	4-Jan-85	8-Jan-18	0	0	D45	D45	0
13	102	4-Jan-85	15-Feb-19	0	0	D45	D45	0
14	103	16-Jan-90	15-Feb-18	0	D45	0	0	D45
15	103	16-Jan-90	25-Feb-18	D45	0	0	D45	0
16	103	16-Jan-90	7-Mar-18	0	D45	0	0	D45
17	103	16-Jan-90	17-Mar-18	0	0	D45	D45	0
18	104	25-Jan-92	27-Feb-18	0	0	D45	D45	0
19	104	25-Jan-92	24-Mar-18	0	D45	0	0	D45
20	104	25-Jan-92	18-Apr-18	D45	0	0	D45	0

Accepted Answer

Suppose we have the above clinical table in the form of SAS Dataset in lib – ‘mylib’. Considering the fact, the following code will give the desired results.

Data Clinical_Analysis;
Set mylib.clinical;
If '01JAN2018'D <= Data <= '31MAR2019'D ;
array diag(*) $diag1 - diag5;
do i=1 to 5;
if diag(i) in ('D45') then f_diag =1;
end;
run;
proc sort data = Clinical_Analysis; by IDs date; run;
Data diag_pat (keep = IDs YOB Date f_diag first_date diag age);
set Clinical_Analysis;
by Ids;
format first_date date9.;
retain diag first_date age;
if first.Ids then do;
diag=0;
first_date =date;
age =0;
end;
age = Year(first_date) - yob;
if f_diag =1 then diag=1;
if last.Ids;
run;
Proc freq data = diag_pat;
title "All required counts";
tables diag age/missing list;
run;

output:

Dataset of the code – Table: diag_pat

OBS	IDs	YOB	Date	f_diag	first_date	diag	Age
1	100	2-Mar-86	8-Mar-18	1	2-Mar-18	1	32
4	101	5-Jan-80	18-Apr-18	1	8-Jan-18	1	38
9	102	4-Jan-85	15-Feb-19	1	18-Jan-17	1	34
14	103	16-Jan-90	17-Mar-18	1	15-Feb-18	1	28
18	104	25-Jan-92	18-Apr-18	1	27-Feb-18	1	26

Question 2

How can we create an index?

Accepted Answer

This is a frequently asked question in SAS interview questions for experienced.

Format for creating a simple index -

Data  (INDEX = ())
Creating a single index in one go
Example : data a(index=(seq /unique /nomiss));
                               Set a;
  Run;
Creating multiple indexes in one go
Example: data a(index =(seq /unique /nomiss
                                                      State /unique
                                                       Year /unique /nomiss)
                              Set a;
 Run;
Format for creating a composite index-
Data  (INDEX = (index-name = ))
              Creating a composite index –
Example – data a(index=(quarter_month = (quarter month) /nomiss)
                               Set a;
                               Run;

Note: We can also use the proc’s to create the indexes in the newly generated output datasets.

Additionally, we could also use the proc datasets procedure and proc SQL to generate indexes.

Question 3

What is the difference between UNION and UNIONALL Operator in SAS?

Accepted Answer

Information From Dataset: A

UNIONALL Operator in SAS

Information From Dataset: B

UNIONALL Operator in SAS

The output of the UNION Operator

UNION Operator is used to club the specific variable information from both the tables. Apart from updating the information, it also removes the duplicate data and sorts the combined information.

output of the UNION Operator

The output of UNIONALL Operator

UNIONALL Operator works by appending the information from dataset B after the information from dataset A. It doesn’t remove the duplicates or sorts the resultant data.

output of the UNION Operator

Note: If the variables are existing in both the tables, but the order of the same is different. SAS throws the below error –

ERROR: Column 1 from the first contributor of UNION is not the same type as its counterpart from the second.
ERROR: Column 2 from the first contributor of UNION is not the same type as its counterpart from the second.

Question 4

What are the different ways to create a table in proc SQL?

Accepted Answer

Expect to come across this popular question in SAS logical interview questions.

There are multiple ways to create table in proc SQL:

I. Create Table From Column Definition

We can create a new table without rows by using the CREATE TABLE statement to define the columns and attributes. We can specify a column’s name, type, length, informat, format and label.

procsql;
createtable employee(
ssn char(9),
dob numformat = DATE9.informat = DATE9.label = "Date Of Birth",
doh numformat = DATE9.informat = DATE9.label = "Date Of Hire",
gender char(1) label = "Gender",
estat char(10) label = "Employement Status");
insertinto employee values ("111111111",500,25000,"M","ACTIVE");
quit;
procsql;
select * from employee;
quit;

Create Table From Column Definition

II. Create Table from a Query Result

Many times, there are situations wherein we already have a structure of the table/view and we are trying to create a mirror table that is referring to columns of the earlier existing table. In such cases, we can use the same CREATE TABLE Statement and place it before the SELECT Statement. An additional benefit of using this approach of creating the table is that the data of the new table is derived on the basis of existing data from the parent table.

procsql;
createtable employee1 as
select * from employee
quit;
procsql;
select * from employee1;
quit;

Create Table from a Query Result

III. Create Table Like an Existing Table

To create an empty table that has the same columns and attributes as an existing table, we use the LIKE clause while creating the table. One point to note is that an empty table is created using this method and no data is populated inside the table.

procsql;

createtable employee3 like employee;

quit;

Question 5

Write code to connect the Teradata.

Accepted Answer

A must-know for anyone heading into a SAS interview, this question is frequently asked in tricky SAS interview questions.

Connecting to teradata using pass through

Procsql;
Connect to Teradata( USER= “ &user_id.” PASSWORD =” &dbpass”
tdpid =rchgrp2 MODE= ‘teardata’);
Create table work. Account_details as(Select
Account_ Number, Enroll_ Date, Sales, City
From Master_ acct_ details
Where Account_ Number is not null and Sales gt 40000
);
Disconnect from Teradata;
Quit;
Connecting to db2 using pass through:
Procsql;
Connect to db2(  DATABASE=CRD_ PROD USER= “ &user_id.”
PASSWORD=” &dbpass”);
Create table work. Account_details as(Select
Account_ Number, Enroll_ Date, Sales, City
From Master_ acct_ details
Where Account_ Number is not null and Sales gt 40000
);
Disconnect from db2;
Quit;

Other way using the library assignments:

Libname AM DB2 DATABASE= CARD_USA Schema=DDKAUTO

USER=”&user_ id.” PASSWORD= “&dbpass”;
Procsql;

Create table work. Account_details as(Select Account_ Number, Enroll_ Date, Sales, City From AM.Master_ acct_ details Where Account_ Number is not null and Sales gt 40000);

Question 6

Calculate the rank according to the marks in ascending order and if any marks value is repeated the rank should be the same for both.

Dataset

Rno	Name	Marks
1	A	50
2	B	49
3	C	51
4	D	52
5	E	52

Output

Rno	Name	Marks	Rank
2	B	49	1
1	A	50	2
3	C	51	3
4	D	52	4
5	E	52	4

Accepted Answer

Data test;
input id sal;
cards;
1 300
1 200
3 900
2 300
2 400
3 700
2 100
3 800
;
run;
proc sort data =test out=test1;
by id sal;
run;
proc sql;
create table test2(drop=number) as
select *, monotonic() as number from test1
group by id
having number=max(number);
quit;

Question 7

Fetch 3 variables: height, weight, maxval(max of height or weight) with format=3.
Dataset

Height	Weight
69	112.5
70	84
89.5	78
65.3	97.5
57.3	88

Output

Height	Weight	Maxval
69	112.5	113
70	84	84
89.5	78	90
65.3	97.5	98
57.3	88	88

Accepted Answer

data test;
input name$ weight;
cards;
a 25
b 30
c 35
d 40
;
run;
proc sql;
select name, weight
from test
union all
select 'Total', sum(weight)
from test;
quit;

Question 8

What is PAD option and how it is used?

Accepted Answer

The PAD option can be used to read the source data having the variable length records, so to read the variable length of records we can use the PAD option, the PAD option can be used with the infile

statement;

example

data Test;
infile 'C:\Users\Anshu\Desktop	est.txt' PAD;
input @1name $6. @10 sex $1. @12 marks 3. @14 salary 8.;
run;
proc print data=Test;
run;

Question 9

What is Macro and how it can be used?

Accepted Answer

It's no surprise that this one pops up often in SAS Macro interview questions.

Macro is a way of automating in SAS programs. There are many ways to create macro.

%macro get_var_dist (data_nm , categ_cut_off);
/* Taking list of variables in the dataset along with type*/
proc contents data=&data_nm. out=_content_;
run;
/* Note - how data_nm is getting resolved. */

/* Fiding number of variables to define loop */
proc sql noprint;
select count(*) into: nvars from _content_;
quit;
/* Note this is another way of defining macro variables */
%put "No of Variables is  " &nvars. ;
/* Note this can be a useful way to detect the value of macro variables.*/
%do i=1 %to &nvars. ;
data _null_;
set _content_;
if _n_ = &i. then do;
       call symput("variable", name);
       call symput("var_type",type);
end;
run;
/* Note -
  1: One more way of defining variables.
  Please note here information is being passed from general SAS dataset
  to SAS macros using call symput
  2: Please see how to run do a loop
*/
%put "Variable is  " &variable. ;
%put "Variable Type is  " &var_type. ;
proc sql noprint;
       select count(distinct &variable) into: distinct_categories
       from &data_nm.;
       quit;
%put "Number of distinct categories is  &distinct_categories." ;
/* Note the syntax of writing if else condition */
/* %do ....  %end; defines the block */
/* Missing command below includes missing value as a category */
/* Also note, if you don't use %eval it can throw errors */
/* if comparison involves floating numbers, you will need to use %sasevalf */
%if %eval(&distinct_categories.) <=  %eval(&categ_cut_off.) %then %do;
Title "Freq Dist for Variable &variable. as number of distinct
category is &distinct_categories.";  
proc freq data=&data_nm. ;
table &variable. / missing ;
run;
%end;
%if %eval(&var_type.) =  1 %then %do;
Title "Univariate Dist for Variable = &variable. ";  
proc univariate data=&data_nm. plot;
var &variable.  ;
run;
%end;

%end;
%mend get_var_dist;
options symbolgen mlogic mprint;
%get_var_dist(data_nm=sashelp.heart, categ_cut_off=25);
%get_var_dist(data_nm=sashelp.asscmgr, categ_cut_off=10);

Question 10

How does SAS Processing done explain briefly?

Accepted Answer

The SAS processing is in two steps

Compilation phase
Execution phase

Compilation phase
- If there is a raw data to be read, there is a creation of input buffer a logical concept for holding the data, the input buffer is not created while reading a dataset.
- After the input buffer, the pdv is created, it is the area of memory where SAS builds its dataset.
- Creating two automatic variables 1. _n_ (takes the value of a number of iterations) and 2. _error_ (takes the value 1 or 0, 1 if errors and 0 if no errors)
- The SAS then do the syntax checking for the data steps, that includes misspelled keywords, invalid varnames, missing punctuation, and invalid option.
- Data set variables are created as the input statement is encountered and their length and type are determined.
- The descriptor portion is now written and this is the final step in the compilation phase, the run statement is the signal for the end of compilation phase. At this point, the data set contain the variables mentioned in the input statement, but the values would only be written in the execution phase.
Execution phase
- The automatic variables are initialized _n_ takes the value 1, _error_ to 0 and the remaining variables to missing (.)
- The input pointer now reads the data values and assigns the variable values to the pdv.
- At the end of the data step the data values from the pdv are written to the dataset and then _n_ takes the value as 2, _error_ 0 and all rest as missing(.).

Question 11

what is the difference between @@ and @?

Accepted Answer

The use of double trailing is that, if you have datalines in one line then you can use @@;

data test1;
input name $ age marks @@;
datalines;
am 12 95 ra 13 67 pa 14 98 sa 14 87
;
run;
proc print data=test1;
run;

Output

Rno	Name	Marks
1	A	50
2	B	49
3	C	51
4	D	52
5	E	52

The use of single @ is in the holding the line, it basically is used in more than one input statement, the pointer basically hold the line after reading the values, it basically holds the value and checks the condition;

data test2;
input @1 gender $1. @;
if gender ne 'F' then delete;
input @3 age @5 marks;
datalines;
F 13 56
M 12 78
F 13 78
M 56 90
;
run;
proc print data=test2;
run;

Output

Obs	Gender	age	Marks
1	F	13	56
2	F	13	78

Question 12

What will be the output of the code below?

Accepted Answer

data Test;
input @1name $7. @9 sex $1. @10 marks;
cards;
Anshukr M123
MohitaS F32
;
run;
proc print data=test;
run;

Answer

Height	Weight
69	112.5
70	84
89.5	78
65.3	97.5
57.3	88

Question 13

What is PAD option and how it is used?

Accepted Answer

The PAD option can be used to read the source data having the variable length records, so to read the variable length of records we can use the PAD option, the PAD option can be used with the infile

statement;

example

data Test;
infile 'C:\Users\Anshu\Desktop	est.txt' PAD;
input @1name $6. @10 sex $1. @12 marks 3. @14 salary 8.;
run;
proc print data=Test;
run;

Question 14

For what purpose would you use the RETAIN statement?

Accepted Answer

A common question in SAS scenario based interview questions, don't miss this one.

RETAIN statement causes a variable that is defined during the input statement to be retained across data set iterations, in other words, retain statement retains the variable value at each iteration and keeps on substituting it, the pdv does not get cleared when you use the retain statement, let’s try to understand the basic example explained below –

Table - mylib.clinical

OBS	IDs	YOB	Date	diag1	daig2	daig3	daig4	daig5
1	100	2-Mar-86	2-Mar-18	D45	0	0	D45	0
2	100	2-Mar-86	5-Mar-18	0	D45	0	0	D45
3	100	2-Mar-86	8-Mar-18	0	C50		C50	D45
4	101	5-Jan-80	8-Jan-18	0	0	D45	D45	0
5	101	5-Jan-80	2-Feb-18	D45	0	0	D45	0
6	101	5-Jan-80	27-Feb-18	0	D45	0	0	D45
7	101	5-Jan-80	24-Mar-18	D45	0	D45	0	0
8	101	5-Jan-80	18-Apr-18	D45	0	0	D45	0

Suppose we have the above clinical table in the form of SAS Dataset in lib – ‘mylib’. Considering the fact the following code will give the desired results.

Data Clinical_Analysis;
Set mylib.clinical;
If '01JAN2018'D <= Date <= '31MAR2019'D ;
array diag(*) $diag1 - diag5;
do i=1 to 5;
if diag(i) in ('D45') then f_diag =1;
end;
run;
proc sort data = Clinical_Analysis; by IDs date; run;
Data diag_pat (keep = IDs YOB Date f_diag first_date diag age);
set Clinical_Analysis;
by Ids;
format first_date date9.;

retain diag first_date age; /*retaining diag first_date and age at every iteration */

if first. Ids then do;
diag=0;
first_date =date;
age =0;
end;
age = Year(first_date) - yob;
if f_diag =1 then diag=1;
if last.Ids;
run;
Proc freq data = diag_pat;
title "All required counts";
tables diag age/missing list;
run;

output:

Dataset of the code – Table: diag_pat

OBS	IDs	YOB	Date	f_diag	first_date	diag	Age
1	100	2-Mar-86	8-Mar-18	1	2-Mar-18	1	32
4	101	5-Jan-80	18-Apr-18	1	8-Jan-18	1	38

Question 15

What is the difference between calculating the 'mean' using the mean function and PROC MEANS?

Accepted Answer

The mean() function is used to calculate the mean of one observation across the variable. It is used to generate a horizontal data summary.

DATA A;
INPUT NAME$ SUBJ1 SUBJ2 SUBJ3;
CARDS;
A 95 96 99
B 85 90 70
C 60 70 80
;
RUN;
DATA B;
SET A;
MEAN = MEAN(SUBJ1,SUBJ2,SUBJ3);
RUN;
PROCPRINT;
RUN;

Proc Means is used to calculate the mean of a variable across all observations.

It is used for generating the vertical data summary (i.e. mode of operation is column-wise).

SAS System

By default, it calculates N, Mean, Std Dev, Minimum, Maximum statistics.

Means procedure

Question 16

How will clean the data having multiple delimited values of a variable in SAS like if data values are “ab,c,*&%$ ~d” and you want to make it like “abcd”?

Accepted Answer

In case you want to delete multiple delimiters in one go, you need to put all delimiters in the second argument in any order and use the SAS function compress.

data clean; 
x='ab,c,*&%$ ~d';
 y=compress(x,'~$%&* ,'); run;
proc print data=clean;
run;

Output –

Obs  x    y
1  ab,c,*&%$ ~d    abcd

Question 17

How agecat classifications created?

Accepted Answer

It's no surprise that this one pops up often in advanced SAS interview questions.

Numeric Format ag’ is created, in the work library (is a temporary format) with the specified ranges:

0 to less than 30 coded as ‘young’
30 to less than 61 coded as ‘old’
Format ‘ag’ is called & applied with ‘age’

 Proc format lib=work;
Value ag
0-<30=’young’
30-<61=’old’;
Run;
Proc print data=sasuser.admit;
Format age ag.;

Question 18

How to connect database?

Accepted Answer

Procsql;
Connect to Teradata( USER= “ &user_id.” PASSWORD =” &dbpass”
tdpid =rchgrp2 MODE= ‘teardata’);

Create table work. Account_details as (Select Account_ Number, Enroll_ Date, Sales , City From Master_ acct_ details Where Account_ Number is not null and Sales gt 40000);

Disconnect from Teradata;
Quit;

Question 19

Write a macro that compares the creation date of all the datasets in two libraries and report in RTF format the libname, dataset name, created date and action for a user for both libraries the sasuser and the work?

Accepted Answer

The options nofmterr mprint mlogic symbolgen;

%macro datecompare;
proc format;
value $bckcolor
'Dates match No action'='green'
 'PLZ CHECK'=red;
run;
%macro a;
%do i=1 %to 5;
data sasuser.a&i;
set sasuser.admit;
run;
proc copy in =sasuser out=work memtype=data;
select a&i;
run;
%end;
%mend a;
%a;
proc sql noprint;
select count(distinct memname) into:cntwork from dictionary.columns where upcase(libname)
='WORK';
quit;
%put &cntwork;
proc sql noprint;
select count(distinct memname) into: cntuser from dictionary.columns where upcase(libname)
='SASUSER' and upcase(memname) in ('A1','A2','A3','A4','A5');
quit;
%put &cntuser;
proc sql;
create table wrk as select libname,memname,crdate from dictionary.tables where upcase(libname)
='WORK' and upcase(memname) in ('A1','A2','A3','A4','A5');
quit;
proc sql;
create table suser as select libname,memname,crdate from dictionary.tables where upcase(libname)
='SASUSER' and  upcase(memname) in ('A1','A2','A3','A4','A5');
quit;
proc sort data=wrk (rename=(crdate=crdate1 libname=libname1));
by memname;
run;
proc sort data=suser;
by memname;
run;
data final;
merge wrk(in=a) suser(in=b);
by memname;
if a and b;
run;
data new;
set final;
if left(trim(crdate))=left(trim(crdate1)) then action='Dates match No action';
else action='PLZ CHECK';
run;
proc sql;
create table new as select memname,libname,crdate,libname1,crdate1,action from new;
quit;
ods listing close;
ods rtf file='\path 
ew.rtf';
proc report data=new nowd headline headskip split='*';
Title1 j = center font = arial h = 4 color = blue "Creation Date comparision for";
Title2 j = center font = arial h = 4 color = blue "datasets of library sasuser and work";
columns memname libname crdate libname1 crdate1 action;
define memname/display width=9 spacing=3 center;
define libname/display width=2 spacing=3 center;
define crdate/display width=2 spacing=3 center;
define libname1/display width=3 spacing=3 center;
define crdate1/display width=9 spacing=3 center;
define action/display width=9 spacing=3 center style ={background =$bckcolor.};;
run;
ods rtf close;
ods listing;
%mend datecompare;
%datecompare;

Question 20

How to reverse a particular text in sas?

Accepted Answer

The tranwrd function helps in the replacement of a string in a char variable. We can reverse the text/values in SAS using tranward function. It has many functions like reversing, replacing etc.

For example –

Converting the multiple occurence of a string;
data x; input name$; datalines; highhigh high cc ; run;
data a; set x; name=tranwrd(name,'high','hi'); run;
proc print data=a; run;

output –

Obs name
1 hihi
2 hi
3 cc

Question 21

What are the different ways to handle a case in sas?

Accepted Answer

In SAS, we can convert text into any case like LOWCASE, UPCASE and PROCASE.

LOWCASE – It converts the string into small letters or in lower case.

data lower; 
x='SAS'; 
y=lowcase(x); 
x=lowcase('CHAMP');
 run;
proc print data=lower; run;

output –

Obs     x   y
1       champ sas

UPCASE – It converts the string into capital letters or in upper case.

data uper; 
x='sAs'; 
y=upcase(x); 
x=upcase('ChaMP');
 run;
proc print data=uper; run;

output –

Obs     x   y
1       CHAMP SAS

PROCASE – It converts the string into proper letters or in other words we can say the first letter upper/upcase and rest in lowercase. Example-

data procase; 
x='sAs'; 
y=procase(x); 
x=procase('ChaMP');
 run;
proc print data=procase; run;

output –

Obs     x   y
1       Champ Sas

Question 22

Write a macro that shows the use of symget in an if condition.

Accepted Answer

Let suppose we have a table admit in sasuser library and we have to fetch the student who has age greater than the average age then this can be code like the one below using symget.

%macro ch; 
proc sql;                                                                                                                    select avg(age) into:avg_age from sasuser.admit;                                                                              quit;                                                                                                                          data x;                                                                                                                        set sasuser.admit;                                                                                                            ageif=symget('avg_age');                                                                                                      
if age gt ageif;                                                                                                              run;                                                                                                                          %mend ch;                                                                                                                      %ch;

Question 23

How to create a numeric date if month Year and day is given through SAS?

Accepted Answer

The mdy function creates a numeric date from the values of the month day and year.

data date_data; 
input name $ month year day; 
datalines;
 ram 10 1981 13 
Geeta 04 1982 20 ; 
run;
data date_date1; 
set date_data;
 attrib bdy format=date9.; 
bdy=mdy(month,day,year); 
run;
proc print data= date_date1; run;

Output –

Obs name  month year   day bdy
1     ram 10      1981 13 13OCT1981
2    geeta  4 1982    20 20APR1982

Question 24

How to concatenate multiple variables in a single variable through SAS?

Accepted Answer

we can use CATX function for it. CATX function with a series, Let’s say you want to concatenate the value of a variable;

data a; 
sp='|'; 
x1='a';
 x2='b'; 
x3='c'; 
string=catx(sp,of x1-x3); 
run; 
proc print data=a; run;

output –

Obs     sp     x1 x2         x3 string
1           |   a b           c a|b|c

Question 25

What is N function in sas?

Accepted Answer

A common question in SAS interview questions for experienced, don't miss this one.

The N function counts the number of non-missing values in a row

data a; 
x1=2; 
x2=3; 
x3=4;
nvars_nonmiss=n(of x1-x3); 
put nvars_nonmiss=; 
run;
proc print data = a; run;

Output –

Obs    x1 x2    x3 nvars_nonmiss
        1       2 3     4 3

Question 26

How to count missing value in SAS?

Accepted Answer

One of the most frequently posed SAS logical interview questions, be ready for it.

We can count missing value in sas using NMISS function.

For example –

data a; x1=2; x2=3; x3=.; 
nvars_miss=nmiss(of x1-x3); 
put nvars_miss=;
 run;
proc print data = a; 
run;

Question 27

What is reverse function in sas?

Accepted Answer

The reverse function just reverse the string, if there are leading blanks they become trailing.

data a; 
x='abcdef'; 
change=reverse(x); 
put change;
 run;

Output –

fedcba

Question 28

What is intnx function and how powerful it is?

Accepted Answer

The INTNX function increments day, year or month specified a date by the parameters given. It is a very helpful function present in sas. Few examples below can show you how -

Here it increments the date by 6 weeks, means date after 6 weeks including current week.

data _null_; 
x=intnx('week', '23mar2012'd, 2); 
put x date9.; 
run; 
output - 01APR2012 
The intnx function increments the day, year or month on the specified date, you can use put as well; 
data amit; 
x='13oct1981'd; 
z=put(intnx('month',x,3,'e'),date9.); 
run; 
proc print data=amit; 
run;

output -

Obs x z

1 7956 31JAN1982

The intnx function increments the day, year or month on the specified date, you can use the same argument for incrementing 1 month from the date;

data amit; 
x='13oct1981'd; 
z=put(intnx('month',x,1,'same'),date9.); 
put z=; 
run;
Output - z=13NOV1981

Question 29

What is a monotonic function and how useful is it?

Accepted Answer

The MONOTONIC function is quite similar to the internal variable _N_ in the DATA Step. We can use it to select the records according to their row number. For example, we choose the SSNs from the 501st line to the 888th line in the SSN dataset.

 proc sql;

select * from ssn_data where monotonic() between 501 and 800 ;

quit;

you can also the monotonic() to subset the dataset in the where clause to limit the number of records read/outputted to the final dataset…

view source

print?

proc sql;
create table class2 as
select monotonic() as rowno, *
from sashelp.class
where 10 le monotonic() le 20;
quit;

Question 30

What are the different joins in proc SQL and unlike data step why SQL don’t require sorting before joins?

Features of SQL join

Needs a Primary key
Primary key variable and any other variable can be different (doesn’t require to be renamed like in Base SAS)
Doesn’t require sorting
Does not overwrite common variables between tables

Why SQL doesn’t require sorting?

Accepted Answer

Because of Cartesian product - Cartesian product result-set contains the number of rows in the first table, multiplied by the number of rows in the second table.

Question 31

Code a PROC MEANS that shows both summed and averaged output of the data.

Accepted Answer

data a;
input name $ 1. marks;
cards;
a 100
b 200
c 300
;
run;
procmeansdata = a summean;
run;

Output

Question 32

What is the result for the below code?

data new_emp;
  drop i;
  do i=1 to 5;
     input name $ idno status $;
        /* return to top of loop  */
        /* when condition is true */
     if status='PT' then continue;
        input benefits $10.;
        output;
  end;
  datalines;
Ansh 9011 PT
Tushar 876 PT
Richa 1002 FT
Eye/Dental
Kelly 85111 PT
Simmi 433 FT
HMO
;
run;
proc print data = new_emp; run;

Accepted Answer

One of the most frequently posed SAS interview questions, be ready for it.

In this code, we are giving a data or you can say creating a table using ‘datalines’ but printing the rows which only have status as ‘PT’. We have used loop here to check each row and thus printing the desired results.

Obs	Name	idno	status	Benefits
1	Richa	1002	FT	Eye/Dental
2	Simmi	433	FT	HMO

Question 33

what will be the output of the code below

data a;
a=100000;
i=0.0925;
year=0;
do until (year=5);
a=a+a*i;
i=i+i*0.10;
year+1;
output;
end;
run;
proc print data = a; run;

Accepted Answer

Here in this code we have A (Amount) = 100000 and I (interest) = 0.0925. We need to find Amount and updated interest till five years also the interest increases by 10% every year.

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Question 34

what will be the output of the code below?

data Test;
input name$ age dob date9.;
datalines;
am 67 13oct1981
pr 66 10apr1982
;
run;
proc print data= Test;
format dob worddate14.;
run;

Accepted Answer

This code is only an illustration to create a table declares variables name, age, and dob with formats and taking values using datalines.

input name$ age dob date9.; this statement basically creating variables and giving format to dob as date9.

OBS	IDs	YOB	Date	diag1	daig2	daig3	daig4	daig5
1	100	2-Mar-86	2-Mar-18	D45	0	0	D45	0
2	100	2-Mar-86	5-Mar-18	0	D45	0	0	D45
3	100	2-Mar-86	8-Mar-18	0	C50	0	C50	D45
4	101	5-Jan-80	8-Jan-18	0	0	D45	D45	0
5	101	5-Jan-80	2-Feb-18	D45	0	0	D45	0
6	101	5-Jan-80	27-Feb-18	0	D45	0	0	D45
7	101	5-Jan-80	24-Mar-18	D45	0	D45	0	0
8	101	5-Jan-80	18-Apr-18	D45	0	0	D45	0
9	102	4-Jan-85	18-Jan-17	0	D45	0	0	D45
10	102	4-Jan-85	25-Jan-18	D45	0	0	D45	0
11	102	4-Jan-85	1-Feb-18	0	D45	0	0	D45
12	102	4-Jan-85	8-Jan-18	0	0	D45	D45	0
13	102	4-Jan-85	15-Feb-19	0	0	D45	D45	0
14	103	16-Jan-90	15-Feb-18	0	D45	0	0	D45
15	103	16-Jan-90	25-Feb-18	D45	0	0	D45	0
16	103	16-Jan-90	7-Mar-18	0	D45	0	0	D45
17	103	16-Jan-90	17-Mar-18	0	0	D45	D45	0
18	104	25-Jan-92	27-Feb-18	0	0	D45	D45	0
19	104	25-Jan-92	24-Mar-18	0	D45	0	0	D45
20	104	25-Jan-92	18-Apr-18	D45	0	0	D45	0

Question 35

What will be the output?

data test;
array item(j) $ 12 x1-x12;
do j=1 to 12;
put item;
end;
run;
proc print data = test; run;

Accepted Answer

A staple in SAS logical interview questions, be prepared to answer this one.

Obs	J	x1	x2	x3	x4	x5	x6	x7	x8	x9	x10	x11	x12
1	13

Question 36

How you can clear any library?

Accepted Answer

This question is a regular feature in tricky SAS interview questions, be ready to tackle it.

We can clear the libref through one of the below approaches:-

Using Libname Statement:-

We can use the below syntax to disassociate a specific library

Syntax : LIBNAME libref

Example : Libname temp_lib clear where temp_lib is the libref that you want to deassign If, we would like to disassociate all the libref that we have declared in a SAS Session, then use the below-modified version

 LIBNAME _ALL_ 
        Example: Libname _ALL_ clear

Using the Libname Function : -

We can use the LIBNAME function to deassign a library reference. In case, a warning or an error occurs then a message is written to the SAS log.

   %if (%sysfunc(libname(temp_lib))) %then
  %put %sysfunc(sysmsg());

where temp_lib is the libref that you want to deassign.

Note:- In both the above approaches libref is cleared if and only if there are no open files that are associated from that libref.

Using SAS Explorer Window:-

On the Explorer window go to libraries. Right click on the library that you want to design and select delete. Once you click on Delete you will receive a Delete confirmation with the text "Are you sure you want to remove the library reference named temp_lib?". Once you click Ok, the same is going to get deleted.

Question 37

what will be the output of the code below

data test;                                                                                                                            
do while (i< 5);                                                                                        
x+1;                                   
i+1;                                                                                                                    
output;                                                                                                                                 
end;                                                                                                                                    
run;
proc print data = test; run;

Accepted Answer

In this code, we are trying to show the characteristics of ‘do while’ loop. For each iteration, it’s increasing the counter and writing in the output dataset. Please refer the output table below -

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Question 38

What is the output of the code?

data a;
input id val;
datalines;
1 5
;
run;
data b;
input id na;
datalines;
1 478
1 300
1 523
1 326
1 547
2 35
3 58
;
run;
data ab;
merge a(in=a) b(in=b);
by id;
if a;
run;
proc print data=ab noobs;
run;

Accepted Answer

id	val	Na
1	5	478
1	5	300
1	5	523
1	5	326
1	5	547

Question 39

what is the output of the entire code below? What kind of merge is this?

data test;                                                                                                                        
input pt visit ae $; 
datalines;  
001 1 headache                                                                                                                          
001 2 kill                                                                                                                              
001 3 pain                                                                                                                              
001 4 knee                                                                                                                              
002 1 tick
003 2 flee                                                                                                                              
;                                                                                                                                       
run;                                                                                                                                    
data test1;                                                                                                                             
input pt visit ae $;                                                                                                                    
datalines;                                                                                                                              
001 5 death  
001 6 vomit                                                                                                                           
002 2 hitch
003 1 ike                                                                                                                               
;                                                                                                                                       
proc sort data= test out= test;                                                                                                         
by pt visit;                                                                                                                                  
run;                                                                                                                                    
proc sort data=test1 out=test1;                                                                                                       
by pt visit;                                                                                                                                  
run;                                                                                                                                    
data merged;
merge test(in=a) test1(in=b);
by pt visit;
if a;
run;
proc print data=merged;
run;

Accepted Answer

Output of the code

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Question 40

What is the difference between SET and Merge?

Accepted Answer

SET and Merge are two very vital statements in SAS. Let us understand by taking 2 tables ‘one’ and ‘two’ and perform SET and Merge Operation on them.

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Code -

DATA two;

INPUT id v3 v4;

DATALINES;

1 1000 10000

2 1500 15000

3 2000 20000

4 800 30000

;

PROC SORT Data=two;

BY id;

RUN;

Output

Obs	id	v3	v4
1	1	1000	10000
2	2	1500	15000
3	3	2000	20000
4	4	800	30000

The SET statement is the main tool to read observations from a SAS data set. The basics are so obvious you might not even consider them. Just SET and a data set name and there you go. Through SET statement the dataset gets appended while with merge the two datasets get merged on the basis of ID and extra variables are created if needed. Example below.

Code -

DATA TestSet;

Set one two;

RUN;

Output

Obs	id	v1	v2	v3	v4
1	1	10	100	.	.
2	2	15	150	.	.
3	3	20	200	.	.
4	1	.	.	1000	10000
5	2	.	.	1500	15000
6	3	.	.	2000	20000
7	4	.	.	800	30000

Code -

DATA TestMerge;

MERGE one two;

BY id;

PROC PRINT DATA= TestMerge; title "TestMerge";

RUN;

Output

Obs	id	v1	v2	v3	v4
1	1	10	100	1000	10000
2	2	15	150	1500	15000
3	3	20	200	2000	20000
4	4	.	.	800	30000

Question 41

How would you code a merge that will keep only the observations that have matches from both sets?

Accepted Answer

data a;
input name $ 1. age;
cards;
a 12
b 15
c 18
;
run;
procprint; run;

data b;
input name $1. marks;
cards;
a 200
c 150
;
run;
procprint; run;

procsortdata = a;
by name;
run;
procsortdata = b;
by name;
run;
data final;
merge a(in=x) b(in=y);
by name;
if x and y thenoutput;
run;
proc print;
run;

Question 42

Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.

Accepted Answer

When we submit the DATA step for execution. SAS checks the syntax of the submitted code and compiles them. During the compile phase, SAS created 3 items:

Input buffer – is a logic memory area into which SAS reads each record of raw data. This buffer is not created in a case wherein the operations are performed on an already existing dataset.
Program data vector (PDV) – a logical memory area wherein sas builds the dataset, one observation at a time. Additionally, SAS also creates 2 automatic variables - _N_ and _ERROR_.
Descriptor information – it includes data set attributes and variable information.

Question 43

What is the effect of the OPTIONS statement ERRORS=1?

Accepted Answer

ERRORS = specifies the maximum number of observations for which SAS issues complete error messages.

Note: - If there are more errors then what is specified in the option then a warning message displays which lets the user know that the limit that was set has been reached.

So, ERROR = 1 will only issue an error

Example :

data a
input name $ 1. age;
cards;
a 20
b 23
procpint; run;
1st Error -

Before setting the options ERRORS = 1

Question 44

What does the RUN statement do?

Accepted Answer

This is a frequently asked question in SAS interview questions.

Answer – the RUN statement is used to define the step boundary, it tells the compiler that the previous set of SAS statements are to be executed. Additionally, sometimes it also helps to make the code easy to read and interpret in the log.

DATA TEST;
INPUT NAME$ SUBJ1 SUBJ2 SUBJ3 GENDER $1.;
CARDS;
A 95 96 99 M
B 85 90 70 F
C 60 70 80 M
;

Question 45

What are _numeric_ and _character_ and what do they do?

Accepted Answer

_numeric_ :- it is used to filter all the numeric type variables from the dataset.
_character_ :- it is used to filter and select all the character type variables from the dataset.

DATA A;
INPUT NAME$ SUBJ1 SUBJ2 SUBJ3 GENDER $1.;
CARDS;
A 95 96 99 M
B 85 90 70 F
C 60 70 80 M
;
RUN;
DATA NUM (KEEP=_NUMERIC_);
SET A;
RUN;
PROCPRINT;
RUN;

DATA CHAR (KEEP=_CHARACTER_);
SET A;
RUN;
PROCPRINT;
RUN;

Question 46

What is “Indirect referencing” and how it can achieve in SAS?

Accepted Answer

Generally, macro variable references shown so far have been direct macro references that begin with one ampersand: &name. However, indirect reference is also possible where macro variables name that belongs to a series is determined when the macro variable reference resolves. The macro facility provides indirect macro variable referencing, which allows you to use an expression (for example, DRUG&N) to generate a reference to one of a series of macro variables. For example, you could use the value of macro variable N to reference a variable in the series of macro variables named DRUG1 to DRUG10. If N has the value 8, the reference would be to DRUG8. If the value of N is 3, the reference would be to DRUG3.

Although for this example the type of reference you want is DRUG&N, the following example will not produce the results that you expect, which is the value of &N appended to DRUG:

%put &drug&n; /* incorrect */

This produces a warning message saying that there is no macro variable DRUG because the macro facility has tried to resolve &DRUG and then &N and concatenate those values.

To force the macro processor to rescan a macro variable reference, you use more than one ampersand in the macro variable reference. When the macro processor encounters multiple ampersands, its basic action is to resolve two ampersands to one ampersand. For example, to append the value of &N to DRUG and then reference the appropriate variable name, you use:

%put &&drug&n; /* correct */

Assuming that &N contains 6 when the macro processor receives this statement, it performs the following steps:

resolves && to &
passes DRUG as text
resolves &N into 6
returns to the beginning of the macro variable reference, &DRUG6, starts resolving from the beginning again, and prints the value of DRUG6.

Thus using multiple ampersands (&) for reference to a macro variable is called indirect referencing and few characteristics of indirect referencing are -

Ampersands (&), are resolved from left to right.
Ampersands (&), are resolved in pair (groups of 2), i.e && &
Number of ampersands required to resolve an indirect referencing on a macro variable is 2^n-1

Question 47

What is the line hold specifier and how would you use them?

Accepted Answer

Line hold specifiers are basically ‘@’ and ‘@@’ where ‘@@’ is used when we have multiple observations per line of raw data. It is used at the end of the INPUT statement. It tells SAS to hold that line of data and continue to read observations until it either runs out of data or reaches an INPUT statement that does not end with a double trailing @. When we use @ only without specifying n, SAS will hold that line of data until it reaches either the end of the DATA step or an INPUT statement that does not end with a trailing @.
Example –

One has a lab testing data file. The ASCII file layout looks like as follows,

test1 11 2
test2 1 2 5
test3 2

Where the first field is the test subject and the rest of the fields are repeated testing results for the subject.

One wants the final file has following format,

test_id count result
test1 1 11
test1 2 2
test2 1 1
test2 2 2
test2 3 5
test3 1 2

The difficult part of this example is that the numbers of tests vary from subjects to subjects. However, if one reads each line with a trailing @, one needs not to know the numbers of tests. The logic is straightforward. First, one reads the first field, and then reads in the rest of the fields and outputs it. Second, one needs to check when a line ends. If it ends, then one begins to read the next line.

 data test1;
 length test_id $8 cnt result 8;
 infile cards eof=eof ;
 input @1 test_id $5. @ ; 
 do while (1);
 cnt+1;
 input result ?? @; 
 if result = '.' then do; 
 cnt=0;
 input @@; 
 leave;
 end;
 else output;
 end;
 return;                           
eof: stop;
cards;
test1 11 2
test2 1 2 5
test3 2
;

“input @1 test_id $5. @ ;”  or “input result ?? @;” hold line position to prevent SAS from going to a new line.
“if result = '.' then do;”   tests if it is the beginning of a new line.
“input @@;” holds the current line position.
It is much simpler if one use double trailing @@ and
a retain statement in this example.
data test1(drop=dummy);
 length test_id dummy $8;
 retain test_id;
 input dummy @@ ;
 if input(dummy, ?? best.) ne . then do;
result = input(dummy, 8.);

output:

end;
else test_id=dummy; 
cards;
test1 11 2 
test2 1 2 5 
test3 2 ;

Question 48

What is PDV? Explain with example and how it works.

Accepted Answer

The Program Data Vector is a logical area of memory that is created during the data step Processing.

PDV is the brain of SAS.
The program data vector contains two types of variables.
Permanent (data set and computed variables)
Temporary (automatic and option defined) Automatic (_N_ and _ERROR_)

- Suppose we have a data set

data a;
input id name$;
cards;
11 A
12 B
;
run;

Step 1:- Compilation
Step 1.1:- syntax checking of the code, syntactical errors is compilation errors.
Step 1.2:- PDV gets created. (Logical concepts where SAS builds data).
he sas statements are checked and then compiled when we submit the data step that means the sas codes getting converted into machine code.
SAS identifies the type and length of each new variable and determines whether a variable type conversion is necessary for each subsequent reference to a variable in this phase.
In the compilation phase, SAS creates three items –
- Input buffer
Logical area where sas :
- Reads each record of data in the input statement.
- This is created only when Data steps read raw data else the data is directly read into PDV.
- PDV (program Data Vector)
Logical area in memory where sas :
- Build a Dataset by one observation at a time.
- The Data values are assigned to the appropriate variables in the program data vector.
- Two automatic variables named _N_ and _ERROR_ created.
- The _N_ Variable is counter variable it counts the number of time the data steps iterate.
- The occurrences of error are getting recorded in _ERROR_ Variable.
- The _N_ and _ERROR_ Variable are not written in an output
- Descriptor Information
Descriptor part of sas does most vital things as follows :
- Creates and maintain each dataset.
- All the variable dataset, variables name type length and all other attributes are recorded into it.

Question 49

Write SAS code to import drug codes from excel sheet and create a variable to be used in the sas program to identify drugs in a dataset?

Accepted Answer

The name of the excel sheet containing the drugs code is “all_other_drugs.xlsx”

Proc import OUT= drugs_code_ndc(keep=NDC) /*this line keep NDC code only from the excel sheet and creating a dataset named “drugs_code_ndc”. */

Datafile= "%sysfunc(pathname(saslib))/all_other_drugs.xlsx" /*this line importing the excel file from saslib and the file name is “all_other_drugs.xlsx” */

DBMS=xlsx REPLACE;                                                                                                   
SHEET="sheet1";                                                                                                              
GETNAMES=YES;                                                                                                               
Run; 
proc sql noprint;
select distinct compress("'"||NDC||"'") into : DrugNDC_Excl_Code separated by "," from drugs_code_ndc;
quit;
%put &DrugNDC_Excl_Code;   /* Note this can be be useful way to detect the value of variables.*/

Here we can use like “if drug_code in(&DrugNDC_Excl_Code) then drug =1; this will flag all the patients who have the drugs in the list.

Question 50

All the patients who have the drugs in the list. How to export data in CSV file through SAS? Dataset Name – Pat_details

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

Expect to come across this popular question in SAS scenario based interview questions.

ods csv file="%sysfunc(pathname(saslib))/pat_details_data.csv";
Proc print data = Pat_details; run;
odscsv close;

This will export all the data in csv format at the location “saslib” at path mention above.

Question 51

How to generate reports in rtf/word format in sas?Dataset Name – Pat_details

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

ods rtf file="%sysfunc(pathname(saslib))/pat_details_data.rtf";
Proc freq data = Pat_details;

Table all Linaclotide Lubiprostone Plecanatide/missing list;

Run;
Ods rtf close;

This will export the frequency report in .rtf format at the location “saslib” at path mention above.

Question 52

How to code in SAS to flag the MM diagnosis present in var diag1-diag12 in the database between jan01, 2010 to Dec 31, 2018?

Accepted Answer

Data Allclm_MM;
set mm_diag_data;
if '01JAN2010'D <= date_svc <= '31DEC2018'D ;  
array diag(*) $diag1 – diag12;             
do i=1 to 12;
if (icd_ver = "9" and diag(i) in :( "2030" )) or (icd_ver = "10" and diag(i) in :( "C900" )) then f_MM_diag = 1;
end;
if f_MM_diag = 1;
run;

Question 53

Write SAS code for counting the number of patients with the following drugs mentioned in the given table –Dataset Name – Pat_details

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

We can count this by using FREQ procedure that prints all values of a given categorical variable in the Output window, along with the number and percentage of times each value appears. The FREQ procedure can work with both string (character) and numeric categorical variables. In the case mentioned above, we can code like –

Proc freq data = Pat_details;

Table all Linaclotide Lubiprostone Plecanatide/missing list;

Run;

This code will give the counts of the variable mentioned in the table statement and results will look like the one mentioned below -

Output –

Cumulative Cumulative

all Frequency Percent Frequency Percent

-----------------------------------------------------------------------

1 10 100.00 10 100.00

Cumulative Cumulative

Linaclotide Frequency Percent Frequency Percent

-----------------------------------------------------------------------

1 7 70.00 10 100.00

Cumulative Cumulative

Lubiprostone Frequency Percent Frequency Percent

-----------------------------------------------------------------------

1 2 20.00 10 100.00

Cumulative Cumulative

Plecanatide Frequency Percent Frequency Percent

-----------------------------------------------------------------------

1 1 10.00 10 100.00

Question 54

Write sas code to split the table below on the basis of different drugs. Dataset Name – Pat_details

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

Proc sort data = Pat_details; by pat_id; run;
Data all_drug Linaclotide Lubiprostone Plecanatide;
Set  Pat_details;
by pat_id;
If all = 1 then output all_drug;
If Linaclotide =1 then output Linaclotide;
If  Lubiprostone =1 then output Lubiprostone;
If  Plecanatide = 1 then output Plecanatide;
Run;

The code above will create three different datasets with name all_drug Linaclotide Lubiprostone Plecanatide respectively in work library which has only respective drugs patients.

Question 55

Count the patients who initiated more than one drugs in a month the Input dataset for the same is mentioned below - Input Dataset – All_first_drug

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

proc transpose data = All_first_drug out = t_ All_first_drug (where =(date ne .) rename=(_name_= drug_name col1=date));

by ps_indv_id; run;
data t_ All_first_drug;
set t_ All_first_drug;
x=put(date,monyy7.);
run;
proc sql;
 create table final_data as select x, count(distinct ps_indv_id) from (
 select distinct x, ps_indv_id, count(ps_indv_id) as cnt 
from t_ All_first_drug group by x, ps_indv_id having count(ps_indv_id)>1)
group by x order by x;
quit;

result –

Obs       x _TEMG003
  1    JAN2018        2   
  2    FEB2018        1   
  3    MAR2018       3

Question 56

What is the macro variable and what are the ways to create it?

Accepted Answer

Macro in sas is the way of automation and it is a very powerful and flexible way to do automation.

Macro variables can be created using a number of different techniques and statements. However, the five most common methods are not only the most useful but also among the easiest to master. Since macro variables are used in so many ways within the macro language, learning how they are created can also serve as an excellent introduction to the language itself.

These methods include:

%LET statement
macro parameters (named and positional)
iterative %DO statement
using the INTO in PROC SQL

using the CALL SYMPUTX routine

Question 57

What is a substr() function and how it can be used?

Accepted Answer

The substr function extracts a text from the character variable, it could also be used to replace the values, when it is on the left side it assigns the new values, suppose you need to change the first two values from 91 to 00, you can assign by putting it to left side;

data a; x='(91) 9999265789'; substr(x,2,2)='00'; run;
proc print data=a; run;

output –

Obs        x
1 (00) 9999265789

Question 58

What is scan function in sas and how it is used?

Accepted Answer

A must-know for anyone heading into a SAS interview, this question is frequently asked in SAS interview questions.

SCAN function - The scan function searches for a particular string and puts the value in the target variable, the target variable length using the scan function is 200 chars, here value is of 200 char variable now, the delimiter by default is blank.

It also takes 3 arguments first being string second the nth word and third the delimiter (by default blank), it divides a string into the chunks divided by delimiters;

data a;

x='Champ are always best people’;

value=scan(x,5); run;

proc print data=a;

run;

output –

Obs x value

1 'Champs are always best people people

Question 59

How to read inconsistent data using SAS functions?

Accepted Answer

This is very important thing, consider the below example, the data is not consistent and you wanted to pick the last name;

 data x;
input @1 name $5. @7 marks;
 datalines;
 a b c 20 
k     d 15 
a       25 
n a b 20 
k   a 30 ;
 run;
data master; 
set x; lname=scan(name,-1); 
run; 
proc print data=master; run;
/********email id example**********/
data email; 
length id $25; 
input id$; 
datalines;
 a,b@n.com 
b@tt.com a@b@cc.com ;
 run;
data a; 
set email; 
domain_name=scan(id,-1,'@'); run;
proc print data=a; run;

Output –

Obs   name       marks   lname
1       a b c         20     c
2       k d         15     d
3      a           25   a
4      n a b         20     b
5      k a           30   a
Obs    id domain_name
1     a,b@n.com       n.com
2     b@tt.com         tt.com
3   a@b@cc.com     cc.com

Question 60

How to export data in HTML file through SAS? Dataset Name – Pat_details

Obs	A	i	Year
1	109250.00	0.10175	1
2	120366.19	0.11193	2
3	133838.17	0.12312	3
4	150315.99	0.13543	4
5	170673.18	0.14897	5

Accepted Answer

ods html file="%sysfunc(pathname(saslib))/pat_details_data.html";

Proc print data = Pat_details; run;

Ods html close;

This will export all the data in html format at the location “saslib” at path mention above.

Obs	Pt	visit	Ae
1	1	1	Headache
2	1	2	Kill
3	1	3	Pain
4	1	4	Knee
5	2	1	Tick
6	3	2	Fle

Obs	pat_id	first_date_all	all	Linaclotide	Lubiprostone	Plecanatide
1	11121	27-Jun-18	1	1	0	0
2	11394	16-Feb-18	1	1	0	0
3	13755	13-Mar-19	1	1	0	0
4	15461	24-Jan-18	1	0	1	0
5	13657	23-Mar-19	1	1	0	0
6	14327	19-Mar-19	1	0	0	1
7	18999	27-Mar-19	1	1	0	0
8	19133	14-Feb-18	1	1	0	0
9	16260	19-Apr-18	1	0	1	0
10	17194	4-Jan-18	1	1	0	0

Obs	pat_id	first_date_all	all	Linaclotide	Lubiprostone	Plecanatide
1	11121	27-Jun-18	1	1	0	0
2	11394	16-Feb-18	1	1	0	0
3	13755	13-Mar-19	1	1	0	0
4	15461	24-Jan-18	1	0	1	0
5	13657	23-Mar-19	1	1	0	0
6	14327	19-Mar-19	1	0	0	1
7	18999	27-Mar-19	1	1	0	0
8	19133	14-Feb-18	1	1	0	0
9	16260	19-Apr-18	1	0	1	0
10	17194	4-Jan-18	1	1	0	0

Obs	pat_id	first_date_all	all	Linaclotide	Lubiprostone	Plecanatide
1	11121	27-Jun-18	1	1	0	0
2	11394	16-Feb-18	1	1	0	0
3	13755	13-Mar-19	1	1	0	0
4	15461	24-Jan-18	1	0	1	0
5	13657	23-Mar-19	1	1	0	0
6	14327	19-Mar-19	1	0	0	1
7	18999	27-Mar-19	1	1	0	0
8	19133	14-Feb-18	1	1	0	0
9	16260	19-Apr-18	1	0	1	0
10	17194	4-Jan-18	1	1	0	0

Obs	pat_id	first_date_all	all	Linaclotide	Lubiprostone	Plecanatide
1	11121	27-Jun-18	1	1	0	0
2	11394	16-Feb-18	1	1	0	0
3	13755	13-Mar-19	1	1	0	0
4	15461	24-Jan-18	1	0	1	0
5	13657	23-Mar-19	1	1	0	0
6	14327	19-Mar-19	1	0	0	1
7	18999	27-Mar-19	1	1	0	0
8	19133	14-Feb-18	1	1	0	0
9	16260	19-Apr-18	1	0	1	0
10	17194	4-Jan-18	1	1	0	0

Obs	name	Age	Dob
1	am	67	Oct 13, 1981
2	pr	66	Apr 10, 1982

Obs	pat_id	first_date_Linaclotide	first_date_Lubiprostone	first_date_Plecanatide
1	18830	27-Jan-18		30-Jan-18
2	17224	16-Feb-18	20-Feb-18
3	11702	13-Mar-18	18-Mar-18
4	19025	30-Jan-18	24-Jan-18
5	19248	23-Mar-18	13-Mar-18
6	14557		22-Mar-18	19-Mar-18
7	16519	27-Mar-18
8	17499	28-Mar-18
9	18873	29-Mar-19	19-Apr-18
10	11121	30-Mar-19
11	11394	31-Mar-19
12	13755	1-Apr-19

SAS Interview Questions and Answers Business Management

Advanced

Intermediate

Beginner

Description

Related Interview Questions

Useful links