Interview questions

1. What version of SAS are you currently using?
   • SAS 9.2
2. What is the difference between combining two data sets using multiple SET statements and using a MERGE statement?
3. Describe your familiarity with SAS formats and informats.
4. Can you state some special input delimiters?
5. Is it possible to use the MERGE statement without a BY statement? Explain.
6. What is the purpose of the trailing @ and @@?
7. For what purposes do you use DATA _NULL_?
8. Identify statements whose placement in the DATA step is critical.
9. What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
10. How would you delete duplicate observations?
11. What is the Program Data Vector (PDV)? What are its functions?
12. What are SAS/ACCESS and SAS/CONNECT?
13. How would you determine the number of missing or nonmissing values in computations?
14. What is the difference between %LOCAL and %GLOBAL?
15. What is an autocall macro, how do you create one, and what is it used for?
16. If you use SYMPUT in a DATA step, when and where can you use the macro variable?
17. Name at least five compile-time statements.
18. SCAN vs. SUBSTR function?
19. Describe the ways in which you can create macro variables.
20. State some differences between the DATA step and SQL.
21. What system options are you familiar with?
22. What is the COALESCE function?
23. Give some differences between PROC MEANS and PROC SUMMARY.
24. Have you used CALL SYMPUTX? What points need to be kept in mind when using it?
25. What option in PROC FORMAT allows you to create a format from an input control data set rather than a VALUE statement?
26. How would you code a macro statement to produce information in the SAS log?
27. Name four set operators.
28. What does %PUT do?
29. Which gets applied first when using the KEEP= and RENAME= data set options on the same data set?
30. Give examples of macro quoting functions.
31. What system option determines whether the macro facility searches a specific catalog for a stored, compiled macro?
32. Do you know about the SAS autoexec file? What is its significance?
33. What exactly is a SAS hash table?

34. What is PROC FREQ's default behavior for handling missing values?
35. I am trying to find outliers in data. Which procedures will help me find them?
36. Have you used ODS statements? What are the benefits of ODS?
37. I want to make a quick backup of a data set along with any associated indexes. What procedure can I use?
38. What is a SAS catalog?
39. What does the statement 'format _all_;' do?
40. State different ways of combining SAS data sets.
41. State different ways of getting data into SAS.
42. What is the SQL Procedure Pass-Through Facility?
43. How can you identify and resolve programming logic errors?
44. Can FORMAT, LABEL, DROP, KEEP, or LENGTH statements use array references?
45. What are SAS picture formats?

Question: What is the function of the OUTPUT statement?
Answer: To override the default way in which the DATA step writes observations to output, you can use an OUTPUT statement in the DATA step. Placing an explicit OUTPUT statement in a DATA step overrides the automatic output, so that observations are added to a data set only when the explicit OUTPUT statement is executed.

Question: What is the function of the STOP statement?
Answer: The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing statements after the end of the current DATA step.

Question: What is the difference between using the drop= data set option in the DATA statement and in the SET statement?
Answer: If you don't want to process certain variables and you do not want them to appear in the new data set, specify the drop= data set option in the SET statement. If you want to process certain variables but do not want them to appear in the new data set, specify the drop= data set option in the DATA statement.

Question: Given an unsorted data set, how do you read the last observation into a new data set?
Answer: Use the end= option of the SET statement. For example:

    data work.calculus;
       set work.comp end=last;
       if last;
    run;

where calculus is the new data set to be created, comp is the existing data set, and last is a temporary variable (initialized to 0) that is set to 1 when the SET statement reads the last observation.
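As a minimal sketch of the drop= distinction described above, the following steps place the option on the SET statement and then on the DATA statement. The data set work.scores and its variables are hypothetical and are not taken from the original questions.

```sas
/* Hypothetical input data set; names are illustrative only. */
data work.scores;
   input name $ test1 test2;
   datalines;
Ann 80 90
Bob 70 60
;
run;

/* drop= on SET: test2 is never read into the program data vector,
   so it cannot be used in calculations and is not written out. */
data work.a;
   set work.scores(drop=test2);
   total = test1;
run;

/* drop= on DATA: test1 and test2 are read and can be used in the
   calculation, but they are not written to work.b. */
data work.b(drop=test1 test2);
   set work.scores;
   total = test1 + test2;
run;
```

Under these assumptions, work.a would contain name, test1 and total, while work.b would contain only name and total.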

Question: What is the difference between reading data from an external file and reading data from an existing data set?
Answer: The main difference is that while reading an existing data set with the SET statement, SAS retains the values of the variables from one observation to the next.

Question: What is the difference between SAS functions and procedures?
Answer: Functions expect argument values to be supplied across an observation in a SAS data set, whereas procedures expect one variable value per observation. For example:

    data average;
       set temp;
       avgtemp = mean(of T1-T24);
    run;

Here the arguments of the MEAN function are taken across an observation.

    proc sort;
       by month;
    run;
    proc means;
       by month;
       var avgtemp;
    run;

PROC MEANS is used to calculate the average temperature by month (taking one variable value from each observation).

Question: What is the difference between the SUM function and the "+" operator?
Answer: The SUM function returns the sum of the non-missing arguments, whereas the "+" operator returns a missing value if any of the arguments is missing. Example:

    data mydata;
       input x y z;
       cards;
    33 3 3
    24 3 4
    24 3 4
    . 3 2
    23 . 3
    54 4 .
    35 4 2
    ;
    run;
    data mydata2;
       set mydata;
       a=sum(x,y,z);
       p=x+y+z;
    run;

In the output, the value of p is missing for the 4th, 5th and 6th observations:

     a   p
    39  39
    31  31
    31  31
     5   .
    26   .
    58   .
    41  41

Question: What would be the result if all the arguments in the SUM function are missing?
Answer: A missing value.

Question: What would be the denominator value used by the MEAN function if two out of seven arguments are missing?
Answer: Five.

Question: Give an example where SAS fails to convert a character value to a numeric value automatically.
Answer: Suppose the value of a variable PayRate begins with a dollar sign ($). When SAS tries to automatically convert the values of PayRate to numeric values, the dollar sign blocks the process. The values cannot be converted to numeric values. Therefore, it is always best to include INPUT and PUT functions in your programs when conversions occur.

Question: What would be the resulting numeric value (generated by automatic character-to-numeric conversion) of the character value below when used in an arithmetic calculation?
    1,735.00
Answer: A missing value.

Question: What would be the resulting numeric value (generated by automatic character-to-numeric conversion) of the character value below when used in an arithmetic calculation?
    1735.00
Answer: 1735

Question: Which SAS statement does not perform automatic conversions in comparisons?
Answer: The WHERE statement.

Question: Briefly explain the INPUT and PUT functions.
Answer: INPUT function: character-to-numeric conversion, input(source, informat). PUT function: numeric-to-character conversion, put(source, format).

Question: What would be the result of the following SAS functions (given that 31 Dec 2000 is a Sunday)?

    Weeks = intck('week','31dec2000'd,'01jan2001'd);
    Years = intck('year','31dec2000'd,'01jan2001'd);
    Months = intck('month','31dec2000'd,'01jan2001'd);

Answer: Weeks=0, Years=1, Months=1

Question: What are the parameters of the SCAN function?
Answer: scan(argument,n,delimiters)
    argument specifies the character variable or expression to scan
    n specifies which word to read
    delimiters are special characters that must be enclosed in single quotation marks

Question: Suppose the variable address stores the following expression:
    209 RADCLIFFE ROAD, CENTER CITY, NY, 92716
What would be the result returned by the SCAN function in the following cases?

    a=scan(address,3);
    b=scan(address,3,',');

Answer: a=ROAD, b=NY (b keeps its leading blank, because only the comma is treated as a delimiter).

Question: What is the length assigned to the target variable by the SCAN function?
Answer: 200

Question: Name a few SAS functions.
Answer: Scan, Substr, Trim, Catx, Index, Tranwrd, Find, Sum.

Question: What is the function of the TRANWRD function?
Answer: The TRANWRD function replaces or removes all occurrences of a pattern of characters within a character string.

Question: Consider the following SAS program:

    data finance;
       Amount=1000;
       Rate=.075/12;
       do month=1 to 12;
          Earned+(amount+earned)*(rate);
       end;
    run;

What would be the value of month at the end of DATA step execution, and how many observations would there be?
Answer: The value of month would be 13. The number of observations would be 1.

Question: Consider the following SAS program:

    data finance;
       Amount=1000;
       Rate=.075/12;
       do month=1 to 12;
          Earned+(amount+earned)*(rate);
          output;
       end;
    run;

How many observations would there be at the end of DATA step execution?
Answer: 12

Question: How do you use a DO loop if you don't know how many times it should execute?
Answer: Use DO UNTIL or DO WHILE to specify the condition.

Question: What is the difference between DO WHILE and DO UNTIL?
Answer: An important difference between the DO UNTIL and DO WHILE statements is that the DO WHILE expression is evaluated at the top of the DO loop. If the expression is false the first time it is evaluated, then the DO loop never executes. A DO UNTIL loop, by contrast, executes at least once.

Question: How do you specify a number of iterations and a specific condition within a single DO loop?
Answer:

    data work;
       do i=1 to 20 until(Sum>=20000);
          Year+1;
          Sum+2000;
          Sum+Sum*.10;
       end;
    run;

This iterative DO statement executes the DO loop until Sum is greater than or equal to 20000 or until the DO loop has executed 20 times, whichever occurs first.

Question: How many data types are there in SAS?
Answer: Two: character and numeric.

Question: If a variable contains only numbers, can it be of character data type? Give an example.
Answer: Yes, it depends on how you use the variable. Example: ID and Zip consist of numeric digits but can be stored as character variables.

Question: If a variable contains letters or special characters, can it be of numeric data type?
Answer: No, it must be of character data type.

Question: What can be the size of the largest data set in SAS?
Answer: The number of observations is limited only by the computer's capacity to handle and store them. Prior to SAS 9.1, SAS data sets could contain up to 32,767 variables. In SAS 9.1, the maximum number of variables in a SAS data set is limited by the resources available on your computer.

Question: Give some examples where PROC REPORT's defaults are different from PROC PRINT's defaults.
Answer:
    • No record numbers in PROC REPORT
    • Labels (not variable names) are used as headers in PROC REPORT
    • REPORT needs the NOWINDOWS option

Question: Give some examples where PROC REPORT's defaults are the same as PROC PRINT's defaults.
Answer:
    • Variables/columns appear in position order.
    • Rows are ordered as they appear in the data set.

Question: Highlight the major difference between the two programs below:

a.
    data mydat;
       input ID Age;
       cards;
    2 23
    4 45
    3 56
    9 43
    ;
    run;
    proc report data = mydat nowd;
       column ID Age;
    run;

b.
    data mydat1;
       input grade $ ID Age;
       cards;
    A 2 23
    B 4 45
    C 3 56
    D 9 43
    ;
    run;
    proc report data = mydat1 nowd;
       column Grade ID Age;
    run;

Answer: When all the variables in the input file are numeric, PROC REPORT does a sum by default. Thus the first program generates one record in the list report, whereas the second generates four records.

Question: In the first program above (a), how will you avoid having the sum of numeric variables?
Answer: To avoid having the sum of numeric variables, one or more of the input variables must be defined as DISPLAY. Thus we have to use:

    proc report data = mydat nowd;
       column ID Age;
       define ID/display;
    run;

Question: What is the difference between an ORDER variable and a GROUP variable in PROC REPORT?
Answer: If the variable is used as a group variable, rows that have the same values are collapsed. Group variables produce a summary report, whereas order variables produce a list report.

Question: Give some ways by which you can define the variables to produce a summary report (using PROC REPORT).
Answer: All of the variables in a summary report must be defined as group, analysis, across, or computed variables.

Question: What are the default statistics for the MEANS procedure?
Answer: N (count), mean, standard deviation, minimum, and maximum.

Question: How do you limit decimal places for a variable using PROC MEANS?
Answer: By using the MAXDEC= option.

Question: What is the difference between the CLASS statement and the BY statement in PROC MEANS?
Answer: Unlike CLASS processing, BY processing requires that your data already be sorted or indexed in the order of the BY variables. BY group results have a layout that is different from the layout of CLASS group results.

Question: What is the difference between PROC MEANS and PROC SUMMARY?
Answer: The difference between the two procedures is that PROC MEANS produces a report by default. By contrast, to produce a report in PROC SUMMARY, you must include a PRINT option in the PROC SUMMARY statement.

Question: How do you specify the variables to be processed by the FREQ procedure?
Answer: By using the TABLES statement.
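The CLASS versus BY distinction, the MAXDEC= option, and the PRINT option of PROC SUMMARY described above can be illustrated with a short sketch. The data set work.temps and its variables are hypothetical and are used only for illustration.

```sas
/* Hypothetical data; variable names are illustrative only. */
data work.temps;
   input month $ temp;
   datalines;
Jan 30.25
Jan 34.75
Feb 40.5
Feb 38.0
;
run;

/* CLASS: no prior sort is required; results appear in a single table
   with one row per value of month. */
proc means data=work.temps maxdec=1;
   class month;
   var temp;
run;

/* BY: the data must first be sorted (or indexed) by month; results
   appear as a separate table for each BY group. */
proc sort data=work.temps out=work.temps_sorted;
   by month;
run;

proc means data=work.temps_sorted maxdec=1;
   by month;
   var temp;
run;

/* PROC SUMMARY produces no printed report unless PRINT is specified. */
proc summary data=work.temps print maxdec=1;
   class month;
   var temp;
run;
```

Writing the sorted copy to a separate data set with OUT= is a design choice made here so that the original data set is left unchanged.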

Question: Describe the CROSSLIST option in the TABLES statement.
Answer: Adding the CROSSLIST option to the TABLES statement displays crosstabulation tables in ODS column format.

Question: How do you create list output for crosstabulations in PROC FREQ?
Answer: To generate list output for crosstabulations, add a slash (/) and the LIST option to the TABLES statement in your PROC FREQ step.

    TABLES variable-1*variable-2 <* … variable-n> / LIST;

Question: PROC MEANS works for ________ variables and PROC FREQ works for ______ variables?
Answer: Numeric, categorical.

Question: How can you combine two data sets based on the relative position of rows in each data set, that is, the first observation in one data set is joined with the first observation in the other, and so on?
Answer: One-to-one reading.

Question:

    data concat;
       set a b;
    run;

The format of variable Revenue in data set a is dollar10.2 and the format of variable Revenue in data set b is dollar12.2. What would be the format of Revenue in the resulting data set (concat)?
Answer: dollar10.2

Question: If you have two data sets and you want to combine them such that observations in each BY group in each data set in the SET statement are read sequentially, in the order in which the data sets and BY variables are listed, which method of combining data sets will work?
Answer: Interleaving.

Question: While match-merging two data sets, you cannot use the __________ option with indexed data sets because indexes are always stored in ascending order.
Answer: DESCENDING

Question: I have a data set concat with variables a, b and c. How do I rename a and b to e and f?
Answer:

    data concat(rename=(a=e b=f));
       set concat;
    run;

Question: What is the difference between a one-to-one merge and a match merge? Give an example.
Answer: If both data sets in the MERGE statement are sorted by id (as shown below) and each observation in one data set has a corresponding observation in the other data set, a one-to-one merge is suitable.

    data mydata1;
       input id class $;
       cards;
    1 Sa
    2 Sd
    3 Rd
    4 Uj
    ;
    data mydata2;
       input id class1 $;
       cards;
    1 Sac
    2 Sdf
    3 Rdd
    4 Lks
    ;
    data mymerge;
       merge mydata1 mydata2;
    run;

If the observations do not match, then match merging is suitable.

    data mydata1;
       input id class $;
       cards;
    1 Sa
    2 Sd
    2 Sp
    3 Rd
    4 Uj
    ;
    data mydata2;
       input id class1 $;
       cards;
    1 Sac
    2 Sdf
    3 Rdd
    3 Lks
    5 Ujf
    ;
    data mymerge;
       merge mydata1 mydata2;
       by id;
    run;

• What is the effect of the OPTIONS statement ERRORS=1?
• What's the difference between VAR A1 - A4 and VAR A1 -- A4?
• What do the SAS log messages "numeric values have been converted to character" mean? What are the implications?

• Why is a STOP statement needed for the POINT= option on a SET statement?
• How do you control the number of observations and/or variables read or written?
• Approximately what date is represented by the SAS date value of 730?
• How would you remove a format that has been permanently associated with a variable?
• What does the RUN statement do?
• Why is SAS considered self-documenting?
• What areas of SAS are you most interested in?
• Briefly describe 5 ways to do a "table lookup" in SAS.
• What versions of SAS have you used (on which platforms)?
• What are some good SAS programming practices for processing very large data sets?
• What are some problems you might encounter in processing missing values? In DATA steps? Arithmetic? Comparisons? Functions? Classifying data?
• How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
• What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
• If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE.
• What are _numeric_ and _character_ and what do they do?
• How would you create multiple observations from a single observation?
• For what purpose would you use the RETAIN statement?
• What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
• What is the order of application for output data set options, input data set options and SAS statements?
• What is the order of evaluation of the comparison operators: + - * / ** ( )?
• How could you generate test data with no input data?
• How do you debug and test your SAS programs?
• What can you learn from the SAS log when debugging?
• What is the purpose of _error_?
• How can you put a "trace" in your program?
• Are you sensitive to code walk-throughs, peer review, or QC review?
• Have you ever used the SAS Debugger?
• What other SAS features do you use for error trapping and data validation?
• How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?
• How many missing values are available? When might you use them?
• How do you test for missing values?
• How are numeric and character missing values represented internally?

Very Basic

What SAS statements would you code to read an external raw data file to a DATA step?
How do you read in the variables that you need?
Are you familiar with special input delimiters? How are they used?
If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?
What is the difference between an informat and a format? Name three informats or formats.
Name and describe three SAS functions that you have used, if any?
How would you code the criteria to restrict the output to be produced?
What is the purpose of the trailing @? The @@? How would you use them?
Under what circumstances would you code a SELECT construct instead of IF statements?
What statement do you code to tell SAS that it is to write to an external file?
What statement do you code to write the record to the file?
If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?
If you're not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set?
What is the one statement to set the criteria of data that can be coded in any step?
Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.
How would you include common or reuse code to be processed along with your statements?
When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?
If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?
Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.
How would you delete duplicate observations?
How would you delete observations with duplicate keys?
How would you code a merge that will keep only the observations that have matches from both sets?

How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data set to a second data set, and the non-matches of the rightmost data set to a third data set?

Internals execution

What is the Program Data Vector (PDV)? What are its functions?
Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.
At compile time when a SAS data set is read, what items are created?
Name statements that are recognized at compile time only.
Identify statements whose placement in the DATA step is critical.
Name statements that function at both compile and execution time.
Name statements that are execution only.
In the flow of DATA step processing, what is the first action in a typical DATA step?
What is _n_?

Base SAS

What is the effect of the OPTIONS statement ERRORS=1?
What's the difference between VAR A1 - A4 and VAR A1 -- A4?
What do the SAS log messages "numeric values have been converted to character" mean? What are the implications?
Why is a STOP statement needed for the POINT= option on a SET statement?
How do you control the number of observations and/or variables read or written?
Approximately what date is represented by the SAS date value of 730?
How would you remove a format that has been permanently associated with a variable?
What does the RUN statement do?
Why is SAS considered self-documenting?
What areas of SAS are you most interested in?
Briefly describe 5 ways to do a "table lookup" in SAS.
What versions of SAS have you used (on which platforms)?
What are some good SAS programming practices for processing very large data sets?
What are some problems you might encounter in processing missing values? In DATA steps? Arithmetic? Comparisons? Functions? Classifying data?
How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE.
What are _numeric_ and _character_ and what do they do?
How would you create multiple observations from a single observation?
For what purpose would you use the RETAIN statement?
What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
What is the order of application for output data set options, input data set options and SAS statements?
What is the order of evaluation of the comparison operators: + - * / ** ( )?

QUESTIONS ON TESTING AND DEBUGGING

- How could you generate test data with no input data?
- How do you debug and test your SAS programs?
- What can you learn from the SAS log when debugging?
- What is the purpose of _error_?
- How can you put a "trace" in your program?
- Are you sensitive to code walk-throughs, peer review, or QC review?
- Have you ever used the SAS Debugger?
- What other SAS features do you use for error trapping and data validation?

QUESTIONS ON MISSING VALUES

- How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?
- How many missing values are available? When might you use them?
- How do you test for missing values?
- How are numeric and character missing values represented internally?

SOME GENERAL NON-TECHNICAL QUESTIONS

- What has been your most common programming mistake?
- What is your favorite programming language and why?
- What is your favorite operating system? Why?
- Do you observe any coding standards? What is your opinion of them?
- What percent of your program code is usually original and what percent copied and modified?
- Have you ever had to follow SOPs or programming guidelines?
- Which is worse: not testing your programs or not commenting your programs?
- Name several ways to achieve efficiency in your program. Explain trade-offs.
- What other SAS products have you used and consider yourself proficient in using?

QUESTIONS ON FUNCTIONS

- How do you make use of functions?
- When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?
- What is the significance of the 'OF' in X=SUM(OF a1-a4, a6, a9);?
- What do the PUT and INPUT functions do?
- Which date function advances a date, time or date/time value by a given interval?
- What do the MOD and INT functions do?
- How might you use MOD and INT on numerics to mimic SUBSTR on character strings?
- In ARRAY processing, what does the DIM function do?
- How would you determine the number of missing or nonmissing values in computations?
- What is the difference between: x=a+b+c+d; and x=SUM(a,b,c,d);?
- There is a field containing a date. It needs to be displayed in the format "ddmonyy" if it's before 1975, "dd mon ccyy" if it's after 1985, and as 'Disco Years' if it's between 1975 and 1985. How would you accomplish this in data step code? Using only PROC FORMAT.

- In the following DATA step, what is needed for 'fraction' to print to the log?

    data _null_;
       x=1/3;
       if x=.3333 then put 'fraction';
    run;

- What is the difference between calculating the 'mean' using the mean function and PROC MEANS?

QUESTIONS ON PROCS

- Have you ever used "Proc Merge"?
- If you were given several SAS data sets you were unfamiliar with, how would you find out the variable names and formats of each dataset?
- What SAS PROCs have you used and consider yourself proficient in using?
- How would you keep SAS from overlaying a SAS data set with its sorted version?
- In PROC PRINT, can you print only variables that begin with the letter "A"?
- What are some differences between PROC SUMMARY and PROC MEANS?
- PROC FREQ:
    * Code the tables statement for a single-level (most common) frequency.
    * Code the tables statement to produce a multi-level frequency.
    * Name the option to produce a frequency as line items rather than a table.
    * Produce output from a frequency. Restrict the printing of the table.
- PROC MEANS:
    * Code a PROC MEANS that shows both summed and averaged output of the data.
    * Code the option that will allow MEANS to include missing numeric data in the report.
    * Code the MEANS to produce output to be used later.
- Do you use PROC REPORT or PROC TABULATE? Which do you prefer? Explain.

QUESTIONS:

• The following SAS program is submitted:

    data test;
       set sasuser.employees;
       if 2 le years_service le 10 then amount = 1000;
       else if years_service gt 10 then amount = 2000;
       else amount = 0;
       amount_per_year = years_service / amount;
    run;

Which one of the following values does the variable AMOUNT_PER_YEAR contain if an employee has been with the company for one year?
A. 0
B. 1000
C. 2000
D. . (missing numeric value)

• The contents of the raw data file AMOUNT are listed below:

    --------10-------20-------30
    $1,234

The following SAS program is submitted:

    data test;
       infile 'amount';
       input @1 salary 6.;
       if _error_ then description = 'Problems';
       else description = 'No Problems';
    run;

Which one of the following is the value of the DESCRIPTION variable?
A. Problems
B. No Problems
C. ' ' (missing character value)
D. The value can not be determined as the program fails to execute due to errors.

• The contents of the raw data file NAMENUM are listed below:

    --------10-------20-------30
    Joe xx

The following SAS program is submitted:

    data test;
       infile 'namenum';
       input name $ number;
    run;

Which one of the following is the value of the NUMBER variable?
A. xx
B. Joe
C. . (missing numeric value)
D. The value can not be determined as the program fails to execute due to errors.

• The contents of the raw data file AMOUNT are listed below:

    --------10-------20-------30
    $1,234

The following SAS program is submitted:

    data test;
       infile 'amount';
       input @1 salary 6.;
    run;

Which one of the following is the value of the SALARY variable?
A. 1234
B. 1,234
C. $1,234
D. . (missing numeric value)

• Which one of the following statements is true regarding the SAS automatic _ERROR_ variable?
A. The _ERROR_ variable contains the values 'ON' or 'OFF'.
B. The _ERROR_ variable contains the values 'TRUE' or 'FALSE'.
C. The _ERROR_ variable is automatically stored in the resulting SAS data set.
D. The _ERROR_ variable can be used in expressions or calculations in the DATA step.

• Which one of the following is true when SAS encounters a data error in a DATA step?
A. The DATA step stops executing at the point of the error, and no SAS data set is created.
B. A note is written to the SAS log explaining the error, and the DATA step continues to execute.
C. A note appears in the SAS log that the incorrect data record was saved to a separate SAS file for further examination.
D. The DATA step stops executing at the point of the error, and the resulting DATA set contains observations up to that point.

• The following SAS program is submitted:

    data work.totalsales (keep = monthsales{12} );
       set work.monthlysales (keep = year product sales);
       array monthsales {12} ;
       do i=1 to 12;
          monthsales{i} = sales;
       end;
    run;

The data set named WORK.MONTHLYSALES has one observation per month for each of five years for a total of 60 observations. Which one of the following is the result of the above program?
A. The program fails execution due to data errors.
B. The program fails execution due to syntax errors.
C. The program executes with warnings and creates the WORK.TOTALSALES data set.
D. The program executes without errors or warnings and creates the WORK.TOTALSALES data set.

• The following SAS program is submitted:

    data work.totalsales;
       set work.monthlysales(keep = year product sales);
       retain monthsales {12} ;
       array monthsales {12} ;
       do i = 1 to 12;
          monthsales{i} = sales;
       end;
       cnt + 1;
       monthsales{cnt} = sales;
    run;

The data set named WORK.MONTHLYSALES has one observation per month for each of five years for a total of 60 observations. Which one of the following is the result of the above program?
A. The program fails execution due to data errors.
B. The program fails execution due to syntax errors.
C. The program runs with warnings and creates the WORK.TOTALSALES data set with 60 observations.
D. The program runs without errors or warnings and creates the WORK.TOTALSALES data set with 60 observations.

• The following SAS program is submitted:

    data work.january;
       set work.allmonths (keep = product month num_sold cost);
       if month = 'Jan' then output work.january;
       sales = cost * num_sold;
       keep = product sales;
    run;

Which variables does the WORK.JANUARY data set contain?
A. PRODUCT and SALES only
B. PRODUCT, MONTH, NUM_SOLD and COST only
C. PRODUCT, SALES, MONTH, NUM_SOLD and COST only
D. An incomplete output data set is created due to syntax errors.

• The contents of the raw data file CALENDAR are listed below:

    --------10-------20-------30
    01012000

The following SAS program is submitted:

    data test;
       infile 'calendar';
       input @1 date mmddyy10.;
       if date = '01012000'd then event = 'January 1st';
    run;

Which one of the following is the value of the EVENT variable?
A. 01012000
B. January 1st
C. . (missing numeric value)
D. The value can not be determined as the program fails to execute due to errors.

• A SAS program is submitted and the following SAS log is produced:

    2    data gt100;
    3    set ia.airplanes
    4    if mpg gt 100 then output;
         22                   202
    ERROR: File WORK.IF.DATA does not exist.
    ERROR: File WORK.MPG.DATA does not exist.
    ERROR: File WORK.GT.DATA does not exist.
    ERROR: File WORK.THEN.DATA does not exist.
    ERROR: File WORK.OUTPUT.DATA does not exist.
    ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, (, ;, END, KEY, KEYS, NOBS, OPEN, POINT, _DATA_, _LAST_, _NULL_.
    ERROR 202-322: The option or parameter is not recognized and will be ignored.
    5    run;

The IA libref was previously assigned in this SAS session. Which one of the following corrects the errors in the LOG?
A. Delete the word THEN on the IF statement.
B. Add a semicolon at the end of the SET statement.
C. Place quotes around the value on the IF statement.
D. Add an END statement to conclude the IF statement.

• The contents of the raw data file SIZE are listed below:

    --------10-------20-------30
    72 95

The following SAS program is submitted:

    data test;
       infile 'size';
       input @1 height 2. @4 weight 2;
    run;

Which one of the following is the value of the variable WEIGHT in the output data set?
A. 2
B. 72
C. 95
D. . (missing numeric value)

• A SAS PRINT procedure output of the WORK.LEVELS data set is listed below:

    Obs   name    level
     1    Frank     1
     2    Joan      2
     3    Sui       2
     4    Jose      3
     5    Burt      4
     6    Kelly     .
     7    Juan      1

The following SAS program is submitted:

    data work.expertise;
       set work.levels;
       if level = . then expertise = 'Unknown';
       else if level = 1 then expertise = 'Low';
       else if level = 2 or 3 then expertise = 'Medium';
       else expertise = 'High';
    run;

Which of the following values does the variable EXPERTISE contain?
A. Low, Medium, and High only
B. Low, Medium, and Unknown only
C. Low, Medium, High, and Unknown only
D. Low, Medium, High, Unknown, and ' ' (missing character value)

• The contents of the raw data file EMPLOYEE are listed below:

    --------10-------20-------30
    Ruth  39 11
    Jose  32 22
    Sue   30 33
    John  40 44

The following SAS program is submitted:

    data test;
       infile 'employee';
       input employee_name $ 1-4;
       if employee_name = 'Ruth' then input idnum 10-11;
       else input age 7-8;
    run;

Which one of the following values does the variable IDNUM contain when the name of the employee is "Ruth"?
A. 11
B. 22
C. 32
D. . (missing numeric value)

• The contents of the raw data file EMPLOYEE are listed below:

    --------10-------20-------30
    Ruth  39 11
    Jose  32 22
    Sue   30 33
    John  40 44

The following SAS program is submitted:

    data test;
       infile 'employee';
       input employee_name $ 1-4;
       if employee_name = 'Sue' then input age 7-8;
       else input idnum 10-11;
    run;

Which one of the following values does the variable AGE contain when the name of the employee is "Sue"?
A. 30
B. 33
C. 40
D. . (missing numeric value)

• The following SAS program is submitted:

    libname sasdata 'SAS-data-library';
    data test;
       set sasdata.chemists;
       if jobcode = 'Chem2' then description = 'Senior Chemist';
       else description = 'Unknown';
    run;

A value for the variable JOBCODE is listed below:

    JOBCODE
    chem2

Which one of the following values does the variable DESCRIPTION contain?
A. Chem2
B. Unknown
C. Senior Chemist
D. ' ' (missing character value)

• The following SAS program is submitted:

    libname sasdata 'SAS-data-library';
    data test;
       set sasdata.chemists;
       if jobcode = 'chem3' then description = 'Senior Chemist';
       else description = 'Unknown';
    run;

A value for the variable JOBCODE is listed below:

    JOBCODE
    CHEM3

Which one of the following values does the variable DESCRIPTION contain?
A. chem3

html'. Pass C.shoes. 1 B.html'. proc print data = sasuser. Which one of the following ODS statements completes the program and sends the report to an HTML file? A. ods html = 'sales. run. run. define exam / display format = score.100 = 'Pass'. END B. ods file html = 'sales. C. run. run. ods html file = 'sales. 5 D. The following SAS program is submitted: proc format. 'Boot'). What is the page number on the first page of the report generated by the MEANS procedure step? A.houses. 2 C. 'Slipper' . Unknown C.courses nowd.50 = 'Fail' 51 .html'. Senior Chemist D. 6 • • . CLOSE The following SAS program is submitted: proc means data = sasuser. . How will the EXAM variable value be displayed in the REPORT procedure output? A.5 D. (missing numeric value) The following SAS program is submitted: options pageno = 1. 50.. The variable EXAM has a value of 50. Fail B. proc report data = work. STOP D. value score 1 . where product in ('Sandal' . D. proc means data = sasuser. ods file = 'sales. column exam. QUIT C.5.shoes.html'.• • B. The report created by the PRINT procedure step generates 5 pages of output. B. run. ' ' (missing character value) Which one of the following ODS statement options terminates output being written to an HTML file? A.

B. NONUM B. TODAY D.shoes.2.2 option to the MEANS procedure statement. Add the option FORMAT = 7. non-missing numeric variable values only B. missing numeric variable values and non-missing numeric variable values only C. NONUMBER D. NOPAGE C.• • • • • Which one of the following SAS system options displays the time on a report? A. in the MEANS procedure step.houses std mean max. Which one of the following is needed to display the standard deviation with only two decimal places? A. Sales Report for Last Month All Products All Regions All Figures in Thousands of Dollars The following SAS program is submitted: proc means data = sasuser. Sales Report for Last Month All Products C. non-missing character variables and non-missing numeric variable values only . in the MEANS procedure step. Which one of the following contains the footnote text that is displayed in the report? A. footnote2 'Selected Products Only'.2. footnote3 'All Regions'. TIME B. C. footnote2 'All Products'. var sqfeet. Add the statement MAXDEC = 7. DATETIME Which one of the following SAS system options prevents the page number from appearing on a report? A. D. run. DATE C. Add the option MAXDEC = 2 to the MEANS procedure statement. Unless specified. Add the statement FORMAT STD 7. NOPAGENUM The following SAS program is submitted: footnote1 'Sales Report for Last Month'. which variables and data values are used to calculate statistics in the MEANS procedure? A. footnote4 'All Figures in Thousands of Dollars'. run. All Products All Regions All Figures in Thousands of Dollars D. All Products B. proc print data = sasuser.

where price lt 60000. A realtor has two customers.0 64000 3 3.5 127150 2 2. and non-missing numeric variable values The following SAS program is submitted: proc sort data = sasuser.5 80050 3 2.5 79350 4 2. non-missing character variables. C. var style bedrooms baths price. id style. run. One customer wants to view a list of homes selling for less than $60. Click on the Exhibit button to view the report produced. missing numeric variable values.houses. by style. B. D.0 107250 2 1.0 69250 4 2. where price lt 60000 or price gt 100000. id style.houses.0 86650 3 1.0 65850 4 3. Assuming the PRICE variable is numeric. by style.0 89100 1 1. by style. run. proc print data = sasuser. var bedrooms baths price. run. proc print data = sasuser. C.houses out = houses. • .000.000. The other customer wants to view a list of homes selling for greater than $100. proc print data = sasuser.• D. B. run.0 110700 RANCH 2 1.5 73650 TWOSTORY 4 3. missing character variables.5 102950 Which of the following SAS statement(s) create(s) the report? A.0 55850 2 1.0 94450 3 1.houses. id style. var style bedrooms baths price. which one of the following PRINT procedure steps will select all desired observations? A. id style. style bedrooms baths price CONDO 2 1.0 34550 SPLIT 1 1. where price gt 100000. proc print data = houses.

1 observations and 4 variables C.2 The SAS data set SASUSER. comma8.houses label = "Sale Price". 0 observations and 0 variables B. proc print data = sasuser.0718 DirectBank 0.00 in a report? A.700. run. proc print data = sasuser. Which one of the following represents how many observations and variables will exist in the SAS data set NEWBANK? A. end.• • • • where price lt 60000 and price gt 100000.houses label. 9 observations and 2 variables The following SAS program is submitted: data work. proc print data = sasuser.houses. .0721 VirtualDirect 0.HOUSES contains a variable PRICE which has been assigned a permanent label of "Asking Price". dollar11.clients. do while (calls le 6).houses. run. B. C. 3 observations and 3 variables D. do year = 1 to 3. label price = "Sale Price". run. run.2 B. run. dollar8. proc print data = sasuser.0728 The following SAS program is submitted: data newbank.2 C. comma11. Which one of the following SAS programs temporarily replaces the label "Asking Price" with the label "Sale Price" in the output? A. label price = "Sale Price".houses label. run. set banks. Which one of the following SAS formats is used to display the value as $110. label price "Sale Price". D. capital + 5000.2 D. run. D. proc print data = sasuser. The value 110700 is stored in a numeric variable. The SAS data set BANKS is listed below: BANKS name rate FirstCapital 0. where price lt 60000 or where price gt 100000. calls = 6.

do while (n lt 6). Which one of the following is the value of the variable N in the output data set? A. end. 5 C. end.• • • calls + 1.put(date.put(date. 0 B. Which one of the following is the value of the variable CALLS in the output data set? A.sales. do month = 1 to 12. duration = today( ) . end.). 6 D.ddmmyy10. 5 C. Which one of the following represents how many observations are written to the WORK. run. do year = 1 to 5.SALES data set? A. infile 'file-specification'. run. 6 D. x + 1. n + 1.yymmdd10. B. end. Which one of the following statements completes the program above and computes the duration of the project in days as of today's date? A.10. 1 C.pieces. 60 A raw data record is listed below: --------10-------20-------30 1999/10/25 The following SAS program is submitted: data projectduration. duration = today( ) . run. 4 B.). 7 The following SAS program is submitted: data work. run. 7 The following SAS program is submitted: data work. 4 B. . 5 D. input date $ 1 .

run.yymmdd10. The following SAS program is submitted: data work. 8 bytes D. character. department = trim(dept) number. total = . run. run. C. Product_Number = 5461. duration = today( ) . Item = '1001'.retail. Which one of the following represents the type and length of the variable DATE in the output data set? A.input(date. The value can not be determined as the program fails to execute due to errors. '2000' • • • .).11. infile 'file-specification'.3. Which one of the following is the value of the variable TOTAL in the output data set? A. numeric.).ddmmyy10. duration = today( ) . 1001/5461 B.• C.). cost = '20000'. The following SAS program is submitted: data work.) || input(number. 10 bytes The following SAS program is submitted: data work. 10 bytes C. character. department = input(dept. D. run. date = put('13mar2000'd. Which one of the following is the value of the variable ITEM_REFERENCE in the output data set? A. D.3. (missing numeric value) D. Which one of the following SAS statements completes the program and results in a value of 'Printing750' for the DEPARTMENT variable? A.). Item_Reference = Item'/'Product_Number.15.).3. .).ddmmyy10.10 * cost. department = trim(dept) || put(number. 1001/ 5461 C. department = dept input(number. input dept $ 1 . B.11 number 13 .month. A raw data record is listed below: --------10-------20-------30 Printing 750 The following SAS program is submitted: data bonus. 2000 B. 8 bytes B.input(date.products. numeric.

1). City_Country = substr(First. T B.' .' . run. The following SAS program is submitted: data work. Author = 'Christie. 15 D. 6 B. C. ' ' (missing character value) The following SAS program is submitted: data work.test. Author = 'Agatha Christie'.7)!!'. D.'). 6 C.test. C C.num4). 7 . (missing numeric value) D.• • • • • C.test. of C. average = mean(num1 . Which one of the following is the length of the variable FIRST in the output data set? A.1. 200 The following SAS program is submitted: data work.1.'). England'. average = mean(of num1 . 1 B.' .1. B. average = mean(num1 num2 num3 num4). Title = 'A Tale of Two Cities. Which one of the following is the value of the variable WORD in the output data set? A. Agatha D. First = substr(scan(author. Dickens'. . run. A B. Agatha'. Which one of the following is the length of the variable CITY_COUNTRY in the output data set? A. ' ' (missing character value) Which one of the following SAS statements correctly computes the average of four numerical values? A.2. ' ' (missing character value) The following SAS program is submitted: data work.1.'). Word = scan(title. Which one of the following is the value of the variable FIRST in the output data set? A.1). average = mean(of num1 to num4).test. '!!'England'. First = 'Ipswich. First = substr(scan(author. run. run. Charles J.num4).3. Dickens D.

England C. C. 1941 1 The following SAS program is submitted and references the raw data file above: data coins. First = 'Ipswich. 17 D. Ipswich . It has no effect on variables read with the SET. sum totquantity. retain totquantity 0. City_Country = City!!'. B. England Which one of the following is true of the RETAIN statement in a SAS DATA step program? A. It can be used to assign an initial value to _N_ . A raw data file is listed below: --------10-------20-------30 1901 2 1905 1 1910 6 1925 . Which one of the following completes the program and produces a non-missing value for the variable TOTQUANTITY in the last observation of the output data set? A. totquantity + quantity. Ipswich!! B. Ipswich. C. run. It is only valid in conjunction with a SUM function. run. totquantity = totquantity + quantity.• • • C. totquantity 0.10 apples 2. A raw data file is listed below: --------10-------20-------30 squash 1. Ipswich. '!!'England'. City = substr(First. input year quantity.test.69 • . B. It adds the value of an expression to an accumulator variable and ignores missing values. D.1. D. 'England' D. 25 The following SAS program is submitted: data work.25 juice 1. infile 'file-specification'. Which one of the following is the value of the variable CITY_COUNTRY in the output data set? A.7). totquantity = sum(totquantity + quantity). England'. MERGE and UPDATE statements.

total. set work.department. run.total. input item $ cost. FIRST.DEPARTMENT and LAST.salary(keep = department wagerate). The BY statement in the DATA step causes a syntax error.cost). by department.SALARY data set.salary(keep = department wagerate). ANSWERS : • • . 5 B. if last. The following SAS program is submitted: data work. currently ordered by DEPARTMENT. 100 D. contains 100 observations for each of 5 departments.SALARY.cost). 500 The following SAS program is submitted: data work. set work.DEPARTMENT are variables in the WORK. C. C.department then payroll = 0. if first.department then payroll = 0.SALARY contains 10 observations for each department. if first. if last. The values of the variable PAYROLL represent a total for all values of WAGERATE in the WORK. B. by department. grandtot = sum cost. infile 'file-specification'. output grandtot.TOTAL data set. 20 C. Which one of the following completes the program and produces a grand total for all COST values? A. grandtot = sum(grandtot. The SAS data set named WORK. Which one of the following is true regarding the program above? A. grandtot = sum(grandtot.The following SAS program is submitted using the raw data file above: data groceries.SALARY data set. retain grandtot 0. The SAS data set WORK. run. The values of the variable PAYROLL represent the total for each department in the WORK.department. payroll + wagerate.cost). D. Which one of the following represents how many observations the WORK. run. grandtot = sum(grandtot. currently ordered by DEPARTMENT. payroll + wagerate. B.TOTAL data set contains? A. D.

 1: d   2: a   3: c   4: d   5: d   6: b   7: b   8: b   9: d  10: d
11: b  12: a  13: b  14: d  15: d  16: b  17: b  18: d  19: d  20: c
21: d  22: b  23: c  24: b  25: a  26: a  27: c  28: b  29: d  30: c
31: b  32: d  33: c  34: b  35: d  36: c  37: d  38: d  39: a  40: b
41: d  42: a  43: b  44: d  45: d  46: c or d  47: a  48: c  49: a  50: d or c

· What SAS statements would you code to read an external raw data file to a DATA step?
The INFILE statement (it identifies the file that the INPUT statement then reads).

· How do you read in the variables that you need?
Using the INPUT statement with column pointers such as @5 or column ranges such as 12-17.

· Are you familiar with special input delimiters? How are they used?
DLM and DSD are the delimiters that I've used. They should be included in the INFILE statement. Comma-separated values (CSV) files are a common type of file read with the DSD option. DSD treats two delimiters in a row as a missing value and ignores delimiters enclosed in quotation marks.

· If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?
By using the MISSOVER option in the INFILE statement. If some data lines are shorter than others, use the TRUNCOVER option in the INFILE statement.

· What is the difference between an informat and a format? Name three informats or formats.
Informats read the data. Formats write the data. Formats can be the same as informats.
Informats: COMMAw., DOLLARw., DATEw., MMDDYYw., TIMEw., PERCENTw.
Formats: WORDDATE18., WEEKDATEw.

· Name and describe three SAS functions that you have used, if any?
LENGTH: returns the length of an argument, not counting trailing blanks (missing values have a length of 1).
  Ex: a='my cat'; x=LENGTH(a); Result: x=6
SUBSTR: SUBSTR(arg, position, n) extracts a substring from an argument starting at 'position' for 'n' characters, or until the end if 'n' is omitted.
  Ex: a='(916)734-6241'; x=SUBSTR(a,2,3); Result: x='916'
TRIM: removes trailing blanks from a character expression.
  Ex: a='my '; b='cat'; x=TRIM(a)||b; Result: x='mycat'
SUM: sum of non-missing values.
  Ex: x=SUM(3,5,1); Result: x=9
INT: returns the integer portion of the argument.

· How would you code the criteria to restrict the output to be produced?
Use the NOPRINT option.

· What is the purpose of the trailing @ and the @@? How would you use them?
A single trailing @ holds the current record for the next INPUT statement in the same iteration of the DATA step. A double trailing @@ holds the record across iterations of the DATA step, until the end of the line is reached.

Double trailing @@: When you have multiple observations per line of raw data, use the double trailing sign (@@) at the end of the INPUT statement. The line-hold specifier is like a stop sign telling SAS, "stop, hold that line of raw data".

Ex:
  data dsn;
    input sex $ days;
    cards;
  F 53
  F 56
  F 60
  F 60
  F 78
  F 87
  F 102
  F 117
  F 134
  F 160
  F 277
  M 46
  M 52
  M 58
  M 59
  M 77
  M 78
  M 80
  M 81
  M 84
  M 103
  M 114
  M 115
  M 133
  M 134
  M 175
  M 175
  ;
  run;

The above program can be made shorter using @@:
  data dsn;
    input sex $ days @@;
    cards;
  F 53 F 56 F 60 F 60 F 78 F 87 F 102 F 117 F 134 F 160 F 277
  M 46 M 52 M 58 M 59 M 77 M 78 M 80 M 81 M 84 M 103 M 114
  M 115 M 133 M 134 M 175 M 175
  ;
  run;

Trailing @: By using @ without specifying a column, it is as if you are telling SAS, "Stay tuned for more information. Don't touch that dial". SAS will hold the line of data until it reaches either the end of the DATA step or an INPUT statement that does not end with a trailing @.

· Under what circumstances would you code a SELECT construct instead of IF statements?
When you have a long series of mutually exclusive conditions and the comparison is numeric, using a SELECT group is slightly more efficient than using IF-THEN or IF-THEN-ELSE statements because CPU time is reduced.
SELECT group:
  SELECT: begins a SELECT group.
  WHEN: identifies SAS statements that are executed when a particular condition is true.
  OTHERWISE (optional): specifies a statement to be executed if no WHEN condition is met.
  END: ends a SELECT group.
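The double trailing @ and the SELECT group described above can be tried together in one short step. This is only a sketch: the data values are a small made-up subset, and the variable GROUP and its cut-offs (60 and 100 days) are hypothetical names chosen for illustration.

```sas
/* Sketch: read several sex/days pairs per line with @@,           */
/* then classify each observation with a SELECT group.             */
/* GROUP and its cut-offs are illustrative assumptions.            */
data dsn;
  length group $ 6;
  input sex $ days @@;          /* @@ holds the line across iterations */
  select;
    when (days < 60)  group = 'short';
    when (days < 100) group = 'medium';
    otherwise         group = 'long';   /* runs when no WHEN is true */
  end;
  datalines;
F 53 F 56 F 60 F 78 F 102
M 46 M 52 M 114
;
run;

proc print data=dsn;
run;
```

Without the @@, each INPUT would consume a whole line and only the first pair on each line would be read.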

· What statement do you code to tell SAS that it is to write to an external file? What statement do you code to write the record to the file?
The FILE statement names the external file and the PUT statement writes the record.

· If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?
Use PUT _INFILE_.

· If you're not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set?
DATA _NULL_

· What is the one statement to set the criteria of data that can be coded in any step?
The WHERE statement, which subsets data in both DATA and PROC steps. (The OPTIONS statement, by contrast, is a global statement that affects all steps that follow it.)

· Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.

· How would you include common or reuse code to be processed along with your statements?
By using SAS macros.

· When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?
INDEX, which returns the position of a substring. (SCAN extracts the nth word instead, and INDEXC searches for any one of several individual characters.)

· If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?
Use the KEEP= data set option or the KEEP statement.

· Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.
  proc sort data=one;
    by State District County;
  run;

· How would you delete duplicate observations?
Use the NODUPLICATES option (also written NODUPRECS or NODUP) in PROC SORT.
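To see how the two PROC SORT de-duplication options differ, here is a minimal sketch. The data set ONE, its SALES variable, and the output names BY_KEY and BY_REC are hypothetical; only the BY variables come from the question above.

```sas
/* Hypothetical input: two identical rows and one row that shares */
/* the key but differs in SALES.                                  */
data one;
  input State $ District County Sales;
  datalines;
CA 1 10 500
CA 1 10 500
CA 1 10 750
NY 2 20 300
;
run;

/* NODUPKEY: keep only the first observation for each BY key */
proc sort data=one out=by_key nodupkey;
  by State District County;
run;

/* NODUPRECS: drop an observation only when it matches the   */
/* previous observation on every variable                    */
proc sort data=one out=by_rec noduprecs;
  by State District County;
run;
```

With this input, BY_KEY should keep two observations (one per key) and BY_REC should keep three, because only the exact repeat of the first row is dropped.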

· How would you delete observations with duplicate keys?
NODUPKEY

· How would you code a merge that will keep only the observations that have matches from both sets?
Check the IN= variables with an IF statement while merging the data sets.

· How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data set to a second, and the non-matches from the right-most data set to a third?
Step 1: Define 3 data sets in the DATA statement.
Step 2: Assign the IN= option to a different variable for each of the 2 input data sets.
Step 3: Check the condition using IF statements and output the matches to the first data set and the non-matches to the other data sets.
Ex (matches only):
  data xxx;
    merge yyy(in = inyyy) zzz(in = inzzz);
    by aaa;
    if inyyy = 1 and inzzz = 1;
  run;

· What is the Program Data Vector (PDV)? What are its functions?
Function: to store the current observation. The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When SAS processes a DATA step it has two phases: the compilation phase and the execution phase. During the compilation phase the input buffer is created to hold a record from the external file. After the input buffer is created, the PDV is created. The PDV contains two automatic variables, _N_ and _ERROR_.
The logical Program Data Vector is a set of buffers that includes all variables referenced either explicitly or implicitly in the DATA step. It is created at compile time, then used at execution time as the location where the working values of variables are stored as they are processed by the DATA step program (source: http://www2.sas.com/proceedings/sugi24/Posters/p235-24.pdf).

· Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.
SAS compiles the code.

· At compile time when a SAS data set is read, what items are created?
Automatic variables, the input buffer, the PDV and descriptor information.

· Name statements that are recognized at compile time only?
Declarative statements such as KEEP, DROP, RENAME, LABEL, FORMAT, INFORMAT, LENGTH, RETAIN, ARRAY and BY. (PUT is an executable statement, not a compile-time one.)

· Name statements that are execution only.
INFILE, INPUT

· Identify statements whose placement in the DATA step is critical.
DATA, INPUT, RUN

· Name statements that function at both compile and execution time.
INPUT

· In the flow of DATA step processing, what is the first action in a typical DATA step?
The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.

· What is _n_?
It is a data counter variable in SAS, an implicit variable created by SAS during data processing. _N_ indicates the number of times SAS has looped through the DATA step. This is not necessarily equal to the observation number, since a simple subsetting IF statement can change the relationship between the observation number and the number of iterations of the DATA step. It is available only in the DATA step and not in PROCs.
Note: both _N_ and _ERROR_ are always available to you in the DATA step. The _ERROR_ variable has a value of 1 if there is an error in the data for that observation and 0 if there is not.
Ex: to pick every third record in a data set (observations 1, 4, 7, ...), use _N_ as follows:
  data new;
    set old;
    if mod(_n_, 3) = 1;
  run;
Note: if a WHERE clause is used to subset the data, _N_ will not yield the required result.

How do I convert a numeric variable to a character variable?
You must create a differently-named variable using the PUT function.

How do I convert a character variable to a numeric variable?
You must create a differently-named variable using the INPUT function.

How can I compute the age of something?

Given two SAS date variables born and calc:
  age = int(intck('month', born, calc) / 12);
  if month(born) = month(calc) then age = age - (day(born) > day(calc));

How can I compute the number of months between two dates?
Given two SAS date variables begin and end:
  months = intck('month', begin, end) - (day(end) < day(begin));

How can I determine the position of the nth word within a character string?
Use a combination of the INDEXW and SCAN functions:
  pos = indexw(string, scan(string, n));

I need to reorder characters within a string... use SUBSTR?
You can do this using only one function call with TRANSLATE versus two function calls with SUBSTR. The following lines each move the first character of a 4-character string to the last:
  reorder = translate('2341', string, '1234');
  reorder = substr(string, 2, 3) || substr(string, 1, 1);

How can I put my SAS date variable so that December 25, 1995 would appear as '19951225'? (with no separator)
Use a combination of the YEAR. and MMDDYY. formats to simply display the value:
  put sasdate year4. sasdate mmddyy4.;
or use a combination of the PUT and COMPRESS functions to store the value (YYMMDD10. writes the date with separators, which COMPRESS then removes):
  newvar = compress(put(sasdate, yymmdd10.), '-/');

How can I put my SAS time variable with a leading zero for hours 1-9?
Use a combination of the Z. and MMSS. formats:
  hrprint = hour(sastime);
  put hrprint z2. ':' sastime mmss5.;

INFILE OPTIONS
Prepared by Sreeja E V (sreeja@kreara.com), source: kreara.blogspot.com
INFILE has a number of options available.

FLOWOVER
FLOWOVER is the default option on the INFILE statement. Here, when the INPUT statement reaches the end of the non-blank characters without having filled all variables, a new line is read into the input buffer and INPUT attempts to fill the rest of the variables starting from column one. The next time an INPUT statement is executed, a new line is brought into the input buffer.
Consider the following text file containing three variables: id, type and amount.
  11101 A
  11102 A 100
  11103 B 43
  11104 C
  11105 C 67
The following SAS code uses the FLOWOVER option, which reads the next non-missing values for the missing variables:
  data B;
    infile "External file" flowover;
    input id $ type $ amount;
  run;
This creates a data set with three observations, because each short line borrows its amount from the start of the following line:
  11101 A 11102
  11103 B 43
  11104 C 11105

MISSOVER
When INPUT reads a short line, the MISSOVER option on the INFILE statement does not allow it to move to the next line. MISSOVER sets all the variables without values to missing.
  data B;
    infile "External file" missover;
    input id $ type $ amount;
  run;
This creates a data set with five observations, with amount missing for ids 11101 and 11104.

TRUNCOVER
Causes the INPUT statement to read variable-length records where some records are shorter than the INPUT statement expects. Variables which are not assigned values are set to missing.

Difference between TRUNCOVER and MISSOVER
Both will assign missing values to variables if the data line ends before the variable's field starts. But when the data line ends in the middle of a variable field, TRUNCOVER will take as much as is there, whereas MISSOVER will assign the variable a missing value.
Consider the text file below containing a character variable chr.
  a
  bb
  ccc
  dddd
  eeeee
  ffffff
Consider the following SAS code:
  data trun;
    infile "External file" truncover;
    input chr $3.;
  run;
With the TRUNCOVER option the data set contains a, bb, ccc, ddd, eee, fff.
  data miss;
    infile "External file" missover;
    input chr $3.;
  run;
With the MISSOVER option the first two values are missing, because the lines "a" and "bb" end before the 3-character field is complete; the remaining values are ccc, ddd, eee, fff.

What SAS statements would you code to read an external raw data file to a DATA step?
We use these SAS statements:
FILENAME: to specify the location of the file
INFILE: identifies an external file to read with an INPUT statement
INPUT: to specify the variables that the data is identified with

How do you read in the variables that you need?
Using the INPUT statement with column/line pointers, informats and length specifiers.

Are you familiar with special input delimiters? How are they used?
DLM and DSD are the special input delimiters.
DELIMITER= delimiter(s) specifies an alternate delimiter (other than a blank) to be used for LIST input.
DSD (delimiter-sensitive data) specifies that when data values are enclosed in quotation marks, delimiters within the value be treated as character data. The DSD option changes how SAS treats delimiters when you use LIST input and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value and removes quotation marks from character values.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146932.htm#a000177189

If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?
Use the MISSOVER and TRUNCOVER options.
MISSOVER prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
TRUNCOVER overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects.
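The MISSOVER and TRUNCOVER behavior described above only shows up with a real external file, because in-stream data lines are padded with blanks. The sketch below therefore first writes a small temporary file. The fileref DEMO and the four sample lines are assumptions made for illustration, and the result assumes a host where PUT writes variable-length lines without trailing blanks.

```sas
/* Sketch: compare TRUNCOVER and MISSOVER on short records.   */
/* DEMO is a hypothetical fileref pointing at a temp file.    */
filename demo temp;

data _null_;
  file demo;          /* write four lines of increasing length */
  put 'a';
  put 'bb';
  put 'ccc';
  put 'dddd';
run;

data trun;
  infile demo truncover;   /* take whatever part of the field exists */
  input chr $3.;
run;

data miss;
  infile demo missover;    /* incomplete field becomes missing */
  input chr $3.;
run;

proc print data=trun; run;
proc print data=miss; run;
```

Under those assumptions TRUN should hold a, bb, ccc, ddd, while MISS should hold two missing values followed by ccc and ddd.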

com/onlinedoc/913/getDoc/en/lrdict. Wordatew.php How would you code the criteria to restrict the output to be produced? In view of in-sufficient clarity as to what the interviewer refers to – Global statement – options obs=. http://support.hlp/a000178212 . outobs= for SQL select Proc datasets – NOLIST option What is the purpose of the trailing @ and the @@? How would you use them? Line-hold specifiers keep the pointer on the current input record when a data record is read by more than one INPUT statement (trailing @) .• the INPUT statement automatically reads the next input data record.hlp/a000146932 . dollarw.lowcase Arithmetic functions – Sum / abs / Attribute info functions – Attrn / length Dataset – open / close / exist Directory – dexist / dopen / dclose / dcreate / dinfo File functions – fexist / fopen/ filename / fileref SQL functions – coalesce / count / sum/ mean Date functions – date / today / datdif / datepart / datetime / intck / mdy Array functions – dim http://sastechies. Dataset options – obs= Proc SQL – NOPRINT option for reporting / inobs= . http://support. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. INFORMAT Statement – Associates informats with variables It’s basically used in an input / SQL create table statements to read external file raw data or data that is not in a SAS format.htm Name and describe three SAS functions that you have used. Variables without any values assigned are set to missing.. mmddyyw.htm eg: commaw. FORMAT Statement Associates formats with variables It’s basically used in a datastep format / SQL select / Procedure format statements to output SAS data to a file/report etc Formats can look-like informats but are differentiated as to which statement they are used in… eg.htm#a000177189 What is the difference between an informat and a format? Name three informats or formats.com/SASfunctions.. Datew. http://support. 
Worddatew.php” title=”http://sastechies.sas. if any? The most common functions that would be used areConversion functions – Input / Put / int / ceil / floor Character functions – Scan / substr / index / Left / trim / compress / cat / catx / upcase.com/onlinedoc/913/getDoc/en/lrdict.com/onlinedoc/913/getDoc/en/lrdict.sas.sas. datew.hlp/a000178244 .com/ SASfunctions. $varyinglengthw.

No new record is read into the input buffer. an INPUT statement without a trailing @ executes the next iteration of the DATA step begins. When you use a trailing @. SAS releases the record that is held by a double trailing @ immediately if the pointer moves past the end of the input record immediately if a null INPUT statement executes: input. Normally. Then the input point er move s • 10 3 . when you use a double trailing @ (@@).• • • • • • • • • • • one input line has values for more than one observation (double trailing @) a record needs to be reread on the next iteration of the DATA step (double trailing @). each INPUT statement in a DATA step reads a new data record into the input buffer. Use a single trailing @ to allow the next INPUT statement to read from the same record. SAS releases a record held by a trailing @ when a null INPUT statement executes: input. The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one. when the next iteration of the DATA step begins if an INPUT statement with a single trailing @ executes later in the DATA step: input @. A record held by the double trailing at sign (@@) is not released until >—-+—-10–V+the input 10 9 7 point 2 2 8 er 84 23 36 75 move s past the end of the recor d. the following occurs: The pointer position does not change. the INPUT statement for the next iteration of the DATA step continues to read the same record. Normally. Use a double trailing @ to hold a record for the next INPUT statement across iterations of the DATA step.

• • 5. output. input ID $4. run.560.276. the single @ also releases a record when control returns to the top of the DATA step for the next iteration. if type='H' then input @3 Address $15. @. retain Address. infile census.934.908. run.323.people (drop=type).34 2. 3. 2. end.38 1009 2.12 3.176. @.81 data perm. • input ID $4. enables the next INPUT statement to read from the same record releases the current record when a subsequent INPUT statement executes without a line-hold specifier. @... input type $1.down to the next recor d. ment witho ut a linehold specif ier execu tes. do Quarter=1 to 4. @@.41 4. 4 34 85 65 0943 1. if type='P'.472. @15 Gender $1. @13 Age 3. an . data perm. INPUT .581.18 7. infile data97 missover.308. input Sales : comma.34 52 . Raw Data File Data97 >—-V—-10—+—-20—+—-30—+—-40 073 1. state input Department 5. input @3 Name $10. Unlike the @@.sales97..

An END statement ends a SELECT group. An optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is met. infile census. if type='H' then do. @. MAIN P ST P MARY E 21 P F H WILLIAM M 23 P M P SUSAN K 3 P F P 324 S. Use at least one WHEN statement in a SELECT group. retain Address. input Address $ 3-17. if _n_ > 1 then output. MAIN ST MARGO K 27 F WILLIAM R 27 M P ROBERT W 1 M Under what circumstances would you code a SELECT construct instead of IF statements? The SELECT statement begins a SELECT group. SELECT groups contain WHEN statements that identify SAS statements that are executed when a particular condition is true. input type $1. >—-+—-10—+—-20 H 321 S. MAIN P ST H THOMAS H P 79 M P WALTER S 46 H M P ALICE A 42 F P MARYANN A 20 F JOHN S 16 M 325A S. . MAIN ST JAMES L 34 M LIZA A 31 F 325B S.residnts. end.>V—+—-10—+—H 321 S. Total=0. else if type='P' then total+1. MAIN ST P MARY E 21 F P WILLIAM M 23 P M SUSAN K 3 F data perm.
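As an illustration of the SELECT group just described, here is a minimal sketch. The data set name GRADES, the variables and the cut-off values are all hypothetical and chosen only for the example; it shows a WHEN with a null statement and an OTHERWISE branch.

```sas
/* Hypothetical example: classify a score with a SELECT group. */
data grades;
   input name $ score;
   length grade $ 9;
   select;
      when (score >= 90) grade = 'Excellent';
      when (score >= 60) grade = 'Pass';
      when (missing(score)) ;       /* null statement: condition is recognised, no action */
      otherwise grade = 'Fail';     /* executed when no WHEN condition is true */
   end;
   datalines;
Ann 95
Bob 72
Cid .
Dee 40
;
run;

proc print data=grades;
run;
```

Without the OTHERWISE statement, an observation that satisfies no WHEN condition would cause SAS to issue an error, which is why a null OTHERWISE is often coded as a safety net.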

Null statements that are used in WHEN statements cause SAS to recognize a condition as true without taking further action. Null statements that are used in OTHERWISE statements prevent SAS from issuing an error message when all WHEN conditions are false. Using SELECT-WHEN improves processing efficiency and understandability in programs that need to check a series of conditions for the same variable. Use IF-THEN/ELSE statements for programs with few statements. Using a subsetting IF statement without a THEN clause could be dangerous because it would process only those records that meet the condition specified in the IF clause.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201966.htm

What statement do you code to tell SAS that it is to write to an external file?

FILENAME / FILE / PUT
The FILENAME statement is an optional statement that specifies the location of the external file. The FILE statement specifies the current output file for PUT statements in the DATA step. When multiple FILE statements are present, the PUT statement builds and writes output lines to the file that was specified in the most recent FILE statement. If no FILE statement was specified, the PUT statement writes to the SAS log. The specified output file must be an external file, not a SAS data library, and it must be a valid access type. The PUT statement writes the variable values to the external file.

If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?

Use the _infile_ option in the PUT statement:

   filename some 'c:\cool.dat';
   filename cool1 'c:\cool1.dat';
   data _null_;
      infile some;
      input some;
      file cool1;
      put _infile_;
   run;

Q. Which SAS statement below will change the characteristics of a variable if it was used in a data step?
A. SCAN
B. ATTRIB
C. FORMAT
D. PUT
E. ARRAY

(2) The following SAS program is submitted:

   libname sasdata 'SAS-data-library';
   data test;
      set sasdata.chemists;
      if jobcode = 'chem3' then description = 'Senior Chemist';
      else description = 'Unknown';
   run;

A value for the variable JOBCODE is listed below:
   JOBCODE
   CHEM3
Which one of the following values does the variable DESCRIPTION contain?
A. chem3
B. Unknown
C. Senior Chemist
D. ' ' (missing character value)

Predictive Modeling Certification Question:

(1) Which SAS Enterprise Miner tool would you use to exclude certain observations in your data source, such as extreme outliers, from your analysis?
a. The Input Data tool
b. The Filter tool
c. The Data Partition tool
d. The Explore window

(2) Which SAS Enterprise Miner tool can be used to automatically explore alternative network architectures and hidden unit counts?
a. AutoNeural
b. DMNeural
c. Neural Network
d. Rule Induction

(3) Which of the following statements about assessing model performance using the Model Comparison tool is true?
• The Model Comparison tool appears on the Explore tab.
• Unless a profit matrix is defined, the Model Comparison tool selects the model with the smallest validation misclassification rate by default.
• For all fit statistics that the Model Comparison tool generates, the highest value indicates the best fit.
• The Model Comparison tool calculates values for up to three statistics at a time.

Interview Question:

Q1. When reading data, if you do not specify the length of a variable, what is the default length?

Q2. In which method, MERGE with a BY statement or the SQL procedure, will you get a warning message when combining common variables from data sets unless you select variables from individual data sets?

Q3. Can the trailing @ control be used in the list, column, or formatted INPUT statement?

Q4. Mr Raj, a SAS programmer, has written the following code:
   data iisastr;
      set iisastr_delhi;
      if age>25 then drop sex;
   run;
Do you think the above program will run without any error or not? Please answer with a reason.

Q5. Ms Lily has written the following code:
   data iisastr; [statements missing in source] run;
Do you think that, during the execution phase of the above program, it will create an input buffer to store the values of the observation?

Q6. Can a WHERE statement be applied to DATA steps with an INPUT statement?

How might you use MOD and INT on numerics to mimic SUBSTR on character strings?

A) The first argument to the MOD function is a numeric, the second is a non-zero numeric; the result is the remainder when argument-1 is divided by argument-2. The INT function takes only one argument and returns the integer portion of the argument, truncating the decimal portion. Note that the argument can be an expression.

   DATA NEW;
      A = 123456;
      X = INT( A/1000 );
      Y = MOD( A, 1000 );
      Z = MOD( INT( A/100 ), 100 );
      PUT A= X= Y= Z=;
   RUN;

Result: A=123456 X=123 Y=456 Z=34

In ARRAY processing, what does the DIM function do?

A) DIM: It is used to return the number of elements in the array. When we use the DIM function we do not have to re-specify the stop value of an iterative DO statement if we change the dimension of the array.

How would you determine the number of missing or nonmissing values in computations?

A) To determine the number of missing values that are excluded in a computation, use the NMISS function.

   data _null_;
      m=.;
      y=4;
      z=0;
      N = N(m, y, z);
      NMISS = NMISS(m, y, z);
   run;

The above program results in N = 2 (number of nonmissing values) and NMISS = 1 (number of missing values).

Do you need to know if there are any missing values?

A) The MISSING function tests a single argument and returns 1 if it is missing and 0 if it is not, e.g. is_missing=MISSING(field1). To test several fields at once, check whether NMISS(field1,field2,field3) is greater than 0. If you need to know how many missing values you have, then use num_missing=NMISS(field1,field2,field3). You can also find the number of nonmissing values with non_missing=N(field1,field2,field3).

What is the difference between: x=a+b+c+d; and x=SUM(of a, b, c, d);?

A) Is anyone wondering why you wouldn't just use total=field1+field2+field3;? First, how do you want missing values handled? The SUM function returns the sum of nonmissing values. If you choose addition, you will get a missing value for the result if any of the fields are missing. Which one is appropriate depends upon your needs. However, there is an advantage to using the SUM function even if you want the results to be missing. If you have more than a couple of fields, you can often use shortcuts in writing the field names. If your fields are not numbered sequentially but are stored in the program data vector together, then you can use: total=SUM(of fielda--zfield);. Just make sure you remember the "of" and the double dashes, or your code will run but you won't get your intended results. MEAN is another function that will calculate differently from writing out the formula if you have missing values.

There is a field containing a date. It needs to be displayed in the format "ddmonyy" if it's before 1975, "dd mon ccyy" if it's after 1985, and as 'Disco Years' if it's between 1975 and 1985. How would you accomplish this in data step code? Using only PROC FORMAT.

   data new;
      input date ddmmyy10.;
      cards;
   01/05/1955
   01/09/1970
   01/12/1975
   19/10/1979
   25/10/1982
   10/10/1988
   27/12/1991
   ;
   run;

   proc format;
      /* a nested format must be written in square brackets */
      value dat low -< '01jan1975'd = [date7.]
                '01jan1975'd - '31dec1985'd = 'Disco Years'
                '01jan1986'd - high = [date9.];
   run;

   proc print;
      format date dat.;
   run;

In the following DATA step, what is needed for 'fraction' to print to the log?

   data _null_;
      x=1/3;
      if x=.3333 then put 'fraction';
   run;

What is the difference between calculating the 'mean' using the MEAN function and PROC MEANS?

A) By default PROC MEANS calculates summary statistics such as N, mean, standard deviation, minimum and maximum, whereas the MEAN function computes only the mean value.

What are some differences between PROC SUMMARY and PROC MEANS?

A) PROC MEANS by default gives you the output in the output window; you can stop this with the NOPRINT option and send the output to a separate data set with the OUTPUT OUT= statement. PROC SUMMARY, however, doesn't give default output; we have to explicitly give the OUTPUT statement, and then print the data by giving the PRINT option to see the result.

What is a problem with merging two data sets that have variables with the same name but different data?

A) Understanding the basic algorithm of MERGE will help you understand how the step processes. There are still a few common scenarios whose results sometimes catch users off guard. Here are a few of the most frequent 'gotchas':

1. BY variables have different lengths
It is possible to perform a MERGE when the lengths of the BY variables are different. But if the data set with the shorter version is listed first on the MERGE statement, the shorter length will be used for the length of the BY variable during the merge. Due to this shorter length, truncation occurs and unintended combinations could result. In Version 8, a warning is issued to point out this data integrity risk. The warning will be issued regardless of which data set is listed first:
   WARNING: Multiple lengths were specified for the BY variable name by input data sets. This may cause unexpected results.
Truncation can be avoided by naming the data set with the longest length for the BY variable first on the MERGE statement, but the warning message is still issued. To prevent the warning, check with PROC CONTENTS that the BY variables have the same length before combining them in the MERGE step. You can change the variable length with either a LENGTH statement in the merge DATA step prior to the MERGE statement, or by recreating the data sets to have identical lengths for the BY variables.

Note: When doing a MERGE we should not have MERGE and IF-THEN statements in one data step if the IF-THEN statement involves two variables that come from two different merging data sets. If it is not completely clear when MERGE and IF-THEN can be used in one data step and when they should not be, then it is best to simply always separate them into different data steps. Following the above recommendation will ensure an error-free merge result.

Which data set is the controlling data set in the MERGE statement?

A) The data set having the smaller number of observations controls the merge.

How do the IN= variables improve the capability of a MERGE?

A) The IN= variables. What if you want to keep in the output data set of a merge only the matches (only those observations to which both input data sets contribute)? SAS will set up for you special temporary variables, called the "IN=" variables, so that you can do this and more. Here's what you have to do:
• signal to SAS on the MERGE statement that you need the IN= variables for the input data set(s)
• use the IN= variables in the data step appropriately.
So to keep only the matches in the match-merge above, ask for the IN= variables and use them:

   data three;
      merge one(in=x) two(in=y);   /* x & y are your choices of names   */
      by id;                       /* for the IN= variables for data    */
      if x=1 and y=1;              /* sets one and two respectively     */
   run;

What techniques and/or PROCs do you use for tables?

A) Proc Freq, Proc Univariate, Proc Tabulate & Proc Report.

Do you prefer PROC REPORT or PROC TABULATE? Why?

A) I prefer to use PROC REPORT until I have to create cross-tabulation tables, because it gives me so many options to modify the look of my table (e.g. the WIDTH option, by which we can change the width of each column in the table), whereas PROC TABULATE is unable to produce some of the things I need in my table. Ex: TABULATE doesn't produce n (%) in the desirable format.

How experienced are you with customized reporting and use of DATA _NULL_ features?

A) I have very good experience in creating customized reports as well as with the DATA _NULL_ step. It's a DATA step that generates a report without creating a data set, so development time can be saved. The other advantage of DATA _NULL_ is that when we submit it, any compilation error in the statements is detected and written to the log, so the error can be found by checking the log after submitting. It is also used to create macro variables in the DATA step.

What is the effect of the OPTIONS statement ERRORS=1?

A) The _ERROR_ variable has a value of 1 if there is an error in the data for that observation and 0 if there is not. ERRORS= sets the maximum number of observations for which complete data-error messages are printed in the log, so ERRORS=1 prints the full message for the first such observation only.

What is the one statement to set the criteria of data that can be coded in any step?

A) OPTIONS statement.

What is the difference between compiler and interpreter? Give any one example (software product) that acts as an interpreter?

A) Both are similar as they achieve similar purposes, but inherently different as to how they achieve that purpose. A compiler takes programs (source) written in a programming language and ultimately translates them into object code or machine language, which can then be executed. The interpreter translates instructions one at a time, and then executes those instructions immediately. Compiled code does the work much more efficiently, because the compiler produces a complete machine language program.

Under what circumstances would you code a SELECT construct instead of IF statements?

A) I think SELECT statements are used when you are checking a series of conditions, like:
   select;
      when (physics > 60) result='pass';
      when (math > 100) result='pass';
      when (english = 50) result='pass';
      otherwise result='fail';
   end;

What is the difference between the NODUP and NODUPKEY options?

A) NODUP compares all the variables in our data set while NODUPKEY compares just the BY variables.

What's the difference between VAR A1 - A4 and VAR A1 -- A4?

A) VAR A1-A4 is a numbered range list (A1, A2, A3, A4), while VAR A1--A4 is a name range list that covers every variable stored between A1 and A4 in the data set. When A1 to A4 are stored consecutively there is no difference between the two. If you submit VAR A1---A4 instead of VAR A1-A4 or VAR A1--A4, you will see an error message in the log.

What does the RUN statement do?

A) When SAS sees RUN it starts compiling the DATA or PROC step. If you have more than one DATA step or PROC step, or if you have a PROC step following the DATA step, then you can avoid the usage of the RUN statement.

Why is SAS considered self-documenting?

A) SAS is considered self-documenting because during compilation it creates and stores all the information about the data set, such as the time and date of the data set creation, the number of variables, labels and so on, inside the data set, and you can look at that information using the PROC CONTENTS procedure.

Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.

A) Compile.

How do you control the number of observations and/or variables read or written?

FIRSTOBS and OBS options.

Approximately what date is represented by the SAS date value of 730?

31st December 1961

Identify statements whose placement in the DATA step is critical.

A: INPUT, DATA and RUN…

What do the SAS log messages "numeric values have been converted to character" mean? What are the implications?

It implies that automatic conversion took place to make character functions possible.

Why is a STOP statement needed for the POINT= option on a SET statement?

Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially.
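The POINT= and STOP interplay described above can be sketched as follows. This is a minimal illustration, assuming the SASHELP.CLASS sample data set that ships with SAS; the output data set name SAMPLE and the step size of 5 are arbitrary choices for the example.

```sas
/* Direct access: read every 5th observation of SASHELP.CLASS. */
data sample;
   do i = 1 to total by 5;
      set sashelp.class point=i nobs=total;  /* POINT= reads observation number i */
      output;
   end;
   stop;   /* no end-of-file is ever detected with POINT=, so end the step explicitly */
run;

proc print data=sample;
run;
```

The explicit OUTPUT inside the loop is needed because STOP ends the step before the implicit output at the bottom of the DATA step would happen.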

What versions of SAS have you used (on which platforms)?

SAS 8.2 in Windows and UNIX, SAS 7 and 6.12

What areas of SAS are you most interested in?

BASE, STAT, GRAPH, ETS

Briefly describe 5 ways to do a "table lookup" in SAS.

Match Merging, Direct Access, Format Tables, Arrays, PROC SQL

What are some good SAS programming practices for processing very large data sets?

A) Sort them once; can use FIRSTOBS= and OBS=.

How would you combine 3 or more tables with different structures?

A) I think sort them by the common variables and use the MERGE statement. I am not sure what you mean by different structures.

If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?

A) I would use TRANSPOSE if there are few variables and arrays if there are more. It depends.

How do you debug and test your SAS programs?

A) The first thing is to look into the log for errors or warnings, or NOTEs in some cases, or use the debugger in the SAS DATA step.

What other SAS features do you use for error trapping and data validation?

A) Check the log, and for data validation things like Proc Freq, Proc Means or sometimes Proc Print to see what the data looks like. Maybe more.

What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

A) Functions can be used inside the DATA step and on the same data set, but with PROCs you can create new data sets to output the results.

What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?

A) In unsorted data you can't use FIRST. or LAST.
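One related option worth knowing for the first.VAR / last.VAR question: when the data are grouped but the groups are not in sorted order, the NOTSORTED option of the BY statement lets FIRST. and LAST. be used without sorting. A minimal sketch follows; the data set VISITS and its variables are hypothetical and exist only for this example.

```sas
/* Hypothetical data: clinics appear in blocks, but not in alphabetical order. */
data visits;
   input clinic $ patient;
   datalines;
B 1
B 2
A 3
C 4
C 5
;
run;

data flagged;
   set visits;
   by clinic notsorted;             /* groups must be contiguous, not sorted */
   first_in_group = first.clinic;   /* 1 on the first record of each block   */
   last_in_group  = last.clinic;    /* 1 on the last record of each block    */
run;

proc print data=flagged;
run;
```

If the same clinic value reappears later in the file, NOTSORTED treats it as a new group, so this approach only suits data that are already grouped.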

What are some good SAS programming practices for processing very large data sets?

Sampling using the OBS option or subsetting, commenting the lines, using DATA _NULL_.

What are some problems you might encounter in processing missing values? In Data steps? Arithmetic? Comparisons? Functions? Classifying data?

The result of any operation with a missing value will be a missing value. Most SAS statistical procedures exclude observations with any missing variable values from an analysis.

How would you create multiple observations from a single observation?

Using the double trailing @@.

For what purpose would you use the RETAIN statement?

The RETAIN statement is used to hold the values of variables across iterations of the data step. Normally, all variables in the data step are set to missing at the start of each iteration of the data step.

What are _numeric_ and _character_ and what do they do?

They refer to all numeric and all character variables in the data set, for reading or writing.

How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?

Using PROC TRANSPOSE.

What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

Functions usually affect the existing data sets. A PROC can be used with wider scope and the results can be sent to a different data set.

If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?

Declare an array for the number of variables in the record and then use a DO loop; or use PROC TRANSPOSE with a VAR statement.

What is the order of evaluation of the comparison operators: + - * / ** ()?

(), **, *, /, +, -

How could you generate test data with no input data?

Using DATA _NULL_ and the PUT statement.

How do you debug and test your SAS programs?

Using OBS=0 and system options to trace the program execution in the log.

What is the purpose of _error_?

It has only two values, which are 1 for error and 0 for no error.

How can you put a "trace" in your program?

By using ODS TRACE ON.

How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?

Missing values will be assigned as missing in an assignment statement. Sort order treats the standard numeric missing value (.) as the second smallest value, after the underscore missing value (._).

How do you test for missing values?

Using subsetting statements like IF-THEN/ELSE, WHERE and SELECT.

How are numeric and character missing values represented internally?

Character as blank (' ') and numeric as a period (.).

Which date function advances a date, time or date/time value by a given interval?

INTNX.

In the flow of DATA step processing, what is the first action in a typical DATA step?

When you submit a DATA step, SAS processes the DATA step and then creates a new SAS data set. The compilation phase (creation of the input buffer and PDV) is followed by the execution phase.

What can you learn from the SAS log when debugging?

It will display the execution of the whole program and the logic. It will also display errors with line numbers so that you can edit the program.

What are SAS/ACCESS and SAS/CONNECT?

SAS/ACCESS only processes through databases like Oracle, SQL Server, MS Access etc. SAS/CONNECT only uses a server connection.

What are the scrubbing procedures in SAS?

Proc Sort with the NODUPKEY option, because it will eliminate the duplicate values.

What is the purpose of using the N=PS option?

The N=PS option creates a buffer in memory which is large enough to store PAGESIZE (PS) lines and enables a page to be formatted randomly prior to it being printed.

What are the new features included in the new version of SAS, i.e. SAS 9.1.3?

The main advantage of version 9 is faster execution of applications and centralized access to data and support. Many changes have been made in version 9 compared with version 8. The following are a few:
• SAS version 9 supports format names longer than 8 bytes, which is not possible with version 8. A numeric format name can be up to 32 characters and a character format name up to 31; a numeric informat name can be up to 31 characters and a character informat name up to 30. The limit is 8 in version 8.
• 3 new informats are available in version 9 to convert various date, time and datetime forms of data into a SAS date or SAS time:
  · ANYDTDTEw. converts to a SAS date value
  · ANYDTTMEw. converts to a SAS time value
  · ANYDTDTMw. converts to a SAS datetime value
• The CALL SYMPUTX routine is added in version 9. It creates a macro variable at execution time in the data step by
  · trimming leading and trailing blanks
  · automatically converting numeric values to character.
• A new ODS option (COLUMNS=) is included to create multiple columns in the output.

WHAT DIFFERENCE DID YOU FIND AMONG VERSIONS 6, 8 AND 9 OF SAS?

The SAS 9 architecture is fundamentally different from any prior version of SAS. In the SAS 9 architecture, SAS relies on a new component, the Metadata Server, to provide an information layer between the programs and the data they access. Metadata, such as security permissions for SAS libraries and where the various SAS servers are running, are maintained in a common repository.

What has been your most common programming mistake?

Missing semicolons and not checking the log after submitting a program. Not using debugging techniques and not using the FSVIEW option vigorously.

Name several ways to achieve efficiency in your program. Explain trade-offs.

Efficiency and performance strategies can be classified into 5 different areas:
· CPU time
· Data storage
· Elapsed time
· Input/Output
· Memory
CPU time and elapsed time are the baseline measurements.
A few examples of efficiency violations:
• Retaining unwanted data sets.
• Not subsetting early to eliminate unwanted records.
Efficiency-improving techniques:
• Using KEEP and DROP statements to retain only the necessary variables.
• Using macros to reduce the code.
• Using IF-THEN/ELSE statements to process data.
• Using the SQL procedure to reduce the number of programming steps.
• Using LENGTH statements to reduce the variable size and so reduce data storage.
• Using DATA _NULL_ steps for processing without creating data sets, to save data storage.

What other SAS products have you used and consider yourself proficient in using?

DATA _NULL_ statement, Proc Means, Proc Report, Proc Tabulate, Proc Freq and Proc Print, Proc Univariate etc.

What is the significance of the 'OF' in X=SUM(OF a1-a4, a6, a9);?

If we don't use OF, the argument might not be interpreted as we expect. For example, without OF the function above calculates a1 minus a4, plus a6 and a9, and not the whole sum of a1 to a4 plus a6 and a9. The same is true for the MEAN function.
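The effect of OF can be checked with a small DATA _NULL_ step. This is only an illustrative sketch; the values assigned to a1-a4, a6 and a9 are made up for the example.

```sas
data _null_;
   a1 = 10; a2 = 20; a3 = 30; a4 = 40; a6 = 5; a9 = 1;
   with_of    = sum(of a1-a4, a6, a9);   /* 10+20+30+40+5+1 = 106          */
   without_of = sum(a1-a4, a6, a9);      /* (10-40)+5+1 = -24: a1 minus a4 */
   put with_of= without_of=;
run;
```

Both statements run without any error or warning, which is what makes a forgotten OF easy to miss.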

What do the PUT and INPUT functions do?

The INPUT function converts character data values to numeric values. The PUT function converts numeric values to character values.
EX: for INPUT: INPUT(source, informat)
For PUT: PUT(source, format)
Note that the INPUT function requires an INFORMAT and the PUT function requires a FORMAT. If we omit the INPUT or the PUT function during the data conversion, SAS will detect the mismatched variables and will try an automatic character-to-numeric or numeric-to-character conversion. But sometimes this doesn't work, because characters such as a $ sign in the value prevent the conversion. Therefore it is always advisable to include INPUT and PUT functions in your programs when conversions occur.

Which date function advances a date, time or datetime value by a given interval?

INTNX: the INTNX function advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value.
Ex: INTNX(interval, start-from, number-of-increments, alignment)
INTCK: INTCK(interval, start-of-period, end-of-period) is an interval function that counts the number of intervals between two given SAS dates, times and/or datetimes.
DATDIF(sdate, edate, basis): returns the number of days between two dates.
DATETIME() returns the current date and time of day.

What do the MOD and INT functions do? What do the PAD and DIM functions do?

MOD: returns the remainder after a numeric value is divided by the modulo, which is a constant or numeric variable.
INT: returns the integer portion of a numeric value, truncating the decimal portion.
PAD: pads each record with blanks so that all data lines have the same length. It is used in the INFILE statement. It is useful only when missing data occurs at the end of the record.
DIM: returns the number of elements in an array.

TRIM: trims the trailing blanks from character values.
SCAN: returns a specified word from a character value. The SCAN function assigns a length of 200 to each target variable.
SUBSTR: extracts a substring and replaces character values.
Extraction of a substring: middleinitial=substr(middlename,1,1);
Replacing character values: substr(phone,1,3)='433';
If the SUBSTR function is on the left side of a statement, the function replaces the contents of the character variable.
CATX: concatenates character strings, removes leading and trailing blanks and inserts separators.

SCAN vs. SUBSTR:

SCAN extracts words within a value that are marked by delimiters. SUBSTR extracts a portion of the value by stating the specific location.
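The conversions and the SCAN / SUBSTR contrast above can be tried out in one short step. This is a sketch with made-up values; the variable names are hypothetical.

```sas
data _null_;
   /* INPUT: character to numeric, using an informat */
   char_amt = '1,234.50';
   num_amt  = input(char_amt, comma8.);

   /* PUT: numeric to character, using a format */
   id_num  = 734;
   id_char = put(id_num, z4.);           /* zero-padded to 4 characters */

   /* SCAN picks a word by delimiter, SUBSTR picks by position */
   fullname = 'Mary E Smith';
   first    = scan(fullname, 1, ' ');    /* first blank-delimited word  */
   initial  = substr(fullname, 6, 1);    /* 1 character from position 6 */

   put num_amt= id_char= first= initial=;
run;
```

Note that SUBSTR only finds the initial here because its position is known in advance; with names of varying length, SCAN(fullname, 2, ' ') would be the safer choice.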

RUN . the result is the remainder when the integer quotient of argument-1 is divided by argument-2. z). PUT A= X= Y= Z= . y.field2. run.It is best used when we know the exact position of the sub string to extract from a character value. When we use Dim function we would have to re –specify the stop value of an iterative DO statement if u change the dimension of the array. N = N(m . 1000 ) . y. y=4. The INT function takes only one argument and returns the integer portion of an argument. Y = MOD( A. Note that the argument can be an expression. The above program results in N = 2 (Number of non missing values) and NMISS = 1 (number of missing values). z=0. 100 ) . Z = MOD( INT( A/100 ). what does the DIM function do? DIM: It is used to return the number of elements in the array. A=123456 X=123 Y=456 Z=34 In ARRAY processing. z). data _null_. NMISS = NMISS (m .field3). DATA NEW . How might you use MOD and INT on numeric to mimic SUBSTR on character Strings? The first argument to the MOD function is a numeric. A = 123456 . truncating the decimal portion. m=. This function .. the second is a non-zero numeric. use the NMISS function. X = INT( A/1000 ) . How would you determine the number of missing or nonmissing values in computations? To determine the number of missing values that are excluded in a computation. Do you need to know if there are any missing values? Just use: missing_values=MISSING(field1.

Which one is appropriate depends upon your needs. If you have more than a couple fields. run. . How would you accomplish this in data step code? Using only PROC FORMAT.. cards. If you choose addition. you will get a missing value for the result if any of the fields are missing. run. What is the difference between: x=a+b+c+d. data new . "dd mon ccyy" if it's after 1985.? Is anyone wondering why you wouldn’t just use total=field1+field2+field3.field3). and x=SUM (of a.field3). proc print.simply returns 0 if there aren't any or 1 if there are missing values. input date ddmmyy10. First. how do you want missing values handled? The SUM function returns the sum of non-missing values. and as 'Disco Years' if it's between 1975 and 1985. proc format . Just make sure you remember the “of” and the double dashes or your code will run but you won’t get your intended results. However. If you need to know how many missing values you have then use num_missing=NMISS(field1. There is a field containing a date. you can often use shortcuts in writing the field names If your fields are not numbered sequentially but are stored in the program data vector together then you can use: total=SUM(of fielda--zfield). c . '01jan1975'd-'01JAN1985'd="Disco Years" '01JAN1985'd-high=date9. Mean is another function where the function will calculate differently than the writing out the formula if you have missing values. It needs to be displayed in the format "ddmonyy" if it's before 1975. 01/05/1955 01/09/1970 01/12/1975 19/10/1979 25/10/1982 10/10/1988 27/12/1991 . b.field2.field2. You can also find the number of non-missing values with non_missing=N (field1. value dat low-'01jan1975'd=ddmmyy10..d). there is an advantage to use the SUM function even if you want the results to be missing.

What is the difference between calculating the 'mean' using the mean function and PROC MEANS? By default PROC MEANS calculates summary statistics such as N, Mean, Std deviation, Minimum and Maximum across the observations of a data set, whereas the MEAN function computes only the mean of its arguments within a single observation.

What are some differences between PROC SUMMARY and PROC MEANS? PROC MEANS by default gives you the output in the output window; you can stop this with the NOPRINT option and send the results to a separate data set with the OUTPUT OUT= statement. PROC SUMMARY doesn't give the default output: we have to explicitly give the OUTPUT statement, and then print the data or give the PRINT option to see the result.

What is a problem with merging two data sets that have variables with the same name but different data? When both data sets contain a non-BY variable with the same name, the value from the data set listed last on the MERGE statement overwrites the other. Understanding the basic algorithm of MERGE will help you understand how the step processes. There are still a few common scenarios whose results sometimes catch users off guard. Here are a few of the most frequent 'gotchas':

1. BY variables have different lengths. It is possible to perform a MERGE when the lengths of the BY variables are different. But if the data set with the shorter version is listed first on the MERGE statement, the shorter length will be used for the length of the BY variable during the merge. Due to this shorter length, truncation occurs and unintended combinations could result. In Version 8, a warning is issued to point out this data integrity risk. The warning will be issued regardless of which data set is listed first:

  WARNING: Multiple lengths were specified for the BY variable name by input data sets. This may cause unexpected results.

Truncation can be avoided by naming the data set with the longest length for the BY variable first on the MERGE statement, but the warning message is still issued. To prevent the warning, check with PROC CONTENTS that the BY variables have the same length before combining them in the MERGE step. You can change the variable length with either a LENGTH statement in the merge DATA step prior to the MERGE statement, or by recreating the data sets to have identical lengths for the BY variables.

Note: When doing MERGE we should not have MERGE and IF-THEN statements in one data step if the IF-THEN statement involves two variables that come from two different merging data sets. If it is not completely clear when MERGE and IF-THEN can be used in one data step and when they should not be, then it is best to simply always separate them into different data steps. Following this recommendation helps ensure an error-free merge result.

Which data set is the controlling data set in the MERGE statement? In a match-merge no single data set controls the result: every BY group from every input data set appears in the output unless you subset with the IN= variables. For non-BY variables that exist in more than one data set, the data set listed last supplies the value.

How do the IN= variables improve the capability of a MERGE? What if you want to keep in the output data set of a merge only the matches (only those observations to which both input data sets contribute)? SAS will set up for you special temporary variables, called the "IN=" variables, so that you can do this and more. Here's what you have to do:
• signal to SAS on the MERGE statement that you need the IN= variables for the input data set(s)
• use the IN= variables in the data step appropriately.
So to keep only the matches in the match-merge above, ask for the IN= variables and use them:

  data three;
    merge one(in=x) two(in=y);   /* x & y are your choices of names */
    by id;                       /* for the IN= variables for data  */
    if x=1 and y=1;              /* sets one and two respectively   */
  run;

What techniques and/or PROCs do you use for tables? Proc Freq, Proc Univariate, Proc Tabulate & Proc Report.

Do you prefer PROC REPORT or PROC TABULATE? Why? I prefer to use Proc Report until I have to create cross tabulation tables, because it gives me so many options to modify the look of my table (ex: the Width option, by which we can change the width of each column in the table), whereas Proc Tabulate is unable to produce some of the things in my table. Ex: Tabulate doesn't produce n (%) in the desirable format.

How experienced are you with customized reporting and use of DATA _NULL_ features? I have very good experience in creating customized reports as well as with the Data _NULL_ step. It's a Data step that generates a report without creating a dataset, so development time can be saved. Another advantage of Data _NULL_ is that when we submit it, any compilation error in the statements is detected and written to the log, so the error can be found by checking the log after submitting. It is also used to create macro variables from the data set.

What is the difference between nodup and nodupkey options? NODUP compares all the variables in our dataset while NODUPKEY compares just the BY variables.

What is the difference between compiler and interpreter? Give any one example (software product) that act as an interpreter? Both are similar as they achieve similar purposes, but inherently different as to how they achieve that purpose. The interpreter translates instructions one at a time, and then executes those instructions immediately. A compiler takes programs (source) written in the SAS programming language, and then ultimately translates them into object code or machine language. Compiled code does the work much more efficiently, because it produces a complete machine language program, which can then be executed.

Code the table's statement for a single level frequency?

  proc freq data=lib.dataset;
    table var;   * here you can mention a single variable or multiple variables separated by spaces to get single frequencies;
  run;

What is the main difference between rename and label?
1. Label is global and rename is local, i.e. the label statement can be used either in a proc or data step whereas rename should be used only in a data step.
2. If we rename a variable, the old name will be lost, but if we label a variable its short name (old name) exists along with its descriptive name.

In the following DATA step, what is needed for 'fraction' to print to the log?

  data _null_;
    x=1/3;
    if x=.3333 then put 'fraction';
  run;
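For the 'fraction' question (x=1/3 compared with .3333), one common answer is to round the computed value before comparing it. This is a sketch of that idea rather than the only possible answer; the rounding unit of .0001 is an assumption chosen to match the four decimal places of the literal.

```sas
data _null_;
  x = 1/3;
  /* x holds 0.3333333..., so an exact test against .3333 is false */
  if x = .3333 then put 'exact comparison matched';
  /* rounding to four decimal places before comparing makes the test succeed */
  if round(x, .0001) = .3333 then put 'fraction';
run;
```

The same approach applies whenever a computed fractional value is compared with a decimal literal.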

How can u create zero observation dataset? Create a data set by using the LIKE clause. ex:

  proc sql;
    create table latha.emp like oracle.emp;
  quit;

Here the LIKE clause triggers the existing table structure to be copied to the new table. Using this method results in the creation of an empty table.

How would you code a merge that will keep only the observations that have matches from both data sets? Using the "IN" variable option. Look at the following example.

  data three;
    merge one(in=x) two(in=y);
    by id;
    if x=1 and y=1;
  run;

or

  data three;
    merge one(in=x) two(in=y);
    by id;
    if x and y;
  run;

What other SAS features do you use for error trapping and data validation? What are the validation tools in SAS?
For dataset: Data set name/debug
Data set: name/stmtchk
For macros: Options: mprint mlogic symbolgen.

How can you put a "trace" in your program? ODS TRACE ON; and ODS TRACE OFF; the trace records are written to the log.

What are input dataset and output dataset options? Input data set options are obs, firstobs, where; output data set options include compress, reuse. Both input and output dataset options include keep, drop, rename.

What is Enterprise Guide? What is the use of it? It is a point-and-click interface to SAS that, among other things, gives an approach to import text files with SAS (it comes free with Base SAS version 9.0).

Have you ever-linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself? In the editor window we write %include 'path of the sas file'; run; In a non-windowing environment there is no need to give the run statement.

How can u import .CSV file in to SAS? tell Syntax? To create a CSV file, we have to open notepad, then declare the variables.

  proc import datafile='E:\age.csv'
    out=sarath dbms=csv replace;
    getnames=yes;
  run;
  proc print data=sarath;
  run;

What is the use of Proc SQL? PROC SQL is a powerful tool in SAS, which combines the functionality of data and proc steps. PROC SQL can sort, subset, summarize, join (merge), and concatenate datasets, create new variables, and print the results or create a new dataset all in one step! PROC SQL uses fewer resources when compared to that of data and proc steps. To join files in PROC SQL it does not require sorting the data prior to merging, which is a must in a data step merge.

What is SAS GRAPH? SAS/GRAPH software creates and delivers accurate, high-impact visuals that enable decision makers to gain a quick understanding of critical business issues.

What is the difference between nodup and nodupkey options? The NODUP option checks for and eliminates duplicate observations. The NODUPKEY option checks for and eliminates duplicate observations by variable values.

Why is a STOP statement needed for the point=option on a SET statement? When you use the POINT= option, you must include a STOP statement to stop DATA step processing, programming logic that checks for an invalid value of the POINT= variable, or both. Because POINT= reads only those observations that are specified in the DO statement, SAS cannot read an end-of-file indicator as it would if the file were being read sequentially. Because reading an end-of-file indicator ends a DATA step automatically, failure to substitute another means of ending the DATA step when you use POINT= can cause the DATA step to go into a continuous loop.
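The role of STOP with POINT= can be sketched as follows. The example reads every other observation from SASHELP.CLASS, a sample table shipped with SAS; the output name every_other and the variables i and total are hypothetical names chosen for illustration.

```sas
data every_other;
  /* NOBS= is filled in at compile time, so total is usable in the DO statement */
  do i = 1 to total by 2;
    set sashelp.class point=i nobs=total;   /* direct access by observation number */
    output;
  end;
  stop;   /* POINT= never reads end-of-file, so the step must be ended explicitly */
run;
```

Removing the STOP statement would make the step start over after the loop finishes, which is the continuous loop described above.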

What is the one statement to set the criteria of data that can be coded in any step? The WHERE statement can set the criteria for any data set in a data step or a proc step.

If you're not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set? Data _null_; _NULL_ specifies that SAS does not create a data set when it executes the DATA step. Data _null_ is majorly used in
• Creating a Custom Report
• creating quick macro variables with the call symput routine

Eg. The second DATA step in this program produces a custom report and uses the _NULL_ keyword to execute the DATA step without creating a SAS data set:

  data sales;
    input dept : $10. jan feb mar;
    qtr1tot=jan+feb+mar;
    datalines;
  shoes 4344 3555 2666
  housewares 3777 4888 7999
  appliances 53111 7122 41333
  ;
  run;

  data _null_;
    set sales;
    put 'Total Quarterly Sales: ' qtr1tot dollar12.;
  run;

eg.

  Data _null_;
    Set somedata;
    Call symput('macvar', dsnvariable);
  Run;

Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself. SAS code could be linked using the GOTO or the LINK statement.
GOTO – http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201949.htm

LINK – http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201972.htm

The difference between the LINK statement and the GO TO statement is in the action of a subsequent RETURN statement. A RETURN statement after a LINK statement returns execution to the statement that follows LINK. A RETURN statement after a GO TO statement returns execution to the beginning of the DATA step, unless a LINK statement precedes GO TO, in which case execution continues with the first statement after LINK. In addition, a LINK statement is usually used with an explicit RETURN statement, whereas a GO TO statement is often used without a RETURN statement.

When your program executes a group of statements at several points in the program, using the LINK statement simplifies coding and makes program logic easier to follow. If your program executes a group of statements at only one point in the program, using DO-group logic rather than LINK-RETURN logic is simpler.

Goto eg.

  data info;
    input x;
    if 1<=x<=5 then go to add;
    put x=;
    add: sumx+x;
    datalines;
  7
  6
  323
  ;

Link Eg.

  data hydro;
    input type $ depth station $;
    /* link to label calcu: */
    if type='aluv' then link calcu;
    date=today();
    /* return to top of step */
    return;
    calcu: if station='site_1' then elevatn=6650-depth;
    else if station='site_2' then elevatn=5500-depth;
    /* return to date=today(); */
    return;
    datalines;
  aluv 523 site_1
  uppa 234 site_2
  aluv 666 site_2
  …more data lines…
  ;

How would you include common or reuse code to be processed along with your statements?
- Using SAS Macros.
- Using a %include statement

When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc? Index function – Searches a character expression for a string of characters.

SAS Statements:
  a='ABC.DEF (X=Y)';
  b='X=Y';
  x=index(a,b);
  put x;
Results:
  10

For learning purposes: The INDEXC function searches for the first occurrence of any individual character that is present within the character string, whereas the INDEX function searches for the first occurrence of the character string as a pattern.

  b='have a good day';
  x=indexc(b,'pleasant','very');
  put x;

The INDEXW function searches for strings that are words, whereas the INDEX function searches for patterns as separate words or as parts of other words. INDEXC searches for any characters that are present in the excerpts.

  s='asdf adog dog';
  p='dog ';
  x=indexw(s,p);
  put x;

If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables? Use the KEEP= dataset option (on the DATA statement or the SET statement) or a KEEP statement in a data step. eg. each of the following keeps only VAR1 to VAR5:

  Data fewdata;
    Set fulldata (keep = var1 var2 var3 var4 var5);   /* only the five variables are read */
  Run;

  Data fewdata (keep = var1 var2 var3 var4 var5);     /* all are read, only five are written */
    Set fulldata;
  Run;

  Data fewdata;
    Set fulldata;
    Keep var1 var2 var3 var4 var5;                    /* KEEP statement, same effect on output */
  Run;

Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.

  Proc sort data= Dist_County;

  By state district county;
  Run;

How would you delete duplicate observations? The noduprecs option in a Proc Sort.

  data cricket;
    input id country $9. score;
    cards;
  1 australia 342
  2 somerset 343
  1 australia 342
  2 somerset 341
  ;
  run;

  proc sort data = cricket noduprecs;
    by id;
  run;

Here in the example observations 1 and 3 are duplicate records, so Obs 1 is retained.

How would you delete observations with duplicate keys? The nodupkey option in a Proc Sort.

  proc sort data = cricket nodupkey;
    by id;
  run;

In the above example observations 1 / 3 and 2 / 4 have duplicate key (variable id) values, i.e. 1 and 2 respectively, so observations 3 / 4 get deleted.

How would you code a merge that will keep only the observations that have matches from both sets.

  data mergeddata;
    merge one(in=A) two(in=B);
    By ID;
    if A and B;
  run;

How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data.

  Data one two three;
    Merge DSN1 (in=A) DSN2 (in=B);
    By ID;
    If A and B then output one;
    If A and not B then output two;
    If not A and B then output three;
  Run;

What is the Program Data Vector (PDV)? What are its functions? PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS

language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation. Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step begins to iterate. The _ERROR_ variable signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (indicating no errors exist), or 1 (indicating that one or more errors have occurred). SAS does not write these variables to the output data set.

Does SAS 'Translate' (compile) or does it 'Interpret'? Explain. SAS compiles the code sent to the compiler. When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable.

At compile time when a SAS data set is read, what items are created? During the compile phase, SAS creates the following three items:

input buffer: a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement. Note that this buffer is created only when the DATA step reads raw data. (When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.)

program data vector (PDV): a logical area in memory where SAS builds a data set, one observation at a time. It holds the data set variables, the computed variables and the two automatic variables _N_ and _ERROR_ described in the previous answer.

descriptor information: information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names and data types (character or numeric) of the variables.
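A quick way to look at the contents of the PDV, including the automatic variables _N_ and _ERROR_, is to write it to the log with PUT _ALL_. The sketch below reads three observations from SASHELP.CLASS, a sample table shipped with SAS; the computed variable ratio is a hypothetical addition for illustration.

```sas
data _null_;
  set sashelp.class(obs=3);    /* values go straight into the PDV, no input buffer */
  ratio = weight / height;     /* hypothetical computed variable, also held in the PDV */
  put _all_;                   /* writes every PDV variable, including _ERROR_ and _N_ */
run;
```

Each iteration writes one line to the log, and _N_ increases by 1 from one line to the next.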

The Execution Phase

By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the Execution Phase of a simple DATA step is described as follows:
• The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.
• SAS sets the newly created program variables to missing in the program data vector (PDV).
• SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the program data vector. You can use an INPUT, MERGE, SET, MODIFY, or UPDATE statement to read a record.
• SAS executes any subsequent programming statements for the current record.
• At the end of the statements, an output, return, and reset occur automatically. SAS writes an observation to the SAS data set, the system automatically returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the program data vector. Note that variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
• SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
• The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.

Name statements that are recognized at compile time only? drop, keep, rename, label, format, informat, attrib, where, by, retain, length, array

Name statements that are execution only. INFILE, INPUT, Output, Call routines

Identify statements whose placement in the DATA step is critical. INPUT, DATA, RUN, CARDS, INFILE, WHERE, LABEL, SELECT, INFORMAT, FORMAT

Name statements that function at both compile and execution time. options, title, footnote

In the flow of DATA step processing, what is the first action in a typical DATA Step? The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. All the variables are assigned missing values (blank for character, . for numeric values).

What is _n_? The _N_ variable counts the number of times the DATA step begins to iterate.

It is one of the automatic data step (and not proc) variables (the other one being _ERROR_) that SAS provides in the PDV. It should be noted that _n_ does not necessarily equal the observation number in a dataset.

What SAS statements would you code to read an external raw data file to a DATA step? We use SAS statements –
FILENAME – to specify the location of the file
INFILE – identifies an external file to read with an INPUT statement
INPUT – to specify the variables that the data is identified with.

How do I convert a character variable to a numeric variable? Practically, the data type of a variable cannot be changed in one data step, but the data values could. One should create a new variable with data type numeric and assign the values of the character variable with an INPUT function, drop the character variable, and rename the numeric variable to the character variable name. Note: You would receive a warning saying that the variable has already been defined as character.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000180357.htm#a000226452

How do I convert a numeric variable to a character variable? Practically, the data type of a variable cannot be changed in one data step, but the data values could. One should create a new variable with data type character and assign the values of the numeric variable with a PUT function, drop the numeric variable, and rename the character variable to the numeric variable name. Note: You would receive a warning saying that the variable has already been defined as numeric.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000199354.htm
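The two conversions can be sketched in one small program. This is a variation on the steps described above: it uses the RENAME= data set option on the input data set so that the new variables can take over the original names. The data set names raw and converted, the variable names and the 8. and 8.1 widths are all assumptions for illustration.

```sas
data raw;                         /* hypothetical input data */
  id = '00123';                   /* character variable */
  amount = 1234.5;                /* numeric variable */
run;

data converted;
  /* rename on the way in so the original names are free to reuse */
  set raw(rename=(id=id_char amount=amount_num));
  id     = input(id_char, 8.);    /* character to numeric with an informat */
  amount = put(amount_num, 8.1);  /* numeric to character with a format */
  drop id_char amount_num;
run;

proc contents data=converted;     /* confirms the new variable types */
run;
```

Using INPUT and PUT explicitly also avoids the automatic conversion notes that SAS writes to the log when it has to convert values on its own.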

How do you read in the variables that you need? Using an INPUT statement with column / line pointers, informats and length specifiers.

Are you familiar with special input delimiters? How are they used? DLM and DSD are the special input delimiters.
DELIMITER= delimiter(s) specifies an alternate delimiter (other than a blank) to be used for LIST input.
DSD (delimiter-sensitive data) specifies that when data values are enclosed in quotation marks, delimiters within the value be treated as character data. The DSD option changes how SAS treats delimiters when you use LIST input and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value and removes quotation marks from character values.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146932.htm#a000177189

If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value? Use the MISSOVER and TRUNCOVER options.
MISSOVER prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
TRUNCOVER overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects. By default, the INPUT statement automatically reads the next input data record. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. Variables without any values assigned are set to missing.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146932.htm#a000177189

What is the difference between an informat and a format? Name three informats or formats.
INFORMAT Statement – Associates informats with variables. It's basically used in INPUT / SQL CREATE TABLE statements to read external file raw data or data that is not in a SAS format.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000178244.htm
eg: commaw., dollarw., datew., $varyingw.
FORMAT Statement – Associates formats with variables. It's basically used in data step FORMAT / SQL SELECT / procedure FORMAT statements to output SAS data to a file / report etc.

Formats can look like informats but are differentiated as to which statement they are used in. eg. Datew., Worddatew., mmddyyw.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000178212.htm

Name and describe three SAS functions that you have used, if any? The most common functions that would be used are:
Conversion functions – Input / Put / int / ceil / floor
Character functions – Scan / substr / index / Left / trim / compress / cat / catx / upcase / lowcase
Arithmetic functions – Sum / abs
Attribute info functions – Attrn / length
Dataset – open / close / exist
Directory – dexist / dopen / dclose / dcreate / dinfo
File functions – fexist / fopen / filename / fileref
SQL functions – coalesce / count / sum / mean
Date functions – date / today / datdif / datepart / datetime / intck / mdy
Array functions – dim
http://sastechies.com/SASfunctions.php

How would you code the criteria to restrict the output to be produced? In view of insufficient clarity as to what the interviewer refers to:
Global statement – options obs=
Dataset options – obs=
Proc SQL – NOPRINT option for reporting / inobs=, outobs= for SQL select
Proc datasets – NOLIST option

What is the purpose of the trailing @ and the @@? How would you use them? Line-hold specifiers keep the pointer on the current input record when
• a data record is read by more than one INPUT statement (trailing @)
• one input line has values for more than one observation (double trailing @)
• a record needs to be reread on the next iteration of the DATA step (double trailing @).
Use a single trailing @ to allow the next INPUT statement to read from the same record. Use a double trailing @ to hold a record for the next INPUT statement across iterations of the DATA step.

Normally, each INPUT statement in a DATA step reads a new data record into the input buffer. When you use a trailing @, the following occurs:
• The pointer position does not change.
• No new record is read into the input buffer.
• The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one.
SAS releases a record held by a trailing @ when
• a null INPUT statement executes: input;
• an INPUT statement without a trailing @ executes
• the next iteration of the DATA step begins.

Normally, when you use a double trailing @ (@@), the INPUT statement for the next iteration of the DATA step continues to read the same record. SAS releases the record that is held by a double trailing @
• immediately if the pointer moves past the end of the input record
• immediately if a null INPUT statement executes: input;
• when the next iteration of the DATA step begins if an INPUT statement with a single trailing @ executes later in the DATA step: input @;

A record held by the double trailing at sign (@@) is not released until
• the input pointer moves past the end of the record. Then the input pointer moves down to the next record.
• an INPUT statement without a line-hold specifier executes.

  input ID $4. @@;
  . . .
  input Department 5.;

The single trailing @
• enables the next INPUT statement to read from the same record
• releases the current record when a subsequent INPUT statement executes without a line-hold specifier.

  input ID $4. @;
  . . .
  input Department 5.;

Unlike the @@, the single @ also releases a record when control returns to the top of the DATA step for the next iteration.

Example: each record of the raw data file Data97 holds an ID followed by four quarterly sales values (the sample records are not recoverable here).

  data perm.sales97;
    infile data97 missover;
    input ID $4. @;
    do Quarter=1 to 4;
      input Sales : comma. @;
      output;
    end;
  run;

Example: the raw data file Census holds one H (household) record followed by the P (person) records that belong to it (column positions are not preserved here):

  H 321 S. MAIN ST
  P MARY E 21 F
  P WILLIAM M 23 M
  P SUSAN K 3 F
  H 324 S. MAIN ST
  P THOMAS H 79 M
  P WALTER S 46 M
  P ALICE A 42 F
  P MARYANN A 20 F
  P JOHN S 16 M
  H 325A S. MAIN ST
  P JAMES L 34 M
  P LIZA A 31 F
  H 325B S. MAIN ST
  P MARGO K 27 F
  P WILLIAM R 27 M
  P ROBERT W 1 M

  data perm.people (drop=type);
    infile census;
    retain Address;
    input type $1. @;
    if type='H' then input @3 Address $15.;
    if type='P';
    input @3 Name $10. @13 Age 3. @15 Gender $1.;
  run;

  data perm.residnts;
    infile census;
    retain Address;
    input type $1. @;
    if type='H' then do;
      if _n_ > 1 then output;
      Total=0;
      input Address $ 3-17;
    end;
    else if type='P' then total+1;
  run;

Under what circumstances would you code a SELECT construct instead of IF statements? The SELECT statement begins a SELECT group. SELECT groups contain WHEN statements that identify SAS statements that are executed when a particular condition is true. Use at least one WHEN statement in a SELECT group. An optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is met. An END statement ends a SELECT group.

Null statements that are used in WHEN statements cause SAS to recognize a condition as true without taking further action. Null statements that are used in OTHERWISE statements prevent SAS from issuing an error message when all WHEN conditions are false.

Using SELECT-WHEN improves processing efficiency and understandability in programs that need to check a series of conditions for the same variable. Use IF-THEN/ELSE statements for programs with few statements. Using a subsetting IF statement without a THEN clause could be dangerous because it would process only those records that meet the condition specified in the IF clause.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201966.htm

What statement do you code to tell SAS that it is to write to an external file?

FILENAME / FILE / PUT
The FILENAME statement is an optional statement that specifies the location of the external file.
The FILE statement specifies the current output file for PUT statements in the DATA step. When multiple FILE statements are present, the PUT statement builds and writes output lines to the file that was specified in the most recent FILE statement. If no FILE statement was specified, the PUT statement writes to the SAS log. The specified output file must be an external file, not a SAS data library, and it must be a valid access type.
PUT Statement – Writes the variable values to the external file.

If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record? Use the _infile_ option in the put statement.

  filename some 'c:\cool.dat';
  filename cool1 'c:\cool1.dat';
  data _null_;
    infile some;
    input some;
    file cool1;
    put _infile_;
  run;

Question: The dataset below shows the hospital visits by patients during the entire year (2012).

  data hospital;
    input name $ month $;
    cards;
  Tanu Jan
  Tanoj Feb
  Tanu Apr
  Tanu Dec
  Arun Oct
  Kiran Nov
  Tarun Mar
  Tarun Apr
  Tarun May
  Tarun Dec
  ;
  run;

Find out how many times each patient has visited the hospital in 2012 using proc freq, proc report, proc sql.

Answer:

  proc freq data = hospital;
    tables name/out=hosp1(drop=percent);
  run;

  data hosp2;
    set hospital;
    count=1;
  run;

  proc report data = hosp2;
    column name count;
    define name/group;
  run;

  proc sql;
    select name, sum(count) from hosp2 group by name;
  quit;

Question:

  data class;
    input name $ subject $ marks;
    cards;
  Manoj Science 94
  Raj Science 86
  Tanu Maths 76
  Manoj Maths 45
  Manoj English 65
  Tanu English 76
  Tanu Science 76
  Raj Maths 66
  Raj English 56
  ;
  run;

Use SQL to produce:
Sum as: 204 Manoj, 208 Raj, 228 Tanu
Avg Marks as: 68 Manoj, 69.33333 Raj, 76 Tanu
Use proc means to do the same.

Answer:

  proc sql;
    select name, sum(marks), avg(marks) from class group by name;
  quit;

  proc sort data = class;
    by name;
  run;

  proc means data = class sum mean;
    by name;
    var marks;
  run;
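As an alternative sketch for the PROC MEANS part of the answer, a CLASS statement gives the per-student statistics without sorting first. It assumes the class data set from the question already exists in the WORK library.

```sas
proc means data=class sum mean;
  class name;   /* CLASS groups the analysis and does not require sorted input */
  var marks;
run;
```

The BY-statement version needs the preceding PROC SORT, while the CLASS version works on the data in any order; the printed layout differs slightly between the two.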

1. What areas of SAS are you most interested in?
2. Describe 5 ways to do a "table lookup" in SAS.
3. What versions of SAS have you used (on which platforms)?
4. What are some good SAS programming practices for processing very large data sets?
5. What are some problems you might encounter in processing missing values? In Data steps? Arithmetic? Comparisons? Functions? Classifying data?
6. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
7. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
8. If you were told to create many records from one record, show how you would do this using array and with PROC TRANSPOSE?
9. What do the SAS log messages "numeric values have been converted to character" mean? What are the implications?
10. Why is a STOP statement needed for the POINT= option on a SET statement?
11. How do you control the number of observations and/or variables read or written?
12. Approximately what date is represented by the SAS date value of 730?
13. Identify statements whose placement in the DATA step is critical.
14. What does the RUN statement do?
15. Why is SAS considered self-documenting?
16. What are some good SAS programming practices for processing very large data sets?
17. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
18. If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?
19. What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
20. How do you debug and test your SAS programs?
21. What other SAS features do you use for error trapping and data validation?
22. How would you combine 3 or more tables with different structures?
23. What are _numeric_ and _character_ and what do they do?
24. For what purpose would you use the RETAIN statement?
25. How could you generate test data with no input data?
26. How do you debug and test your SAS programs?
27. What can you learn from the SAS log when debugging?
28. What is the purpose of _error_?
29. How do you test for missing values?
30. In the flow of DATA step processing, what is the first action in a typical DATA Step?
31. What are SAS/ACCESS and SAS/CONNECT?
32. What is the one statement to set the criteria of data that can be coded in any step?
33. What is the purpose of using the N=PS option?
34. What are the scrubbing procedures in SAS?
35. What are the new features included in the new version of SAS i.e., SAS9.1.3?
36. What do the PUT and INPUT functions do?
37. Which date function advances a date, time or datetime value by a given interval?
38. How might you use MOD and INT on numeric to mimic SUBSTR on character Strings?
39. In ARRAY processing, what does the DIM function do?
40. How would you determine the number of missing or nonmissing values in computations?
41. Do you need to know if there are any missing values?
42. What are the validation tools in SAS?
43. How can you put a "trace" in your program?
44. What are input dataset and output dataset options?
45. What is SAS GRAPH?
