NESUG 2010

Pharmaceutical Applications

The 5 Most Important Clinical SAS® Programming Validation Steps
Brian C. Shilling, Octagon Research Solutions, Wayne, PA

The validation of a SAS programmer's work is of the utmost importance in the pharmaceutical industry. Because the industry is governed by federal laws, SAS programmers are bound by a very strict set of rules and regulations. Reporting accuracy is crucial as these data represent people and their lives. This presentation will give the 5 most important concepts of SAS programming validation that can be instantly applied to everyday programming efforts.

Knowing the Regulations
There are several sets of rules and regulations that are required to be followed while analyzing and presenting clinical data. Knowing this group of regulations is very important, and at times a legal requirement.

HIPAA
The Health Insurance Portability and Accountability Act was put into place in 1996 to provide rights and protection for participants and beneficiaries in group health plans. HIPAA has little to no impact on your day-to-day work as a programmer but it is important to understand that the law exists and to have a general idea of its purpose. In simple terms, HIPAA serves to protect a subject's identifying information.

The Code of Federal Regulations
Title 21 of the Code of Federal Regulations, or CFR, pertains to food and drugs. Chapter 1 pertains to those components and identifies the Food and Drug Administration (FDA) and the Department of Health and Human Services (DHHS). Part 11 of this regulation is what pertains to you as a programmer. This part specifically addresses electronic records and electronic signatures. There are numerous topics within Title 21 that directly (Part 11 and Part 820) or indirectly (Part 50) affect programming. While you don't need to read each of these, it is helpful to understand what parts of the clinical trial and programming process are driven by these rules.
International Conference on Harmonisation of Technical Requirements
In a global setting, it is important for all international parties involved in the drug development process to follow a standard set of definitions for similar concepts and a common understanding for how drugs should be developed. The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) is a global organization that provides these common definitions and guidelines and is often the source for standard values for certain data. Again, these requirements may not impact your programming responsibilities directly but they are part of the framework that built the studies and the specifications you work with regularly.


What is Validation
Validation is defined as “an act, process or instance of determining the degree of well-groundedness or justifiability: being at once relevant and meaningful.…”1 Validation, a very in-depth process, is the act of proving that the outcome of a program accurately and effectively represents the original source clinical study data. This takes on many forms and involves a number of procedures. It is a justification of the means used to accomplish the outcome of the program and its accurate representation of the original data. This justification requires more effort and granular detail to prove that the data manipulation and analyses were performed correctly and appropriately.

Presenting Correct Information
Important decisions on subject health care are made based on the output generated by programmers. It is essential that all data from a clinical trial are presented as accurately as possible so that decisions can be made appropriately. If incorrect information is presented as fact, it could be worse than useless; it could actually be harmful. While there is constant pressure in the industry to produce output faster, if that output is incorrect, people's lives could be at risk.

Methods
There are generally two ways to approach validation of output: separate, independent programming and peer review. Each method has its strengths and weaknesses and there is no one correct way. Careful, diligent validation is the best way to detect and prevent errors.

Having a Plan
It is critical to have a plan before you begin any task in programming. Often it is as simple as being clear about the task you need to perform (e.g., create a summary table) and gathering all of the sources of information you'll need to perform the task.

Make the Code Do the Work
One of the best ways to be efficient is to make the program itself do as much of the checking for you as possible. If the calculations in your program assume that data exist or that they are consistent, programming “data traps” in your code to print any cases that do not meet your assumptions is often helpful. Unless timelines prohibit it, programming the data listing before programming the table gives you one more chance to look at the data. This helps not only to validate the analysis data set but the original data that went into it as well.

Documentation
This validation list outlines the basic requirements with which to begin. This list will grow as different tasks and types of data are discussed.
– Clean Logs
– Adequate, accurate notations
– Output looks like the specification
– Output content matches the specification

– Output content matches similar content on other output
– Output content makes sense

Validation Techniques
The following are several techniques that can be employed to aid in the process of validating the programming used in clinical trial data reporting:

Procedures
There is a wide variety of procedures provided within SAS that help do everything from sorting data to generating complex statistical analysis. While most people tend to think of the more complex procedures and how to use them, it is the simple procedures that are most helpful for validation of data and logic: PROC FREQ, PROC MEANS, and PROC PRINT.

Logs
One of the most important and simplest steps to making validation easier is to start with a clean log. This means that your log should not only be free of errors, but it should be free of warnings and “helpful” notes.

Keeping/Dropping data
By keeping only the variables that you need, you will cut down on run times as well as create a smaller, more manageable set of data.

SAS Options and Language Elements
Use the SAS options to your advantage: MPRINT, SYMBOLGEN, and MLOGIC.

Macros
The general rule for truly efficient programming is to use macros only when they add significantly to the process. Simple code that is used repeatedly throughout many programs is more appropriate for macro usage.

Familiarity with Data
There are many types of data that are common across clinical trials. These data types usually consist of subject characteristics and indications of the subject's general state of health at a given time. When validating programs that reference clinical trial data, it is important to understand what makes sense for each data type to ensure that the methods used to validate the program are appropriate and that the output created makes sense. Even the most detailed specifications are no substitute for understanding what potential data issues to look out for and how data interrelate. This is the knowledge that employers are looking for when asking for pharmaceutical experience: they know that an in-depth understanding of the data is the critical element to effective programming and validation in the pharmaceutical industry.

In the process of creating analyses and summary reports in support of a CSR, analysis data sets are often created first to facilitate the creation of those reports.
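Several of the techniques described above (keeping only the variables you need, programming a “data trap”, and checking results with the simple procedures) can be illustrated together. The following is a minimal sketch rather than code from the paper; the data set WORK.VITALS, its variables, and the BMI derivation are all hypothetical and were invented only for this example.

```sas
/* Hypothetical demo data, invented for this sketch */
data work.vitals;
   input usubjid $ visitnum weight_kg height_cm sitecode $;
   datalines;
S001 1 70.5 175 A
S002 1 .    162 B
S003 1 82.0 0   A
;
run;

/* Simple procedure check: how are the categorical values distributed? */
title "Check of site codes in the source data";
proc freq data=work.vitals;
   tables sitecode / missing;
run;

/* Keep only the variables the derivation needs, and trap any      */
/* records that break the assumptions behind the BMI calculation.  */
data work.bmi work.trap(drop=bmi);
   set work.vitals(keep=usubjid visitnum weight_kg height_cm);
   if missing(weight_kg) or missing(height_cm) or height_cm <= 0 then
      output work.trap;               /* assumption violated: do not derive */
   else do;
      bmi = weight_kg / (height_cm / 100)**2;
      output work.bmi;
   end;
run;

/* Print the trapped records so they are reviewed, not silently lost */
title "Data trap: records failing the BMI assumptions";
proc print data=work.trap;
run;

/* Simple procedure check: is the derived value in a plausible range? */
title "Summary of derived BMI";
proc means data=work.bmi n nmiss min max mean;
   var bmi;
run;
title;
```

With this made-up data, subjects S002 (missing weight) and S003 (height of zero) should land in WORK.TRAP rather than producing a missing-value or division-by-zero note in the log. Routing such records to their own data set is one design choice among several; writing them to the log with PUT statements would serve the same purpose.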

Once those analysis data sets are created, there are often additional data manipulations and summary statistics generated in the process of creating the final TLF (tables, listings, and figures) output. In essence these analyses are the final product that will be added to the Clinical Trial Report (CTR) and used to make statements and conclusions about the safety and efficacy of the drug or device being studied. The validation of these summary reports and data listings often constitutes the bulk of the validation effort done for a project. Throughout all of these tasks, the data being worked with needs to be considered in both the programming and validation process. Regardless of whether you are manipulating data to create a permanently stored analysis data set or manipulating data just to create a report, it is important to understand the data so you can validate the result accurately and completely. Different types of data are expected to have different structures and the content is expected to behave in different ways.

Reporting and Statistics
For all of the seeming variation in the data collected and the methods of collection in clinical trials, the reports generated on much of these data are surprisingly similar. Subject demographics, relevant medical history, physical exam findings, laboratory test results, and many others are usually reported using the same summary statistics. In many cases, the programming involves reporting the summary statistics from PROC FREQ and/or PROC MEANS (PROC UNIVARIATE). The techniques for summarizing the data and putting it together for the report are the same regardless of the output file type (text files, RTF, PDF, etc.) or layout (portrait or landscape, data in rows vs. columns, etc.). No matter what type of output you are creating or the data with which you create it, there are general principles that apply to validating different categories of output (tables, listings, figures).

Pre-Output Validation Steps
One of the key elements of the validation process is the review of SAS code and SAS logs. While the final output is the ultimate product being validated, the code itself needs to make sense and the log needs to be free of errors and warnings. As mentioned earlier, there are generally two approaches to validation: independent programming and peer review. Before either of these approaches comes into play, the programmer who is responsible for generating the final, “production” output must validate his or her own work. Regardless of the data being analyzed or reported and regardless of whether you are responsible for the production output or for validating someone else's output, starting validation with the code and the log will increase the probability that the final product will be accurate and correct.

Code Review
It is critical that code be easy to read and contain enough comments to allow easy understanding of what is being done (and sometimes why). It is good practice to review your own code after it is written to make sure that the comments make sense and are sufficient to explain what is being done. Regardless of whether you are reviewing your own code or your peer's as part of the validation process, code review (reviewing the .SAS file) is an important step and the final code should meet the following criteria:

1) Is the code readable and understandable?
2) Are there sufficient comments such that another programmer could read and understand what is happening?
3) Is there a logical and reasonable flow to the program?
4) Does the code make sense in relation to the specifications? Is it reasonable to assume that the final outcome being created from this code would be accurate?
5) Are there any logic flaws or weaknesses where the code could fail?
6) Does the code adhere to the company standards?

It is important that the code be reviewed with these questions in mind. If the code itself is able to pass these criteria favorably, it will be less likely to have major issues with the final output and any minor issues will be much easier to trace.

Log Review
Log review is another important step in the validation process. Each and every SAS log should be scanned and reviewed for SAS notes, warnings and errors. Any messages of concern can be cause for speculation around the accuracy of the final output. You might want to consider looking for, at the very least, messages that start with the following keywords:

ERROR
WARNING
INFO: Character
INFO: The variable
NOTE: At least
NOTE: Character
NOTE: Division
NOTE: Mathematical
NOTE: Merge
NOTE: Missing
NOTE: NOSPOOL
NOTE: Numeric
NOTE: Variable

While the warning and error messages are clear problems, the notes listed above are often more subtle indications that the data or the code is not behaving as expected. While SAS is able to continue processing the data, it may not be doing what you really intended. It is critical to understand what each of these notes means and ensure that the results are what you intended. Once you know the source of the note, even if SAS handles the data correctly, it is recommended that you adjust your code such that these notes do not appear. This way, if code needs to be run in the future these notes will not be cause for concern.

In addition to simply checking for notes and warnings, it is important to follow the number of observations from one step to another. It is possible for code to execute with no notes or warnings, but due to issues in the data or unexpected problems with the code logic, the final result is unexpected. In many cases this will be evinced in the number of observations being different than expected.
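The log review described above can be partly automated. The following is a minimal sketch, not the author's method, of a DATA step that reads a saved log, flags lines beginning with the keywords listed above, and also pulls out the lines that report observation counts so they can be followed from step to step. The file name demo.log, the data set WORK.LOG_REVIEW, and the sample log lines are assumptions made for illustration; the sample lines are modeled on common SAS messages but were typed in by hand.

```sas
/* Write a small, made-up log file to the WORK directory so the sketch */
/* is self-contained. In practice LOGFILE would point at a real log.   */
%let logfile = %sysfunc(pathname(work))/demo.log;

data _null_;
   file "&logfile";
   put "NOTE: There were 120 observations read from the data set WORK.DM.";
   put "NOTE: MERGE statement has more than one data set with repeats of BY values.";
   put "NOTE: Missing values were generated as a result of performing an operation on missing values.";
   put "NOTE: The data set WORK.ADSL has 125 observations and 12 variables.";
run;

/* Scan the log: flag keyword messages and observation-count notes */
data work.log_review;
   infile "&logfile" truncover;
   input logline $char200.;
   length category $10;
   array kw[13] $20 _temporary_
      ("ERROR" "WARNING" "INFO: Character" "INFO: The variable"
       "NOTE: At least" "NOTE: Character" "NOTE: Division"
       "NOTE: Mathematical" "NOTE: Merge" "NOTE: Missing"
       "NOTE: NOSPOOL" "NOTE: Numeric" "NOTE: Variable");
   lineno = _n_;
   category = " ";
   /* A line is a MESSAGE if it starts with any keyword, ignoring case */
   do i = 1 to dim(kw);
      if find(logline, strip(kw[i]), "i") = 1 then category = "MESSAGE";
   end;
   /* Otherwise keep lines that report observation counts */
   if category = " " and find(logline, " observations", "i") > 0 then
      category = "OBSCOUNT";
   if category ne " " then output;
   drop i;
run;

title "Log lines needing review";
proc print data=work.log_review noobs;
   var lineno category logline;
run;
title;
```

A scan like this only narrows down where to look. Each flagged note still has to be understood, and the observation counts still have to be compared against what the program logic should produce.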

Prior to reviewing the final output, it is critical to review the log to make sure the number of observations flowing into and out of each data step or procedure makes logical sense. Chances are that if the number of observations doesn't make sense, the final output won't make sense either.

Conclusion
The clear and accurate representation of clinical trial data is crucial. Careful and complete validation of clinical datasets, summary statistics and other reports generated by programmers is absolutely critical in proving that the results are accurate. Using these methods to validate various types of data and output will enable you to confidently deliver a validated, high-quality product that accurately represents the clinical study data.

References
1 www.m-w.com/cgi-bin/dictionary?book=Dictionary&va=valid (Merriam-Webster’s Online Dictionary)

Acknowledgements
I would like to thank my employer Octagon Research Solutions Inc. for allowing me the time to develop and present this paper. In addition, I would like to thank my co-author Carol Matthews for helping me write our book and being able to share it with the industry.

Contact Information
Your comments and questions are valued and encouraged. Please feel free to contact the author at:
Brian C. Shilling
Octagon Research Solutions Inc.
585 E. Swedesford Road Suite 200
Wayne, PA 19087
E-mail: bshilling@octagonresearch.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
