• Performance and Scalability Explained
• Verifying Site Performance and Scalability
• Verification Phases
• Common Pitfalls
Performance and scalability are top concerns cited by information technology (IT) managers as they prepare to deploy new Internet and intranet applications for their organizations. But often it is not apparent what the performance and scalability requirements for a Web site should be or how performance and scalability needs should be verified.
The ability of a Web system to deliver content to an ever-changing number of simultaneous Web users is one of its most important features. Building and delivering a Web system that can service its customers in a timely fashion is critical to the survival of a business Web site. A Web system that feels slow may give the customer a negative impression of the business and may give that customer cause to seek out a competing system. Beyond the subjective feel of the system is the capability of the site to carry out the requests of the user in a timely fashion, with optimal use of system resources. This article will address the following topics:
• Evaluation strategy, which entails gathering data on the site’s performance and scalability behavior
• Verification phases, or the steps required for conducting essential performance and scalability testing
Web system performance and scalability are closely related. As such, they must be measured together in order to obtain an accurate picture of the Web system’s capability to service users under various conditions. To properly grasp the importance of scalability, it is first necessary to understand the implications of performance on a Web system, and that requires understanding the terms that apply to Web system performance and scalability.
Performance: Web system performance can be described from two perspectives. To an end user, response time is the basic measure used to judge the quality of a Web site’s performance. Administrators, on the other hand, are concerned not only with response time but also with the site’s resource utiliza-
number of users increases, owing to higher levels of resource utilization on the system’s servers and network. Response time can also be affected by factors not related to user load, such as database size and poor software implementation. Web system end users typically perceive response time to be the amount of time taken from the moment they click the mouse to the moment that a new Web page has been fully displayed on the screen. Based on this perceived time, users may judge system performance as too sluggish.
Scalability: Web system scalability is the ability to add computing resources to a site in order to obtain acceptable or improved response time, stability, and throughput under a particular load. In this context, load refers to the number of users accessing the site at the same time. As more users access the site, the site’s servers will use more of their CPU, input/output (I/O), and memory resources to handle the load. Eventually, one or more of these resources will become saturated, meaning that the system cannot efficiently process all the requests and must force some to wait for processing. In most cases, the computer’s CPU will be the first component to become saturated. The end result of a saturated server resource is an increase in response time. Scaling allows the site to cope with additional load by providing more resources to process requests. The tiers, or groups of servers, of a typical Web system can be scaled as follows (Microsoft 1999b):
• Vertical scaling is achieved by upgrading or replacing a server with more powerful components or an entirely new, larger server. Vertical scaling can usually be achieved with little or no software modification. Although minor server upgrades, such as a faster processor or more memory, are not very costly, the cost of large-scale system replacement can be astronomical.
• Horizontal scaling refers to the ability to add more servers to a Web system configuration. This usually requires the site’s architecture to support such scaling, as software issues may arise when multiple machines are used.
• Functional scaling involves the separation of groups of functions, such as catalog browsing and purchasing operations, onto different groups of servers, allowing for more accurate horizontal and vertical scaling techniques.
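As a rough illustration of horizontal scaling, the sketch below estimates how many servers a tier might need for a target load. It rests on two assumptions that are not from the article: that capacity grows linearly with the number of servers (which, as noted above, the site’s architecture may not support), and that some spare capacity is held back as headroom. The function name and its parameters are hypothetical.

```python
import math


def servers_needed(target_users, users_per_server, headroom=0.0):
    """Estimate the servers a tier needs for `target_users` concurrent users.

    Assumes (hypothetically) that capacity scales linearly with server count.
    `headroom` is the fraction of each server's capacity kept in reserve.
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_per_server = users_per_server * (1.0 - headroom)
    # Round up: a fraction of a server still means one more machine.
    return max(1, math.ceil(target_users / usable_per_server))


# Illustrative numbers only: 1,000 users on servers measured at 300 users each.
print(servers_needed(1000, 300))
```

A measured users-per-server figure from load testing is a better input than a guess, and the linear assumption should itself be checked by testing more than one server count.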
Performance and scalability requirements are used to judge whether the site will perform properly under various conditions of load. These requirements are used as a basis for determining whether the site is capable of meeting the expectations of the system’s customer base. Such requirements are also used to support scalability and cost
• Response time: A major performance gauge for a Web site. Many factors contribute to a Web site’s response time, and some are outside the Web site’s control, such as the speed of an Internet connection for a user. Because many Internet users use modems, the response time may need to be adjusted for modem speed. For example, an acceptable response time goal for a 56K modem user might be 6 seconds, whereas the goal for a user accessing the Internet via a T1 line might be 2 seconds.
• Required number of concurrent users: The ability to support a large number of concurrent users with little or no degradation in response time. Quantifying the appropriate number of users is difficult, but such measures can be derived by examining similar sites, performing market research, or, possibly, looking at existing non-Web products. When a Web site has already gone live, statistics can be obtained from the Web server logs to determine typical usage patterns.
• Cost: The number of servers and the administration time required (Menasce and Almeida 1998, p. 68). When such costs are too high, architectural changes or component optimization must be considered.
• Normal versus peak: The effect of these figures on the three previous factors. For example, a site may experience a normal user load of 300 concurrent users, but that load will at times climb as high as 1,000 users. With that load, it may be acceptable for the Web system to experience a small degradation in response time.
• Degradation under stress: The specific degradation that occurs when system load capacity is exceeded. For example, how many users get partial, or broken, pages under this condition? Measures collected might indicate, for example, that 5 percent of system users get incomplete Web pages under a load of 1,000 concurrent users versus 10 percent when load is increased to 1,500 simultaneous users. Additionally, the Web site’s stability should be evaluated to make sure that server processes do not crash or corrupt data while under various levels of stress.
• Reliability: A Web system’s performance following long use versus that during its first 24 hours of use. This type of requirement statement defines the time period for which a Web site must perform at certain response time levels in order to be viewed as reliable for production use. For example, the duration may be defined as a week, during which time the performance numbers and stability measurements should be relatively constant. The definition of a reliability requirement should take into account such factors as regular maintenance intervals that will restart the site’s machines.
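Requirements of this kind are easier to verify when they are written down as explicit numbers. The following is a minimal sketch, not a prescribed format: the class names, field names, and the `evaluate` function are hypothetical. The user counts (300 normal, 1,000 peak) and the 2-second goal echo the examples above, while the 3-second peak goal and the tolerated percentage of incomplete pages are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LoadRequirement:
    """Limits that apply at one load level (hypothetical structure)."""
    name: str
    concurrent_users: int
    max_response_s: float      # response time goal, in seconds
    max_incomplete_pct: float  # tolerated share of incomplete pages, in percent


@dataclass(frozen=True)
class Measurement:
    """What a test run actually observed at a given load."""
    concurrent_users: int
    response_s: float
    incomplete_pct: float


def evaluate(req: LoadRequirement, m: Measurement) -> List[str]:
    """Return a list of violations; an empty list means the requirement is met."""
    problems = []
    if m.concurrent_users < req.concurrent_users:
        problems.append(
            f"{req.name}: tested {m.concurrent_users} users, "
            f"requirement is {req.concurrent_users}")
    if m.response_s > req.max_response_s:
        problems.append(
            f"{req.name}: response time {m.response_s} s exceeds "
            f"goal of {req.max_response_s} s")
    if m.incomplete_pct > req.max_incomplete_pct:
        problems.append(
            f"{req.name}: {m.incomplete_pct}% incomplete pages exceeds "
            f"limit of {req.max_incomplete_pct}%")
    return problems


# Normal load keeps the strict goal; peak load tolerates a small degradation.
NORMAL = LoadRequirement("normal", 300, 2.0, 0.0)
PEAK = LoadRequirement("peak", 1000, 3.0, 5.0)

for line in evaluate(PEAK, Measurement(1000, 3.4, 5.0)):
    print(line)
```

Keeping normal and peak load as separate requirement records makes the accepted degradation at peak an explicit, reviewable number rather than an informal expectation.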
The goal of performance and scalability testing is to monitor and report on the site’s behavior under various load conditions. This data will be used later to analyze the state of the Web site and to plan for growth based on expectations of additional load. This data will also allow costs associated with the projected growth to be calculated, based on the required capacities and performance of the site. Formal performance and scalability tests are typically conducted at the end of a development iteration, after the functional tests and any corrections have been performed, because unresolved functional problems may alter the results of the performance tests. It is best practice to conduct informal performance monitoring throughout the development effort, whereas the formal tests are used to validate whether performance-related requirements have been satisfied.
In order to ensure accurate test execution and results gathering, an automated testing tool should be used to execute the performance tests. It’s almost impossible to conduct performance testing without the use of tools, which can simulate thousands of simultaneous users. Many tools are available for this purpose. Test tools can greatly assist in the creation of test scripts and the monitoring of end user response times. In addition, load-testing tools typically have a large number of options, including think time and connection speed, to more accurately simulate end user interaction with the system.
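To make the idea of simulated users and think time concrete, here is a toy sketch of what such a tool does internally. It is not a substitute for a real load-testing tool: each virtual user is a thread, the request is a stand-in function that merely sleeps, and all names (`run_virtual_users`, `fake_request`, the parameters) are hypothetical.

```python
import random
import threading
import time


def run_virtual_users(request_fn, users, requests_per_user,
                      think_time_s=(0.0, 0.0), seed=0):
    """Run `users` concurrent sessions and return every response time in seconds.

    Each session calls `request_fn` repeatedly and pauses for a random
    think time, drawn from the `think_time_s` (min, max) range, after each call.
    """
    timings = []
    lock = threading.Lock()

    def session(user_id):
        rng = random.Random(seed + user_id)  # per-user generator for repeatable pauses
        for _ in range(requests_per_user):
            start = time.perf_counter()
            request_fn()
            elapsed = time.perf_counter() - start
            with lock:
                timings.append(elapsed)
            time.sleep(rng.uniform(*think_time_s))  # simulated think time

    threads = [threading.Thread(target=session, args=(i,)) for i in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings


def fake_request():
    """Stand-in for a real page request; it only sleeps for 5 ms."""
    time.sleep(0.005)


timings = run_virtual_users(fake_request, users=5, requests_per_user=3,
                            think_time_s=(0.0, 0.002))
print(f"{len(timings)} requests, slowest {max(timings):.4f} s")
```

Think time matters because users who pause between clicks generate far fewer requests per second than the same number of users issuing requests back to back, so omitting it overstates the load each simulated user represents.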
• Base performance testing determines the response time and server resource usage of each system function (use case) individually under optimal system conditions. This type of test is performed with only one user, to uncover any immediate performance issues with components in the use case. If poor results are recorded for a use case during the base performance test, it is almost certain to display problems during load tests. When performance problems are identified during base performance testing, the component’s developer will typically need to investigate the problem and correct any components prior to the execution of any load tests using those components.
• Load behavior is one of the most important areas of analysis. The goal of load testing is to simulate real-world usage to determine response time and server resource use, allowing the calculation of a maximum number of site users per machine. To simulate real users, scripts are created that tie together many common user actions into virtual sessions. Bottlenecks in the system will usually become apparent during this type of test. Load testing is also performed with single operations, or use cases, in order to help locate performance issues with specific components under load. Load testing is performed incrementally, with a fixed number of users added per increment. As users are added, the response time and server resource use values will increase. Obtaining these performance measures helps facilitate planning, as the maximum number of site users supported by the system may not be acceptable for future needs. Also, analysis may indicate that the site needs to be scaled in order to accommodate defined performance requirements.
• Stress testing, too, consists of simulating access to a Web site by multiple users. Stress testing, however, seeks to determine the behavior of the system once it reaches load limits, when the server can no longer cope with the load. When system load approaches its threshold, the system may reject users or return incomplete pages, or components and services may crash. Most Web sites strive for graceful degradation under load, through the action of simply rejecting users instead of bringing the entire site down. Stress testing can help determine when the system should initiate such corrective action.
• Reliability testing is used to verify that no hidden opportunities for failure exist. Memory leaks, disk file issues, or database transaction log size are problems that may surface only after the system has been running for long periods of time (Rational Software Corporation 1999).
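The incremental load test described above yields one response-time figure per load increment, from which a maximum supported user count can be read off. The sketch below shows one simple way to do that, assuming the results are available as (users, average response time) pairs. The function name and the sample figures are hypothetical, not measurements from any real site.

```python
def max_users_within_goal(results, response_goal_s):
    """Return the highest load that still met the response time goal.

    `results` holds (concurrent_users, avg_response_s) pairs, one per load
    increment. Scanning stops at the first increment that misses the goal,
    so a lucky result at a higher load is not counted. Returns None if even
    the smallest load missed the goal.
    """
    supported = None
    for users, response_s in sorted(results):
        if response_s > response_goal_s:
            break
        supported = users
    return supported


# Illustrative data: 100 users added per increment on a single machine.
sample_results = [(100, 0.8), (200, 1.1), (300, 1.7), (400, 2.6), (500, 4.9)]
print(max_users_within_goal(sample_results, response_goal_s=2.0))
```

The same scan can be repeated per use case to see which component reaches its limit first, and per server configuration to see how the supported load changes as hardware is added.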
formance and scalability measures for a Web site, load and stress tests should be run using various configurations of server hardware and with various numbers of servers at each tier. Consider, for example, using two Web servers, one application server, and one database server; single-processor and multiprocessor Web servers; and separate Web and application server machines versus a collocated Web/application server. It is important to test the system using the base configuration in order to assist in determining scalability. A base configuration should generally include the minimum number of servers required for the site to function. For example, a site’s base configuration may consist of one Web server; one application server, if used; and one database server. Measures of performance obtained under this configuration are useful in scalability analysis, as the measures collectively provide a base
• Database size. It is vital to execute each type of test using multiple database sizes, in order to determine how they impact system performance and whether any database schema or configuration changes are necessary. Schema design, database configuration options, and the use of indexes can have a significant performance impact when operating on a table with a large number