You are on page 1of 2

Connection Pooling and Sizing of Connection Pools

A question that does not get asked as often as it should is what is the best size for a connection pool to a database. However, the answer to that question can have a major impact the response times your users experience as well as the throughput of the system. Luckily for us, a number of people on the internet have provided a simple answer to that question for different technologies It Depends!. http://dev.mysql.com/tech-resources/articles/connection_pooling_with_connectorj.html http://searchdomino.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid4_gci851755 ,00.html http://stackoverflow.com/questions/405352/mysql-connection-pooling-question-is-it-worth-it So, let us talk about what are the factors to take into consideration when planning the optimal size of the connection pool. 1. A connection pool that is too small means that queries have to wait to get a connection before they execute. This delay can be quite large in seconds for a loaded system 2. A very large connection pool, on the other hand has its own problems o First of all, connection pools are fairly heavy in terms of memory usage. This will impact the memory OBIEE server uses, which may or may not be a concern depending on whether you are on a 64-bit platform, and the available RAM. o Secondly, if you have a large connection pool, all queries will be sent on to the database. Your database box and/or disk will see a performance degradation beyond their capacity. That means you need two pieces of information to correctly size the connection pool the database capacity for parallelism and the expected concurrency. The former typically depends on the database you run, the machine specs, and the operating system. Your DBA will be able to help determine this based on his past experience. Think of this as the supply of connections. Here is a little spreadsheet to help determine the demand for connections. The numbers here are, of course, made up for illustration only: Name of Factor No of Named Users % logged on at peak times No. of reports on a dashboard Number Cumulative Effect 5000 10% 10 5000 500 5000 714

% time running a dash board 14% For example, if a dash board

runs in 5 seconds, and think time between runs is 30s this is (5)/(5+30) Cache Miss Ratio This will 80% be high for highly customized reports Number of Logical Queries/logical SQL 3 571

1713

This number above approximately 1700 gives you a number of sizing guidelines. For example, if your database will gracefully handle only about 600 connections/queries at a time then you know you need approximately 3 (1700/600) databases to handle this load correctly. Also, if you find for example due to memory issues that an OBIEE instance will only scale up to a connection pool of 200 you know you will need a cluster of about 8 (1700/200) instances of OBIEE to handle this load. Of course, you can also play with the the numbers above to try and get the required connection pool size down. For example, if by creating better aggregates, you are able to get the dashboard running time from 5s to 1s, this can be 5x reduction in the number of required connections.