Learn Logo
Home    |    www.ExamScam.com    |    Leave Us A Message    |    SCJA Java Certification Site    |    Free Mock Cert Exams    |    Free Multimedia Tutorials   

pulpJavaNetwork


Support this site!
(And bill it to your boss)

Buy it now on Amazon!



Search Now:
Google

Database Connection Pooling

 

Session management is on of the most mission critical services the application server provides to our applications.  Managing user centric data on the server, and tying that data to a web based client, makes it possible to create a Servlet and JSP application that maintains meaningful state over an Http network.

From a performance perspective, the session management features of a J2EE application server do not come for free.  In fact, poorly managed sessions, or lazy programmers placing far too much information into an HttpSession can ground your applications to a halt.

Poorly managed HttpSessions are the second most common bottleneck in a J2EE application.  What is the most common performance bottleneck in J2EE applications, you wonder?  Bar none, poorly managed database connections represent the biggest performance bottleneck in a J2EE application.

What is the best way to address the performance problem associated with database connection? The best solution is to use a connection pool, and manage that connection pool effectively.


What is a DataSource?

From a J2EE perspective, a DataSource is just another name for a pool of database connections. For us, a DataSource is a connection pool.

Why do we need to use DataSources?

Connecting to a database, maintaining that connection, and then subsequently tearing that connection down is a highly resource intensive activity. Interacting with a database is one of the most significant performance cost your applications will have.

Performance specialist say that 80% of all performance gains can be achieved by fine tuning your use of database connections.

Of course, the same group of people also say that 80% of all performance gains are achieved by fine tuning the type of data you place into your HttpSessions, so you’ve got to take these types of statements with a grain of salt.

The point remains. Connecting to a database has a high cost associated with it.

 

With a typical J2EE application, almost every user is going to interact with your database at some point in time, and those interactions are likely going to be very short and to the point.  The client will grab data from your database, surf your site for a while, and then maybe save some data to your database.  But each client will by no means need a permanent or extended connection to your data.

The Database Connectivity Problem

Now here’s the problem: Setting up and tearing down database connections is extremely resource intensive. 

Instead of creating a new connection every time a client needs access to database, a developer should set up a collection of connections. One hundred client may be accessing our site, but we may be able to service all one hundred clients with only ten database connections. To do this, we set up a pool of semi-permanent database connections, and when a client needs a connection, we just give then one out of the collection.

Rather than creating and destroying a database connection every time a client wants to access our database, we just create a small pool of connections, and grab a connection from the pool when a client needs one. 

Connection pooling is incredibly efficient, and it frees up a bunch of resources that can then be used to address all of the other bottlenecks in your application.

How do I create a connection pool?

A few years ago, if you wanted to  pool your database connections, you had to either write your own code, or download some code off of the internet that implemented pooling. You then used that connection pooling code in your applications.

Of course, databases are very complicated things; far too complicated for us lowly Java developers to try and write connection pooling code for. As a result, the custom pooling code, or even the code that may have been downloaded off the internet, usually resulted in stale connections, and all sorts of cryptic SQLExceptions.

To address both the need for connection pooling, and the need for a good implementation of connection pooling, open source projects like Struts and Cocoon built connection pooling right into their framework. These implementations usually work pretty well.

Of course, not everybody is using Struts and Cocoon, but every application certainly needs them, so the J2EE community brought the concept of connection pooling into the specification. Now, all J2EE compliant application servers must implement a connection pooling service that can be created and configured by an administrator, and subsequently accessed programmatically through the JDBC extensions API.

I am using WebSphere and Struts. Should I use the pooling facilities of Struts or WebSphere?

While open source projects such as Cocoon and Struts provide connection pooling facilities, if you are using WebSphere, you really should be using the connection pooling facilities of WebSphere.

First of all, connection pooling is a significant competition point as far as J2EE application servers go, and a big part of having the best J2EE application server on the market is having a very efficient connection pooling mechanism. IBM puts a lot of work into making their connection pools as efficient as possible.

Secondly, if you are using a DB2 database, IBM has built all sorts of wizardry and magic into their DB2 connection pools. Believe it or not, but IBM knows a lot more about DB2 than you, me, or even those clever cucumbers at open source projects like Struts and Cocoon.

Thirdly, WebSphere monitoring tools such as Resource Analyzer can track WebSphere connection pools in much more detail than they can custom connection pools. Monitoring various properties of connection pools is pivotal in optimizing their configuration.

Fourthly, if you have a problem with your WebSphere connection pool, you can get fast, quality support from IBM. This won’t likely be the case with open source connection pools.

Finally, you paid for WebSphere, so why not use it? IBM works pretty darn hard to get their connection pooling to work right. Take advantage of all of that hard work.

What is a JDBC provider?

In order for WebSphere to connect to a database, it needs some software provided by the software vendor that makes that database . This database connection software is known as a database driver.

IBM provides a set of database drivers that makes it possible to connect to db2. These database drivers are zipped up in a file named db2java.zip. Oracle and Microsoft also provide drivers that make connecting to their databases possible.

A JDBC provider states the name of your database driver, the type of database to which that driver connects, and where that driver is stored on your file system.

Before you can create a connection pool, you must first register a JDBC provider through which WebSphere can establish connections to your database.

 

What properties can I configure for a connection pool?

Administratively setting up a connection pool, and having your developers simply access that connection pool simply isn’t good enough. If you want to optimize performance, you must tweak your connection pool. Here’s a look at those tweakable parameters:

Minimum Connections:

This setting is the lower limit on the number of connections that will exist in your pool.  This is also the number of connections that will be created when the server starts up for the first time. Keep this number as low as possible, since lazy initialization, which implies initializing resources only when they are first needed, is a preferred design practice.

Maximum Connections:

This is the maximum number of connections that can ever exist in the pool.

You may have a license restriction on the number of connections your database can maintain. In that case, you maximum number of connections might equal the number of licenses you have.

From a performance standpoint, you should have done some type of load testing on your database, and from this testing, come to a conclusion about what the maximum number of connections is that your backend can maintain before a significant degradation in performance is observed. The maximum connections parameter should then be set to this upper performance limit. After all, what’s the point of supporting one more user, if it destroys the experience of the thousand users already using your site?

Developers should prepare for handling the exceptions that get thrown with the connection limit has been reached. When this happens, new users who require a database connection should be given a polite message about traffic being excessive. Politely ask them to come back later, and they will.

Unused Timeout:

After a connection has splashed around in the pool for a certain amount of time without actually being used, it’s obvious that the connection isn’t needed anymore. The unused timeout setting dictates how long a connection can fool around in the pool before it’s told to towel off and go home.

Aged Timeout:

Aged timeout is how long a connection can exist before it’s removed from the pool. The connection may be perfectly healthy when it reaches its aged timeout, but it will be destroyed regardless.

Quite often, firewalls and proxies will have a maximum amount of time that a connection can exist through them.  Setting the aged timeout to be less than the firewall or proxy timeout will help you avoid getting mysterious stale connections.

Reap Time:

Reap time represents how often the connection pooling thread, that times out old connections, and removes unused connections from the pool, will run.

Connection Timeout:

If all the connections in the pool are in use, the connection timeout represents how long the DataSource will let a client wait for a connection to become available.  If this time limit is hit, a ConnectionTimeoutException, a subclass of SQLException, is unapologetically thrown back to the client.

Purge Policy

When a stale connection, or fatal connection error happens, you have the option of terminating the failing connection, or purging every database connection in the pool.

 

 

When accessing a connection pool, my developers are providing unique usernames and passwords. How does this effect pooling?

This completely defeats the process of database pooling.

All connections in a pool must have the same access rights associated with them.  If you create a new connection with a different username and password, it will exist in its own separate pool. If every connection is given its own username and password, each connection is a pool unto itself.

Be careful when using usernames and passwords to create DataSources. To track database access, using a logging mechanism would be a better option than matching each connection with a username and password.

How does a connection pool work?

When your server starts up, the connection pool process creates a number of connections to your database; this is based upon the minimum number of connections specified in the DataSource properties window.

Each time a client asks for a connection, the DataSource gives the client one from the pool. When a client is done with the connection, the connection is returned to the pool, rather than being destroyed.

If the number of connections in use equals the number of connections in the pool, the DataSource throws a new connection into the pool, just in case a client makes a request for one.  The DataSource will keep throwing connections into the pool until the number of connections in use stabilizes, or the maximum number of connections allowed is hit.

If the maximum number of connections is hit, and a client asks for another connection, the client will wait the number of seconds specified by the wait timeout property of the DataSource. If none of the existing connections are returned to the pool in that time, the client will be thrown a ConnectionWaitTimeoutException.

Every once in a while, as specified by the reap time, the DataSource will poll each of the connections and ask them how old they are and how long it has been since they were last used by a client.  If any connections have either hit the aged timeout, or the unused timeout, those connections are removed from the pool.

If database traffic dies down, connections will continue to be removed from the pool, due to the unused timeout, until the minimum number of connections is met. The number of connections will remain at that minimum number until database activity increases.

This is the ying and the yang associated with managing database connections through a DataSource.

Home    |    www.ExamScam.com    |    Leave Us A Message    |    SCJA Java Certification Site    |    Free Mock Cert Exams    |    Free Multimedia Tutorials   
sbs
eXTReMe Tracker