Monday, May 7, 2007

Storing PHP Sessions in a Database

There are many reasons to utilize sessions when creating a web-based application using PHP. Session information, by default, is stored in a file on your web server. But what if that becomes a problem? In this article, I'll talk about why you might want to move your PHP sessions to a database, and show you how to do it.

So you've finished your largest project yet, a robust order-taking system for a very successful company. To give the best user experience possible, you made heavy use of sessions in order to keep information about the customer handy, from page to page, as any good PHP programmer would. You deliver the site to your customer and it goes live, to great fanfare. Traffic is slow at first, but as the customers begin to use it, traffic picks up and in a few months the site is serving over 5 million hits a day, between shoppers, search engines, and actual customers. The web server is dying under the stress, and the company turns to you to figure out how to increase their bandwidth.

A quick analysis of the database shows you that its usage is quite low. It's the actual load on the web server that's causing the issue. Your code is tight and uses caching where available (as is obvious by the low load on the database). The problem seems to be the sheer traffic on your web server. How do you cope?

The above problem is one we should all be so lucky to face: being overly popular. If your website is as wildly successful as the one in the above example, it means you should have the resources available to address this kind of issue.

And just how do you fix this problem? In our example, let's address the problem by adding web server(s). It doesn't matter if you round-robin them, or add them to a true load balancer. But adding additional servers will allow the load to be split among more than one machine, allowing each of them to serve data in a more efficient fashion.

This will solve the immediate load problem. But what will it do to your application?

For those of you who use sessions but have never written an application for multi-server distribution, you may be surprised to know that your sessions will fail miserably if left alone. Why is that?

Sessions, by default, write to a temporary file on the web server. So if you choose to store data in your session (a user's name, for example), it is available on any page just by reading from the session. This works great, until you bring more servers into the equation.

Think about this for a moment. Let's say you have three web servers, all with the same website on them. Furthermore, these web servers are set up as a round robin. A round robin means that when a new request comes in, it's handed to the next server in the series. So with three web servers, requests would be handled in the order of "1, 2, 3, 1, 2, 3, etc." This means that, as a visitor is surfing your site, they are potentially visiting different servers in the same session. With an intelligent load balancer, the handling of connections is not as crude, but it is still possible for a user to visit a different server with each click of the mouse.

So now, let's take the users on your site. As they click through your web pages, they will be moving from server to server. So if you saved something to a session variable while you were on server 1, it would not be available to you if your next click took you to server 3. This doesn't mean that you coded your application incorrectly, it merely means you need to reconsider the session configuration.

With more than one server hosting the same website, your options narrow. If you wish to keep using disk space to store your session information, then all of your web servers need to mount the same share, so they all have access to the file. Another option, and the one we are going to explore in better detail, is to store your session inside a database instead of on disk. This way, your session information is available no matter which web server you are on.

Luckily, PHP has a built-in ability to override its default session handling. The function session_set_save_handler() lets the programmer specify which functions should actually be called when it is time to read or write session information.

s I just said, the session_set_save_handler() function will allow us to override the default method of storing data. According to the documentation, here is the format for this function:

bool session_set_save_handler ( callback $open, callback $close, callback $read, callback $write, callback $destroy, callback $gc );

In our example, I'm going to create a session class that can be used to store information to the database instead of to a file.

For starters, create a new file called "sessions.php." Inside this file, put the following code:

class SessionManager {

var $life_time;

function SessionManager() {

// Read the maxlifetime setting from PHP
$this->life_time = get_cfg_var("session.gc_maxlifetime");

// Register this object as the session handler
session_set_save_handler(
array( &$this, "open" ),
array( &$this, "close" ),
array( &$this, "read" ),
array( &$this, "write"),
array( &$this, "destroy"),
array( &$this, "gc" )
);

}

}

In the above example, the SessionManager() class and its constructor are created. You will notice that instead of merely passing function names into the session_set_save_handler() function, I sent arrays allowing me to identify class methods as the intercepts for the session actions.

On the previous page, when we called session_set_save_handler() to override the session handling functions, the first two arguments passed were for the open and close logic. Let's add these methods to our new class, then take a closer look.

function open( $save_path, $session_name ) {

global $sess_save_path;

$sess_save_path = $save_path;

// Don't need to do anything. Just return TRUE.

return true;

}

function close() {

return true;

}

The above code should be added inside the SessionManager class we started on the previous page.

Taking a close look, you will see that, for the most part, we didn't do anything in the open and close methods. That would be because we will be writing our information to a database. In this lesson, I am assuming that the application you are adding this to ALREADY has an open database connection to your database. If this is the case, then the above code is good as it is. If not, then the open function would need to include code for creating your database connection, and the close would close said connection.

If we were writing a new file-handling session manager, the open function would handle opening the file descriptor. The close function would then close said file descriptor, to prevent data from being lost.

So far, so good. But before we get to the point where we are actually writing the session information into the database, we need some place to put it. In your application database, you will need to create a "sessions" table to store the session information. Here's the one I defined, in MySQL create format:

CREATE TABLE `sessions` (

`session_id` varchar(100) NOT NULL default '',

`session_data` text NOT NULL,

`expires` int(11) NOT NULL default '0',

PRIMARY KEY (`session_id`)

) TYPE=MyISAM;

I would definitely suggest adding this table to your application's database. Not only does it keep it together with everything else, but it also makes it possible to share the same database connections for your sessions and your application itself.

Okay. We've intercepted PHP's session handling logic, and are systematically replacing it with our own. After the open and close, the next to areas to address are the read and write methods.

Let's first take a look at the read method:

function read( $id ) {

// Set empty result
$data = '';

// Fetch session data from the selected database

$time = time();

$newid = mysql_real_escape_string($id);
$sql = "SELECT `session_data` FROM `sessions` WHERE
`session_id` = '$newid' AND `expires` > $time";

$rs = db_query($sql);
$a = db_num_rows($rs);

if($a > 0) {
$row = db_fetch_assoc($rs);
$data = $row['session_data'];
}

return $data;

}

In the above example, you will see that I used functions called db_query(), db_num_rows(), and db_fetch_assoc(). These functions are from the application I wrote this class for.

But a close look will show you that when the function is called, the unique session identifier is passed along with it. I then query the database to see if I can find a record for that session that has not expired. If successful, you return the data to the calling program.

Now take a look at the code to write the data.

function write( $id, $data ) {

// Build query
$time = time() + $this->life_time;

$newid = mysql_real_escape_string($id);
$newdata = mysql_real_escape_string($data);

$sql = "REPLACE `sessions`
(`session_id`,`session_data`,`expires`) VALUES('$newid',
'$newdata', $time)";

$rs = db_query($sql);

return TRUE;

}

In the above example, you see that the write function is passed the unique session identifier, as well as the data to save to the database. One thing to note is what we are doing with the time. We grab the current time, then add to it the number of seconds that were defined in the constructor as lifetime. So basically, each time the data is written, we reset the timeout. So if your system is configured to expire sessions after 20 minutes of inactivity, this code supports it.

You will also notice that, when writing the database, we utilize the replace function instead of an insert. Replace works exactly like an insert if the record already exists, or only updates it.

And assuming all went well with the update, we return true.

You can't properly have your own session handler without writing code to clean up after yourself. The last two methods for our class will be the destroy and gc (garbage collection) methods.

When the session_destroy() function is called, it triggers a call to our destroy method. Here's a look at the code:

function destroy( $id ) {

// Build query
$newid = mysql_real_escape_string($id);
$sql = "DELETE FROM `sessions` WHERE `session_id` =
'$newid'";

db_query($sql);

return TRUE;

}

The above logic is fairly straightforward. The destroy method is called, passing the unique session identifier. We then make a call to delete the record. Boom, session information no longer exists.

Periodically, PHP will trigger a garbage collection routine, meant to handle the conditions where people left before the system got a chance to clean up their sessions. Here's the code:

function gc() {

// Garbage Collection

// Build DELETE query. Delete all records who have
passed the expiration time
$sql = 'DELETE FROM `sessions` WHERE `expires` <>

db_query($sql);

// Always return TRUE
return true;

}

In the above query, we delete all records that are expired, and SHOULD have been deleted by the destroy method, but the method was apparently never called. This garbage collection could be considered "self-maintenance," and stops you from needing to write a cronjob to keep your sessions table clean.

On the last few pages, we've talked about a bunch of little pieces that make up the sessions class. I thought I would take a minute and give you the complete class in one shot.

class SessionManager {

var $life_time;

function SessionManager() {

// Read the maxlifetime setting from PHP
$this->life_time = get_cfg_var("session.gc_maxlifetime");

// Register this object as the session handler
session_set_save_handler(
array( &$this, "open" ),
array( &$this, "close" ),
array( &$this, "read" ),
array( &$this, "write"),
array( &$this, "destroy"),
array( &$this, "gc" )
);

}

function open( $save_path, $session_name ) {

global $sess_save_path;

$sess_save_path = $save_path;

// Don't need to do anything. Just return TRUE.

return true;

}

function close() {

return true;

}

function read( $id ) {

// Set empty result
$data = '';

// Fetch session data from the selected database

$time = time();

$newid = mysql_real_escape_string($id);
$sql = "SELECT `session_data` FROM `sessions` WHERE
`session_id` = '$newid' AND `expires` > $time";

$rs = db_query($sql);
$a = db_num_rows($rs);

if($a > 0) {
$row = db_fetch_assoc($rs);
$data = $row['session_data'];

}

return $data;

}

function write( $id, $data ) {

// Build query
$time = time() + $this->life_time;

$newid = mysql_real_escape_string($id);
$newdata = mysql_real_escape_string($data);

$sql = "REPLACE `sessions`
(`session_id`,`session_data`,`expires`) VALUES('$newid',
'$newdata', $time)";

$rs = db_query($sql);

return TRUE;

}

function destroy( $id ) {

// Build query
$newid = mysql_real_escape_string($id);
$sql = "DELETE FROM `sessions` WHERE `session_id` =
'$newid'";

db_query($sql);

return TRUE;

}

function gc() {

// Garbage Collection

// Build DELETE query. Delete all records who have passed
the expiration time
$sql = 'DELETE FROM `sessions` WHERE `expires` <>

db_query($sql);

// Always return TRUE
return true;

}

}

Throughout this article, I've talked about possible reasons for wanting your session information in a database. I've also showed you how to create a sessions class that would replace the built-in PHP session handling. Now I just need to show you how to implement it.

Let's go back to our example from the beginning of the article. You have an application built that uses session information heavily. Most likely, you have common functions and logic broken out into include files, including the database connection and session starting code. If you actually code the session_start() function in each file where you lose it, you'll have a bunch more work to do.

In any case, here's the code to add to your application in order to make PHP use your class instead. Find your session_start() code, and make it look like this:

require_once("sessions.php");
$sess = new SessionManager();
session_start();

So in the above example, we require the code. This adds the class declaration into memory. With the second line, we invoke the SessionManager() class, which in turn reworks the session_set_save_handler to call class functions rather than the default ones. And finally, the session_start() function is drawn, which begins using the new DB class right away.

For most structured code, the hardest part of the scenario would be to replace the db_() calls with calls from your own application's db api. Once your class is created it takes mere seconds to implement it to your code. Copy that code across all of your web servers, and they instantly begin using the database instead of files to read/write as default.

One last warning before I go. Depending on how your code is structured, PHP does not always automatically save any session data. To be certain you are retaining your session data all the time, be sure to call the session_write_close() function at the end of each page.

I hope the information I've shared with you during this article has, at a minimum, helped you understand how and why you would approach moving session data from files to a database. Whether you use the example code I've supplied, or create your own from scratch, its been my goal to educate you on the whys and hows, so you can make the best decisions for your own applications.