Improved Session Tracking
September 22, 2006
I recently discovered a better way to handle session tracking in web applications while dealing with complaints from the users of our application about session interference problems. Session tracking refers to the process of reassociating session data stored on the server with an incoming HTTP request. I will give an overview of existing session tracking strategies and explain their drawbacks. Following that discussion, I will present my solution.
A web container can use a couple of methods to associate an HTTP session with a sequence of user requests, all of which involve passing an identifier between the client and the server. Java servers typically use a token named JSESSIONID, while PHP uses PHPSESSID. The identifier can be maintained on the client as a cookie (known as cookie-based sessions) or the web container can include the identifier in every URL that is returned to the client (known as URL rewriting or encoding). Cookie-based sessions are particularly problematic with the Mozilla line of browsers since cookies which are set to expire at the end of the browsing session (session-scoped cookies) are shared amongst all windows and tabs for a single user profile. Internet Explorer (IE) segregates each window as a different browser session. Since IE doesn't have tabs to consider (at least prior to IE7), it doesn't have this isolation problem with session-scoped cookies. For browsers that do share session-scoped cookies, it is not possible to have multiple HTTP sessions without them interfering with each other.
When the application uses cookie-based sessions, depending on how session data is used, the user may see unexpected behavior when attempting to establish a new session in a separate window (or tab). A sample use case would be when the user wants to create a new session using different login credentials. If the application does not check for the presence of the existing session when the user tries to establish this new session, instead allowing the normal authentication logic to execute, the user may end up with a schizophrenic user principal that has a cocktail of roles and user information. Another approach the application could take is to automatically terminate the outstanding session, which leaves the original window (or tab) in an expired state, only to be discovered the next time the user activates a link or form in that window (or tab). In this case, the two tabs are stealing sessions from one another. The last method the application could take in handling the request is to restrict access to the login page if there is an outstanding session, preventing the user from establishing a second login all together. While this third scenario is the safest bet, none of these options are ideal for the user.
The alternative to cookie-based sessions is URL rewriting. In this strategy, every link or form is processed by the application, which inserts a session token into the URL, either as a query string parameter (PHP) or as a proprietary URL syntax (Java). Multiple tabs and windows no longer pose a problem with concurrent sessions because all the information necessary to reestablish the HTTP session during subsequent requests is stored in the URL, and hence completely isolated. If a request is made using a URL without this token, the application creates a brand new HTTP session and rewrites all the URLs sent in the response with this new identifier. While this strategy may sound like a sure winner over cookie-based session tracking, it has one rather severe security flaw. Since the information that identifies the user's session is now part of the URL, the session is completely portable. The URL can be copied from the browser location bar into and e-mail and sent over the internet to a recipient, who can then copy that URL into a browser window and access the session. You don't have to be a security expert to realize the risk in this scenario. This example doesn't even consider the less obvious risk of picking off session tokens in the URL sent over an insecure network with a traffic sniffer. Using this strategy out of the box is not recommended for anything other than development.
Fortunately, there is a third approach that would allow the use of URL rewriting in a secure manner. (To ensure maximum security in general, SSL should always be used). The trick is to lock a session to its originator. However, none of the request headers sent by the browser can satisfy the requirement of distinguishing one browser from all others. The most logical candidate, the remote IP address, does not work since it can be shared by users behind a firewall or proxy. After thinking about the Urchin Tracking Module (UTM) used by Google Analytics, it finally dawned on me that it is possible to uniquely identify a browser by using a cookie to assign a differentiating visitor identifier. This concept is similar to a MAC address for an ethernet card. The very first time that the application's URL is requested by a browser, interceptor code (for instance a servlet filter) calculates a long random string (the more random the better) and assigns this value to a permanent cookie in the browser. To make the cookie permanent, set it to expire several years in the future. The interceptor code also places this token into the newly created session to signify its originator. In Java, this is done by consulting the isNew() method on the HttpSession object and only binding the originator token to the session if this flag is true. On all subsequent requests for that session identifier, the value in the cookie is compared to the originator token in the session. If the two differ, the user is redirected appropriately, either to an error page or by stripping the session identifier from the URL, forcing the user to create a new session and thus login once again.
Even though a cookie is used in this strategy, it is a permanent one, and it is intended to be shared between tabs and windows. If you look at your browser's cookies, you will see that Google Statistics uses this approach to uniquely identify your browser so that it may track unique page hits with a cookie named __utma.
As a final note, it may be necessary to ensure that the JSESSIONID cookie is removed from the client in order for the servlet container to prefer the URL rewriting approach.