thinking about sessions and databases
#1

[eluser]jbowman[/eluser]
I've been thinking about sessions and databases, and I think I've come up with a solution that works well for both performance and security.

I'm trying to keep database queries down to a minimum, but I want to have session data that includes information I won't trust coming from a user-supplied cookie. I also want the site to be able to scale horizontally across web servers, which means I need to keep the session information in a shared resource, such as a database.

I think the answer is this: user sessions get stored in the database and cached as a file on the web server. The session process would then work like this (a rough sketch in code follows the list):

Check for a session id supplied by the browser.
If the session id does not exist, start a new session.
If the session id does exist, check the local file cache for the session.
If the cache file does not exist, check the database.
If the session exists in the database, create a cache file and continue.
If the session does not exist in the database, then start a new session.
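
To make the flow concrete, here's a rough sketch of it in PHP. The helper functions read_session_from_db() and start_new_session() are hypothetical placeholders, as is the cache path:

[code]
<?php
// Sketch of the lookup flow above. read_session_from_db() and
// start_new_session() are hypothetical placeholders.
function load_session()
{
    $session_id = isset($_COOKIE['session_id']) ? $_COOKIE['session_id'] : null;

    // 1. No session id supplied by the browser: start a new session.
    if ($session_id === null) {
        return start_new_session();
    }

    // 2. Check the local file cache for the session.
    $cache_file = '/tmp/sessions/' . md5($session_id);
    if (is_readable($cache_file)) {
        return unserialize(file_get_contents($cache_file));
    }

    // 3. Cache miss: check the shared database.
    $session = read_session_from_db($session_id);
    if ($session !== false) {
        // Recreate the local cache so later requests on this server
        // never touch the database.
        file_put_contents($cache_file, serialize($session));
        return $session;
    }

    // 4. Unknown id: start a new session.
    return start_new_session();
}

// Hypothetical stubs; real versions would hit the db and set cookies.
function start_new_session()       { return array('id' => md5(uniqid(rand(), true))); }
function read_session_from_db($id) { return false; }
[/code]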

This way, if you have multiple web servers and load balancing moves the user to a different server for some reason (I'd rather not get into the persistence features of NLBs, since web servers themselves can go down), creating the new cache file costs only a single database request for the session.

Any thoughts or comments? I'm currently using NGSession and may try to add this functionality once I get further into the development of my site, if no one else beats me to it.
#2

[eluser]webthink[/eluser]
I suppose the question here becomes: do the cycles you save from reading a file locally outweigh the cycles you spend actually writing the file? Keep in mind as well that any time the session is updated, you'll have to write to the central session share and either rewrite the local file or delete it. You'd only really get a handle on the true potential of this solution by knowing your application really well, i.e. knowing exactly how often userdata gets written. Also keep in mind that the number of writes to the cache increases the more often users are moved from one server to another. And if they are moved back to a server they were on previously, they'll pick up their old session cache, which may contain stale data, so you'll need a mechanism for ensuring they always have the most recent data.

Additionally, your solution could be made faster by storing the cache in memory on the local machine, using memcached or a memory table rather than files.

If you want to see whether your solution will save you any processor time, you could work it out something like this (keep in mind the actual numbers I'm using are completely made up):

x = ((n1 - n2) - ((n3 * p1) + (n3 * p2))) * r1
where:
x - time saved
n1 - amount of time it takes to read from centralized session db (let's say 2).
n2 - amount of time it takes to read from local file cache (let's say 1).
n3 - amount of time it takes to write to local file cache (let's say 1.5).
p1 - probability of userdata being updated per request (let's say 0.5).
p2 - probability of user being moved from one appserver to another (let's say 0.25).
r1 - total number of requests (we'll just do a single request for now so 1).
so:
x = ((2 - 1) - ((1.5 * 0.5) + (1.5 * 0.25))) * 1
x = -0.125

Those numbers were just plucked out of the air, but they do illustrate how you could actually slow down your application with a solution like this. Of course, those numbers could easily be juggled to show great time savings. The point is that it only works as a solution if it matches your application.
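
For what it's worth, the arithmetic is easy to sanity-check in a few lines, using the same made-up numbers:

[code]
<?php
// Plugging the made-up numbers into the formula above.
$n1 = 2;    // read from centralized session db
$n2 = 1;    // read from local file cache
$n3 = 1.5;  // write to local file cache
$p1 = 0.5;  // probability of a userdata update per request
$p2 = 0.25; // probability of being moved to another appserver
$r1 = 1;    // total number of requests

$x = (($n1 - $n2) - (($n3 * $p1) + ($n3 * $p2))) * $r1;
echo $x; // -0.125: with these numbers the cache costs more than it saves
[/code]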


Sidebar: this is kind of off topic, but along the same lines of improving the efficiency of the session library. One easy change that can and should be made to NGSession is to change the set_userdata method to store the userdata in memory and only write the update to the db once, in the destructor. Additionally, check the serialized data against the new data, and if nothing has changed, write nothing at all. Something like the sketch below.
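
A minimal sketch of the idea, with stubbed db helpers standing in for NGSession's real queries (this is not NGSession's actual code):

[code]
<?php
// Sketch of the deferred-write idea: set_userdata() only touches memory,
// and the destructor writes to the db once, and only if the serialized
// data actually changed.
class Deferred_session
{
    private $userdata = array();
    private $original = ''; // serialized snapshot as loaded from the db

    public function __construct()
    {
        $this->userdata = $this->read_from_db();
        $this->original = serialize($this->userdata);
    }

    public function set_userdata($key, $value)
    {
        // No db hit here; just update the in-memory copy.
        $this->userdata[$key] = $value;
    }

    public function __destruct()
    {
        $current = serialize($this->userdata);
        if ($current !== $this->original) {
            $this->write_to_db($current); // one write, and only if dirty
        }
    }

    // Stubs: real versions would SELECT/UPDATE the sessions table.
    private function read_from_db()     { return array(); }
    private function write_to_db($data) { /* UPDATE sessions ... */ }
}
[/code]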
#3

[eluser]jbowman[/eluser]
The validity of the cache file when users are bounced from server to server is an issue I hadn't thought of, thank you. I'll need a workaround for that. Every write, whether to the database or the local file, is going to be necessary no matter what. In the end, you'll find much better performance writing locally to a file.
#4

[eluser]webthink[/eluser]
My point was that you'll have to write the local file (or replace it) *as well as* writing to your centralized session db. Each time you do this, the savings you incurred by reading locally are lost to some extent.
#5

[eluser]jbowman[/eluser]
I had a longer reply, but I think we're basically agreeing. You're coming at it from the angle that if you have a lot of writes to your session data, then all performance gains would be lost. I'm seeing session data as something you write to rarely, so the performance gains of never hitting the database would be huge. I heartily agree that if you have a lot of writing to your session data, the net gain of my solution would be very small and could actually have a negative impact on performance. In that instance, yes, something like memcached or at least some basic query caching would make more sense.
#6

[eluser]webthink[/eluser]
Yeah, exactly. That's why I provided the formula: every write takes away some incremental performance gain, but it might be outweighed by a low probability of moving users from server to server or by really fast local read speeds. Even in a system with lots of writes, you could end up with a faster system by caching locally if the numbers added up.

The reason I mentioned memcached was actually to complement your solution: you could implement it just as you've described, writing to a local cache, but gain even more benefit by writing to memory rather than to files. Same goes for writing to a local MySQL memory table, with the added advantage that the code to write to and read from your cache looks just like the code to write to and read from your central session db. A rough sketch of the memcached variant follows.
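
Here's a minimal sketch of that local in-memory cache using the PECL Memcached extension (the key prefix and TTL are made-up values):

[code]
<?php
// Sketch of the local cache layer backed by memory instead of files,
// using the PECL Memcached extension.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211); // memcached running on this appserver

function cache_get(Memcached $mc, $session_id)
{
    // Returns the cached session array, or false on a miss.
    return $mc->get('sess_' . $session_id);
}

function cache_set(Memcached $mc, $session_id, array $session, $ttl = 7200)
{
    // Expires automatically after $ttl seconds.
    $mc->set('sess_' . $session_id, $session, $ttl);
}
[/code]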

Anyway we're definitely on the same page. I'd like to see your solution implemented.
#7

[eluser]jbowman[/eluser]
I'm writing my site right now, and I'm on as much of a roll as I can be writing at home with a 7-month-old daughter, so I won't stop to write this solution immediately. I'll come back to it. Basically, the change I'm going to make is to try to make sure it supports more than just a local file for the cache, and instead make it more API-oriented so things like memcached can be implemented.
#8

[eluser]jbowman[/eluser]
Figured out a solution for the variance in server cache files: the browser will need an additional cookie, session_version. session_id can then be used for identification, and session_version can be used to validate the cache, with the version being incremented at each write to the session information. A sketch of the check is below.
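
In code, the validation step might look something like this (a sketch only; the cache-file layout is an assumption, with the cookie names from above):

[code]
<?php
// Sketch of the version check: the local cache file is trusted only if
// the version stored inside it matches the session_version cookie.
// The cache-file layout is an assumption.
function cache_is_valid($cache_file)
{
    if (!is_readable($cache_file)) {
        return false;
    }

    $session = unserialize(file_get_contents($cache_file));
    $cookie_version = isset($_COOKIE['session_version'])
        ? (int) $_COOKIE['session_version'] : 0;

    // A stale file from an earlier visit to this server fails the
    // check, forcing a re-read from the central db.
    return isset($session['version'])
        && (int) $session['version'] === $cookie_version;
}

// On every session write: increment $session['version'], write the db
// and the local cache, then setcookie('session_version', $session['version']);
[/code]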
#9

[eluser]Daniel Eriksson[/eluser]
An easier solution to this is to simply use the ability of PHP's native sessions to persist to a database (instead of a local file). If you are worried about DB performance, then use an in-memory database engine (the session data will be queried on every page load, afaik).
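
For reference, that hookup is done with session_set_save_handler(). A minimal sketch, with placeholder sess_db_* helpers standing in for the real queries (the SQL each would run is noted in the comments):

[code]
<?php
// Minimal sketch of persisting native PHP sessions to a database via
// session_set_save_handler(). The sess_db_* helpers are placeholders.
function sess_db_select($id)      { return false; } // SELECT data FROM sessions WHERE id = ?
function sess_db_replace($id, $d) { return true; }  // REPLACE INTO sessions (id, data) VALUES (?, ?)
function sess_db_delete($id)      { return true; }  // DELETE FROM sessions WHERE id = ?
function sess_db_purge($maxlife)  { return true; }  // DELETE FROM sessions WHERE last_touched < NOW() - ?

function sess_open($save_path, $name) { return true; }
function sess_close()                 { return true; }

function sess_read($id)
{
    $data = sess_db_select($id);
    return ($data !== false) ? $data : ''; // read() must return a string
}

function sess_write($id, $data) { return sess_db_replace($id, $data); }
function sess_destroy($id)      { return sess_db_delete($id); }
function sess_gc($maxlifetime)  { return sess_db_purge($maxlifetime); }

// Must be registered before session_start().
session_set_save_handler('sess_open', 'sess_close', 'sess_read',
                         'sess_write', 'sess_destroy', 'sess_gc');
session_start();
[/code]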

It provides both security and scalability, and adds no extra complexity to your code (apart from hooking the session up to persist to a DB).

/Daniel
#10

[eluser]Sean Murphy[/eluser]
Unless you're using a shared disk like NFS across all of your application servers, the more servers you have, the less efficient file caching will become.

Depending on the importance of what you're storing in your sessions, using only memcached may be an option. It would be faster than any combination of DB or file storage, and it scales much better. The only catch is that if one of your memcached servers goes down, you have to be okay with losing all of its session data. A shopping cart would be one case where that wouldn't be acceptable.



