16,777,216 File Cache Problem?!
#1

[eluser]taggedzi[/eluser]
Greetings.

I am having a problem that I do not think most people will encounter. My "/system/cache" folder keeps filling up. I'm talking about 16k files in 2 days at over 30 KB each, which is over 500 MB. (I'm afraid to see what happens if I leave it alone for weeks... I'd probably hit the server's per-folder file count limit eventually...?)

Here is my dilemma:

I am dynamically generating color palettes via PHP from a given input color. This is somewhat CPU intensive (not horribly so, but given enough requests it can be significant). To save on CPU and prevent possible problems I decided to implement caching, so that each palette is calculated and generated once and then simply served from the file already made. (I have found that when a color is called it is called several times in a day, so caching for 24 hrs seems to make sense.) This worked great for CPU usage. However, whenever I get indexed by any site that doesn't follow robots.txt, I lose 500 MB of my storage space and fill my cache.

Is there a good way to limit the number of files allowed in the cache, or possibly the total size of the cache folder (perhaps eliminating older files to make space for the new ones)?

I could just create a cron job that empties the cache folder every so often... but I wondered if CodeIgniter had something built into the cache controls to help.

There are 16,777,216 potential files that can be generated (one file for each of the HTML hex color codes), so I would really like to find a good way to limit this without counting on a cron job to just empty the folder.
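
For scale, the worst case can be worked out quickly. This is only a rough sketch, assuming every cached page is about 30 KB as mentioned above (treated here as 30 KiB to keep the numbers round):

```shell
#!/bin/sh
# One cache file per 24-bit hex color: 256 values each for red, green and blue.
colors=$((256 * 256 * 256))

# Assumption: roughly 30 KiB per cached page, as observed in the cache folder.
kib_per_file=30

# KiB -> MiB -> GiB
total_gib=$((colors * kib_per_file / 1024 / 1024))

echo "$colors possible files, roughly $total_gib GiB if all were cached"
```

Under that assumption a fully populated cache would come to about 480 GiB, so some kind of cap is needed well before every color has been requested.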

Here is a link to the color wheel that links to many of the palettes:
http://taggedzi.com/colors/hues

Clicking on any of the colors takes you to a different page that shows the mathematically calculated color palette for that color. Each one of them, once called, is cached...
As an example:
http://taggedzi.com/colors/color_pallets/D4352B

So, I guess I'd like to know if there is a way to control this via CodeIgniter, and if not, I guess this needs to get moved to a feature request?

Thanks in advance.
TaggedZi
#2

[eluser]BorisK[/eluser]
This is what I use to wipe the cached dictionary search results:
0 0 * * * nice -19 find /www/site/ci/system/cache -atime +4 -exec rm {} \;

At midnight it checks the cache directory and removes all cached entries which have not been accessed in 4 days. The most commonly searched terms would never be deleted from cache.
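
Before letting a job like this delete anything, it can help to preview what it would remove. Below is a minimal sketch, assuming a POSIX shell and `find`. The `purge_stale` function name and its `--delete` switch are made up for illustration, and the `index.html` / `.htaccess` exclusions assume the cache folder holds guard files like those that should survive the purge:

```shell
#!/bin/sh
# Hypothetical helper: list (default) or delete cache files that have not
# been accessed for more than DAYS days.
# Usage: purge_stale DIR DAYS [--delete]
purge_stale() {
    dir=$1
    days=$2
    mode=${3:-}

    if [ "$mode" = "--delete" ]; then
        # Assumption: index.html and .htaccess are guard files, not cache entries.
        find "$dir" -type f ! -name index.html ! -name .htaccess \
            -atime +"$days" -exec rm -f {} +
    else
        # Dry run: only print what would be removed.
        find "$dir" -type f ! -name index.html ! -name .htaccess \
            -atime +"$days" -print
    fi
}
```

Running `purge_stale /www/site/ci/system/cache 4` prints the candidates, and adding `--delete` removes them. Note that `-atime` depends on the filesystem recording access times. On a volume mounted with `noatime` they are not updated, in which case `-mtime` (modification time) is the usual fallback.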
#3

[eluser]taggedzi[/eluser]
OK, I figured I'd have to make a cron job for it. I will probably write a PHP extension that does the same thing and call it via cron.

I'll also make a feature request to CodeIgniter for some storage controls. If I feel up to it I might write some controls to perform that same job and post them.

I'll keep an eye on the post to see if anyone else has anything else...

Thanks for the input.
#4

[eluser]vivi[/eluser]
Hi,

This is somewhat unrelated to your problem, but I noticed you're using a big table for your color wheel; you could instead use a form with an <input type="image"> tag. When the user clicks on the image, it will submit the form with the mouse coordinates in the name.x and name.y fields. This would substantially lower the page size and may also solve your crawler problem.
#5

[eluser]taggedzi[/eluser]
Interesting, I really like the idea. It would take major work to rewrite it to do that, but I like the idea... If I get some spare time I will definitely consider doing it. Thank you.
#6

[eluser]BorisK[/eluser]
Also, if disk space is the issue, I would consider caching the serialized data instead of HTML. It's a very small CPU overhead to recreate the HTML from cached data. Or you can cache it in JSON format and serve the JSON to your HTML, so that the tables or whatever structures you have are created on the client side. This way you have low disk usage and no extra CPU on the server.
#7

[eluser]Phil Sturgeon[/eluser]
This is not something we should ask or expect CodeIgniter to do by itself. If it starts checking the ages of loads of files on each page load, your site is going to get SLOW, and a lovely light framework has just turned into a slow beast.

File based cache clearing has always been a job for cron scripts on large sites and probably always will be.
#8

[eluser]taggedzi[/eluser]
Actually some cache management functions would be very handy. I do not expect CI to take on complete management of my cache, but tools to help me (and other developers) would be nice.

Particularly the ability to:
-Purge files older than (x) date or (x) minutes old
-Purge over (x) number of files (oldest deleted first)
-Purge over total size (x) (oldest goes first)
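
The last two items are the ones a plain `find` by age does not cover. As a rough sketch of what such tools could look like when run from cron rather than from inside CodeIgniter, the POSIX shell functions below cap a folder by file count or by total bytes, deleting the oldest files first. The function names are hypothetical, and the sketch assumes the folder holds only cache files whose names contain no newlines:

```shell
#!/bin/sh
# Sketch: cap a cache folder, deleting the oldest files (by mtime) first.
# Dotfiles such as .htaccess are skipped because plain `ls` does not list them.

# purge_over_count DIR MAX_FILES: keep only the MAX_FILES newest files.
purge_over_count() {
    dir=$1
    max=$2
    (
        cd "$dir" || exit 1
        # ls -t lists newest first; tail skips the first MAX entries.
        ls -t | tail -n +"$((max + 1))" | while IFS= read -r f; do
            if [ -f "$f" ]; then rm -f -- "$f"; fi
        done
    )
}

# purge_over_size DIR MAX_BYTES: keep the newest files whose combined size
# fits in MAX_BYTES, and delete everything older than that.
purge_over_size() {
    dir=$1
    max_bytes=$2
    (
        cd "$dir" || exit 1
        total=0
        ls -t | while IFS= read -r f; do
            [ -f "$f" ] || continue
            size=$(wc -c < "$f")
            total=$((total + size))
            # Once the running total passes the cap, this file and all
            # older ones are removed.
            if [ "$total" -gt "$max_bytes" ]; then rm -f -- "$f"; fi
        done
    )
}
```

Both functions make a single pass over the directory listing, so the cost grows with the number of files. That is fine for a nightly job but is the kind of work that would slow down every request if it ran on page load.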

I realize it is a lot to ask, but I can certainly see the need for it. I agree that on some sites (like mine) a cron job and a shell script would work best. However... I can see plenty of places where functions like this could be useful (at least if people did not abuse them.)

I understand that abuse of those functions would cause problems, but the proper usage of them could be very valuable and significantly help performance. Especially for people who do not have access to cron or a shell of some kind.

:)

Just my 2 cents.
#9

[eluser]WanWizard[/eluser]
To block spiders, you could require a session to be present. Most spiders don't accept cookies, so you could detect them that way. A lot of them can also be detected by their User Agent string.

I agree with the above that cache management should not be part of your online/interactive code. With this number of files your site will be unusable, and your visitors will leave.
If you do want it there (because you don't have system access), use the "post_system" hook, which runs after all output has been sent to the client (provided you don't have any compression enabled in php.ini).