Caching and headers
[eluser]Arjen van Bochoven[/eluser]
Hey Allard, thank you for looking at this! [quote author="Derek Allard" date="1226594930"]On quick inspection let me say that a serialized array may be an approach you want to re-consider. If output got large, PHP could blow up in our faces serializing arrays with that data..[/quote] As far as I understand, serialize() does not do much with strings: a string is encoded as s:size:"value"; so the overhead is constant. But I must admit I haven't done any benchmarks. Can someone confirm that serialize() can handle large chunks of data? Arjen
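For what it's worth, the claim about serialize()'s string handling is easy to check from the command line. A quick sketch (my own illustration, not code from this thread):

```php
<?php
// serialize() stores a string almost verbatim as  s:<byte-length>:"<value>";
// so the overhead is a small constant per value, not proportional to size.
$output = str_repeat('x', 1000);
$serialized = serialize($output);

// Only a handful of bytes are added, regardless of the string's length:
echo strlen($serialized) - strlen($output), "\n";

// Bundling headers and output into one array stays just as compact:
$cache = serialize(array(
    'headers' => array('Content-Type: text/html'),
    'output'  => $output,
));
$restored = unserialize($cache);
echo $restored['output'] === $output ? "round-trip ok\n" : "mismatch\n";
```

So the encoding cost is per-value bookkeeping, not per-byte work on the string itself.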
[eluser]Aquillyne[/eluser]
It looks like my original solution of just prepending the data with a timestamp (like before) plus headers (unlike before) may be simpler and work better, even if less elegant. I'm so glad this has finally gathered some support; I've been battling with this one on and off for months. I did originally post in the feature requests forum too, but I believe it was closed because it was considered a double post. Please add this to the core, it's sorely missing. The core code has a comment saying "we need to add header caching"!
[eluser]narkaT[/eluser]
[quote author="Arjen van Bochoven" date="1226596677"]But I must admit I haven't done any benchmarks, can someone confirm the serialize() function can handle large chunks of data?[/quote] Define "large". I've got an application that uses serialize() and a database to cache data; the biggest serialized string in the DB is currently about 12kb long and runs fast and stable. The js-file I used for benchmarking the code is about 31kb big. I'm really impressed by the effectiveness of (un)serialize; I never expected it to be so damn fast. [quote author="Aquillyne" date="1226598092"]It looks like my original solution of just prepending the data with timestamp (like before) plus headers (unlike before) may be simpler and work better, even if less elegant.[/quote] I'll try to benchmark both versions soon, code chunks welcome.
[eluser]Arjen van Bochoven[/eluser]
I would consider 10MB+ large. If it can handle files of this size faster and in less memory than the original cache functions, it would surely prove that it does not "explode in our face". :-)
[eluser]narkaT[/eluser]
Okay, here are the results for displaying cached data:

1. a simple "string-disassembling" variant: ~0.00016 s
2. serialize(): ~0.00018 s
3. RegExp: ~0.00076 s

Memory consumption (measured using memory_get_peak_usage()) is exactly the same with all 3 methods. The "winner" looks like this Code: class MY_Output extends CI_Output {
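The thread only preserves the first line of that class, so here is a rough sketch of what a "string-disassembling" cache format looks like in principle: a fixed prefix holding the expiry timestamp and headers, split back out with strpos(). The function names, delimiters, and layout are my own illustration, not narkaT's actual MY_Output code:

```php
<?php
// Hypothetical strpos-based cache layout (NOT the thread's real code):
//   <expire-timestamp>TS:<header1>\n<header2>HDR:<output>

function cache_pack($expire, array $headers, $output)
{
    return $expire . 'TS:' . implode("\n", $headers) . 'HDR:' . $output;
}

function cache_unpack($cache)
{
    $ts_end  = strpos($cache, 'TS:');            // end of the timestamp
    $hdr_end = strpos($cache, 'HDR:', $ts_end);  // end of the header block
    return array(
        'expire'  => (int) substr($cache, 0, $ts_end),
        'headers' => explode("\n", substr($cache, $ts_end + 3, $hdr_end - $ts_end - 3)),
        'output'  => substr($cache, $hdr_end + 4),
    );
}

$packed = cache_pack(time() + 3600, array('Content-Type: text/html'), '<html>hi</html>');
$data   = cache_unpack($packed);
echo $data['output'], "\n"; // prints <html>hi</html>
```

The speed comes from strpos() and substr() doing plain byte scans with no pattern compilation, which is why it edges out the RegExp variant above.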
[eluser]Arjen van Bochoven[/eluser]
Ok, I did some benchmarking with a 'reasonably large' file (6.2MB). As you can see, the memory results are interesting: the CI built-in cache triples memory usage, as does narkaT's strpos() method. My own proposed method roughly doubles memory usage (which could be improved).

Not cached: Page rendered in 1.1956 seconds, Memory usage: 6.94MB
CI 1.7.0 caching using preg_match(): Page rendered in 0.4045 seconds, Memory usage: 18.86MB
My solution, using serialize(): Page rendered in 0.3594 seconds, Memory usage: 12.74MB
narkaT's solution using strpos(): Page rendered in 0.4231 seconds, Memory usage: 18.9MB

@narkaT: You assume the second parameter of set_header() is a single char, but that is nowhere enforced; people can send in an empty string if they want. You should also set the type of the second parameter (which should be a boolean).

edit: Make sure you use the {memory_usage} pseudo-variable in your view to measure memory usage, otherwise you get the cached value.
[eluser]narkaT[/eluser]
[quote author="Arjen van Bochoven" date="1226941283"]@narkaT: You assume the second parameter of set_header() is a single char, but that is nowhere enforced, people can send in an empty string if they want. You should also set the type of the second var (which should be a boolean).[/quote] That's right, I should have done some "cleaning" before posting the code. I've edited the above code. [quote author="Arjen van Bochoven" date="1226941283"]edit: Make sure you use the {memory_usage} pseudo variable in your view to measure memory usage, otherwise you get the cached value.[/quote] I measured the memory usage directly in the extended output class using memory_get_peak_usage(), very "hacky" though. I was confused that in my benchmark there was no difference between the memory consumptions, even though I used the same PHP function as CI to get the memory usage, so I switched to the {memory_usage} method you suggested. That let me find the problem and cut down the memory usage drastically: my previous benchmarks reported the same memory usage because I measured the usage before calling the _display() function, so the call to _display() itself was the cause of the large memory usage. Passing the output by reference solved that issue for both my strpos() approach and your serialize() approach. I've done some benchmarking after optimizing the scripts. Both runs used str_repeat() for generating the data (I'll leave out the non-cached results), one with 7Mb and one with 50Kb:

7Mb
built-in preg_match(): 21.38MB - 0.0360 s
serialize(): 7.4MB - 0.0239 s
strpos(): 7.41MB - 0.0352 s

50Kb
built-in preg_match(): 0.53MB - 0.0041 s
serialize(): 0.45MB - 0.0044 s
strpos(): 0.46MB - 0.0045 s

serialize() is clearly the most memory-friendly solution, and when caching big chunks of data it is also the fastest.
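A minimal harness in the spirit of those numbers: time a cache-read strategy and report peak memory. The bench() helper and the serialize() reader below are my own placeholder illustration, not the thread's actual benchmark code:

```php
<?php
// Rough benchmark sketch: wall time + peak memory around one strategy.

function bench($label, $fn, $cache)
{
    $t0 = microtime(true);
    $fn($cache);
    printf("%-10s %.4f s  peak %.2f MB\n",
        $label,
        microtime(true) - $t0,
        memory_get_peak_usage() / 1048576);
}

// 50Kb payload generated with str_repeat(), as in the thread's runs.
$cache = serialize(array(
    'headers' => array('Content-Type: text/html'),
    'output'  => str_repeat('x', 50 * 1024),
));

bench('serialize', function ($cache) {
    $data = unserialize($cache);  // headers and output decoded in one step
}, $cache);
```

The key detail narkaT found applies here too: measure peak memory after the display/decode call, not before it, or every strategy will report the same number.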
[eluser]Arjen van Bochoven[/eluser]
Very nice optimization, passing the data by reference. I think we have a winner here! I've updated the wiki page and added a link to this thread. Arjen
[eluser]Arjen van Bochoven[/eluser]
I've spoken too soon. I forgot that "call-time pass-by-reference" is deprecated, so to make this work we have to change the function definition of _display() from Code: function _display($output = '') to Code: function _display(&$output = '') I'll revert the wiki to the last version until we sort this one out. Arjen
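A minimal sketch of that change (the class name here is a hypothetical stand-in for CI_Output, not the actual CI file): the & moves from the call site into the signature, which PHP allows even with a default value, so callers need no call-time &:

```php
<?php
// Hypothetical stand-in for CI_Output; illustrates only the signature change.
class Output_Sketch
{
    // Before: function _display($output = '')
    // After:  the reference lives in the definition instead of the call:
    function _display(&$output = '')
    {
        echo $output;
    }
}

$o = new Output_Sketch();
$big = str_repeat('x', 16);
$o->_display($big);   // plain call; writing $o->_display(&$big) is deprecated
echo "\n";
```

With the reference in the definition, existing call sites keep working unchanged while the large output string is no longer duplicated into the function.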