SimplePie: Detecting new feed entries to add to the Database (which is the most efficient way?)
#1

[eluser]Unknown[/eluser]
Hello Everyone,

This is my first post here on the CI forum. I just started working with CI and PHP, and so far I have managed to build a small RSS feeder/aggregator pretty fast using SimplePie.

I have a somewhat philosophical question regarding feed readers/aggregators:

If my cron job fetches all data from a certain RSS feed from time to time, how do I identify which posts are new? I know I can get an MD5 guid with get_id() using SimplePie, but then do I have to iterate over all DB entries and check whether this guid has already been added?

Isn't this super inefficient? :P

Or should I iterate all stored entries once, and get all the guids, so that future checks are made from memory? (much faster...)
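
The second option (read the stored guids once, then check from memory) can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database standing in for the real CI/PHP setup; the `entries` table, its columns, and the function names are all hypothetical, and the guid strings stand in for whatever SimplePie's get_id() returns.

```python
import sqlite3

def load_known_guids(conn):
    """Read every stored guid once and keep them in a set for fast lookups."""
    return {row[0] for row in conn.execute("SELECT guid FROM entries")}

def store_new_entries(conn, fetched, known):
    """Insert only the fetched (guid, title) pairs whose guid is not yet known."""
    new = []
    for guid, title in fetched:
        if guid not in known:
            known.add(guid)          # also guards against duplicates within one fetch
            new.append((guid, title))
    conn.executemany("INSERT INTO entries (guid, title) VALUES (?, ?)", new)
    conn.commit()
    return new

# Hypothetical table: one row per feed entry, keyed by its guid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (guid TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO entries VALUES ('a1', 'Old post')")

known = load_known_guids(conn)
added = store_new_entries(conn, [("a1", "Old post"), ("b2", "New post")], known)
print(added)
```

With this shape, each cron run does one query to load the guids and one batched insert, instead of one lookup per fetched entry.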

Your insight is much appreciated :)
Best Regards to all, and special thanks to the EllisLab team for such a great product
PRica
#2

[eluser]Kemik[/eluser]
Just keep track of when you last checked and only add posts which are dated after the last check.
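
A minimal sketch of that idea, in Python for illustration only. The dict layout, the `published` key, and the function name are assumptions, not anything SimplePie provides, and the dates are made-up example values. It also assumes the feed's entry dates are trustworthy, which is not true of every feed.

```python
from datetime import datetime, timezone

def entries_since(entries, last_checked):
    """Return only the entries published after the previous check."""
    return [e for e in entries if e["published"] > last_checked]

# Hypothetical state: the time of the previous cron run, stored somewhere persistent.
last_checked = datetime(2008, 5, 1, 12, 0, tzinfo=timezone.utc)

# Hypothetical parsed feed entries.
feed = [
    {"title": "Old post", "published": datetime(2008, 5, 1, 9, 0, tzinfo=timezone.utc)},
    {"title": "New post", "published": datetime(2008, 5, 1, 15, 30, tzinfo=timezone.utc)},
]

new_entries = entries_since(feed, last_checked)
print([e["title"] for e in new_entries])

# After storing new_entries, record the time of this check for the next run.
last_checked = datetime.now(timezone.utc)
```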
#3

[eluser]richthegeek[/eluser]
MD5 the whole feed and store that hash in the database along with the feed URL and ID. The hash is only 32 characters long, so string comparison isn't a problem, and it will stay the same until a new entry is added.

Bear in mind that some RSS feeds just use the current time as the "last updated" value, so you will need to hash only the entry list and not rely on the last-updated time.
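
Something along these lines, sketched in Python for illustration. The function names are made up, and the entry IDs are assumed to be whatever per-entry identifier you already have (for example the values from SimplePie's get_id()). Only the IDs go into the hash, so a changing "last updated" timestamp does not affect it.

```python
import hashlib

def entry_list_hash(entry_ids):
    """MD5 over the entry IDs only, in feed order (32 hex characters)."""
    joined = "\n".join(entry_ids)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def feed_changed(stored_hash, entry_ids):
    """True when the current entry list no longer matches the stored hash."""
    return entry_list_hash(entry_ids) != stored_hash

# Hypothetical IDs from the previous and the current fetch of one feed.
previous_ids = ["a1", "b2"]
current_ids = ["c3", "a1", "b2"]

stored_hash = entry_list_hash(previous_ids)   # what would sit in the database
print(feed_changed(stored_hash, previous_ids))
print(feed_changed(stored_hash, current_ids))
```

If the hash is unchanged, the cron job can skip the feed entirely; if it differs, you still need a per-entry check to find out which entries are the new ones.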



