Welcome Guest, Not a member yet? Register   Sign In
Forum system - how to track already read vs new messages, per user
#1

[eluser]jedd[/eluser]
I'm writing a forum system - remarkably simple to write, actually. I've stolen the approach of the CI forums - so I have forums, threads, and messages as a kind of logical presentation hierarchy.

I want to know how you'd go about tracking what threads or topics have new messages, for a given user. As in, what table structure you'd use.

My problem is that I have one (very big) message table, a thread table, a forum table, and of course a user table - and I think that I want to be able to flag each thread as unread/read, rather than per message. Consequently I'm floating the idea of a table with [ user_id | thread_id | message_id ] and adjusting that on every message-read by a given user to denote the latest message in a given thread. However, it doesn't feel very elegant (though I've not cut any code around the idea yet).

I've noticed that the CI Forums seem to cheat a bit with this tracking - older pages of stuff that I have definitely not read, show up as non-new (or read, if you prefer). So it feels like they are doing something a bit tricky and, well, space-conserving in their tracking system.

Anyway, yes, the question is - what kind of data structure(s) would you use to track this, and if you do something a bit dodgy or ephemeral, a la CI's Forums, how would you do that?

I'm happy to post my schema for forum/thread/message/user - though I think that'll be more distracting than helpful, especially as I'm willing to make some savage adjustments to the structure if I need to.
#2

[eluser]TheFuzzy0ne[/eluser]
I'd also like to know. Apparently, there are libraries in existence which take care of it all for you. Trouble is, I can't seem to find any.

I was thinking of tracking it through the database, giving each user 250 rows to track the most recent posts they have read. Any threads older than the date of the oldest read post, will display as being read automatically. It seems a bit too simple to me though. I think I might be missing something.
#3

[eluser]Developer13[/eluser]
I tend to overthink things at times, and I think that's what is happening here. The solution (well, not necessarily *the* solution, but *a* solution) is extremely simple.

Table name: has_read
Fields: user_id, thread_id, date_last_read

When the forum member views a thread, the above table should be updated with the forum member's user id, the id of the thread they are reading and a timestamp.

If the timestamp in the has_read table is greater than the timestamp of the last post in that thread, then there are no new posts for the user to read.

If the timestamp in the has_read table is less than the timestamp of the last post in that thread, then there are new posts that the user has not yet read.

I'd do it this way, because let's say the thread has 50 posts and the user has not read any of them. Let's say the user clicks on the thread to start reading posts but quickly becomes disinterested on the first page of the thread, so the user discontinues reading the thread and goes elsewhere. If I were that user, I would want that thread marked as "read" -- I wouldn't want it not marked as read because I didn't read through every single message.

Make sense? I think you're on the right track overall.
#4

[eluser]jedd[/eluser]
Yeah, I do like the idea that it's based on thread, rather than messages.

I've been pondering Fuzzy's suggestion of a fixed number of rows, created at user-creation time I guess, that you can use for this purpose - the attractive aspect of that is that you don't end up with (too much) cruft in your db.

I guess the other features that need to be considered is the ability to relatively inexpensively identify who should get notified when a new message pops up in a thread. Each way I look at it, the SQL seems to be non-trivial to do this, mostly because of the presence of cruft I expect. I suspect I'll need a comparably designed table to record thread/user/notify correlations, and tie into that at the same time.
#5

[eluser]Developer13[/eluser]
Personally I don't think creating a fixed number of rows per user should be considered. I don't really get the point of that - just store the data in the one single table when the user reads a thread. KISS! Over time, you can eliminate records in that table based on their age.
#6

[eluser]bas_vdl[/eluser]
other way is easier to build

compare post dates with last visit date.

IF last visit date < post date

mark as unread

ENDIF
#7

[eluser]xwero[/eluser]
I think you have to see this through a forum members eyes.

When should a post be set to read? I guess we all agree when a user clicks to see the post and when the post is submitted by the member. So these post ids go in a table accompanied by the user id and the post date.

Most forums have an all read link/button. This will cause a change in read date for the user and will remove all posts in the read post ids table which have a lower date than the read date set by the all read link.

I think this is the least data consuming and most accurate solution, not?
#8

[eluser]jedd[/eluser]
xwero - would you really correlate this to post_id?

How would you handle the cleaning of that correlation table? I mean, there's 70,000 users on this site, and this is post number 557,042. I know that not all users are active, but it's still a sizeable product.

And when you say:

Quote:This will cause a change in read date for the user and will remove all posts in the read post ids table which have a lower date than the read date set by the all read link.

How does this work when, for example, post #557,042 is in this thread, but post #557,041 is in a different (and presumably as yet unread-by-me thread)?
#9

[eluser]xwero[/eluser]
The forums consist out of post ids and it's the last added post for a thread that is being displayed. So you have to go to post level. To prevent too much data the read post ids table should have a thread id too so when a member clicks on the post id link the previous post id from the same thread will be removed.

This means you will only have one item for each thread the member clicked on later than the global read date.

A thread link is something extra because a member doesn't want to see the topic but the last post in the thread. I almost never click on the thread link, i click on the link with the date on the CI forum, i think many people do the same.
If you provide an rss feed there will not be a thread link i think.

If you have a thread link you need to check the click date with the date of the last post and then add that to the read post ids table.
#10

[eluser]TheFuzzy0ne[/eluser]
To expand on my suggestion, to implement a "mark all forums/threads as read", your script would simply do the following:

If a user is marking all forums as read, delete all tracking rows belonging to that user, and insert one row containing marking the most recent post as read. This will mean that only posts that are dated after the date of that row, will show as unread.

If the user is marking all forums as being read in a specific forum - delete all rows for the user with the specified forum id, and insert one row which links to the most recent post in that forum.

Now one thing I overlooked here, is that you'll need to store the forum ID in this table too, then you can work with only the data for any particular forum (or all forums if you wish).

Pruning can be done with each request. Simply order the rows by date, and get the date for row number 250 if it exists (or however many rows you gave to the user), and delete all rows pertaining to that user where the date is older than the selected row. With this method, you won't have any empty rows for the user, and you don't have to earmark any rows when the user is created, so if the account is created but never used, no rows are used. If the user only reads 5 forum posts and never uses the account again, you will only have 5 rows in the table for them (which of course can be cleaned with another periodic process, that removes entries that are say, more than 30 days old.

It sounds simple enough (in theory), and comes at the cost of a few extra database queries per forum request, but I've heard that some/most mainstream forums make in excess of 30 database queries per request (although it would be great if anyone actually has some figures on this).

I hope this makes sense. What I would like to know, however, is the name of any libraries already in existence that do this for you.




Theme © iAndrew 2016 - Forum software by © MyBB