CodeIgniter Forums
Errors when bot supplies its own session ID? - Printable Version


RE: Errors when bot supplies its own session ID? - twpmarketing - 05-24-2016

So even your curl utility doesn't cause the error any more?
"Frustration" is coding's middle name...

I would test this but I have no live websites running CI 3.

Is your blacklist still active? I'm fishing for anything here.


RE: Errors when bot supplies its own session ID? - skunkbad - 05-24-2016

Yep, blacklist is still active, and implemented like this in MY_Session.php:


PHP Code:
public function __construct(array $params = [])
{
    /**
     * If request comes with user agent or IP
     * address that is in the session blacklist,
     * abort session initialization.
     */
    $CI =& get_instance();
    $CI->config->load('session_blacklist');
    $blacklist = config_item('session_blacklist');

    foreach( $blacklist as $bot )
    {
        if(
            // Check the user agent
            stripos( $CI->input->user_agent(), $bot ) !== FALSE OR

            // Check the IP address
            $CI->input->ip_address() == $bot
        )
        {
            // Blacklisted: skip the parent constructor so no session is started
            return;
        }
    }

    parent::__construct($params);
}


This is the actual blacklist array:


PHP Code:
$config['session_blacklist'] = [
    'baiduspider',
    'yandex',
    'fastcrawler',
    'majestic',
    'yahoo! slurp',
    'bingbot',
    'googlebot'
]; 

It for sure does what it's supposed to do.
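
The matching rule in the constructor above can also be exercised outside CodeIgniter. This is only a rough sketch: is_session_blacklisted() is a hypothetical helper (not part of CI or of MY_Session.php), the IP addresses are documentation placeholders, and it uses a strict comparison for the IP where the constructor uses a loose one.

```php
<?php
// Hypothetical standalone helper; not part of CodeIgniter or MY_Session.php.
function is_session_blacklisted($user_agent, $ip_address, array $blacklist)
{
    foreach ($blacklist as $entry)
    {
        // Case-insensitive substring match on the user agent.
        // The header may be missing (NULL), so cast to string first.
        if (stripos((string) $user_agent, $entry) !== FALSE)
        {
            return TRUE;
        }

        // Exact match on the IP address.
        if ($ip_address === $entry)
        {
            return TRUE;
        }
    }

    return FALSE;
}

// Placeholder blacklist: two bot names plus one example IP address.
$blacklist = ['bingbot', 'googlebot', '203.0.113.7'];

var_dump(is_session_blacklisted('Mozilla/5.0 (compatible; Googlebot/2.1)', '198.51.100.1', $blacklist));
var_dump(is_session_blacklisted(NULL, '203.0.113.7', $blacklist));
var_dump(is_session_blacklisted(NULL, '198.51.100.1', $blacklist));
```

A request with no user-agent header and an IP that is not listed falls through the loop, so the session would start as normal.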


RE: Errors when bot supplies its own session ID? - twpmarketing - 05-24-2016

Umm... I meant is that why you are not getting any more errors?

I'm assuming that the blacklist trap runs before your error reporting hook. I see that your trap is in the __construct function. Is this running before or after your hook?


RE: Errors when bot supplies its own session ID? - skunkbad - 05-24-2016

No, that's not the reason I'm not getting any more errors. My requests don't include a user-agent header, so the blacklist never matches them.

The error reporting hook is a pre_system hook, so it should be called before the session starts.
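
For anyone following along, a pre_system hook is registered roughly like this in application/config/hooks.php. The class, function, and file names here are placeholders I made up for illustration, not the actual hook from this thread, and hooks also need to be enabled in config.php.

```php
<?php
// application/config/config.php
$config['enable_hooks'] = TRUE;

// application/config/hooks.php
// 'Error_reporter', 'register' and the filename are hypothetical names.
$hook['pre_system'] = array(
    'class'    => 'Error_reporter',
    'function' => 'register',
    'filename' => 'Error_reporter.php',
    'filepath' => 'hooks'
);
```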


RE: Errors when bot supplies its own session ID? - twpmarketing - 05-24-2016

Duh, sorry, not thinking today. I hate it when something works as it should, but doesn't give a clue why it changed.

If the robot visits your site, thus triggering a new session, it can read the session ID but still triggers a session test within seconds of the first. I don't know enough about how a typical robot is coded to know why that would happen. A question becomes: is it the robot that is triggering the second session check or is it done by CI?


RE: Errors when bot supplies its own session ID? - skunkbad - 05-24-2016

I'll look at the access log if it happens again. Another thing I forgot to mention is that I cleared out all of the session records in the database. I think it's more likely that that has something to do with why the errors stopped than anything else.


RE: Errors when bot supplies its own session ID? - twpmarketing - 05-24-2016

Ok, it's been interesting. If you find a cause, I'd be interested to hear more. I like your robot trap code anyway, thanks for that.


RE: Errors when bot supplies its own session ID? - Narf - 05-24-2016

Do you have sess_match_ip enabled? If you do, and if the bot switches IP addresses while crawling ... that's most likely the problem.

CI does take this into account, but your database table needs to be properly set up as well. The DB error suggests that it only has the "id" column as the primary key, while it needs to be a composite of id and ip_address if you've enabled match_ip.
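
A minimal sketch of that table change, written as a CI 3 migration. It assumes MySQL, that the sessions table is named ci_sessions (use whatever sess_save_path points to), and a made-up migration class name; treat it as an illustration and not a drop-in fix.

```php
<?php
defined('BASEPATH') OR exit('No direct script access allowed');

// Hypothetical migration, e.g. application/migrations/001_session_composite_key.php
// Assumes $config['sess_match_ip'] = TRUE and a MySQL table named ci_sessions.
class Migration_Session_composite_key extends CI_Migration
{
    public function up()
    {
        // With sess_match_ip enabled, the key must cover both columns so
        // the same session ID can be stored for different IP addresses.
        $this->db->query('ALTER TABLE ci_sessions DROP PRIMARY KEY');
        $this->db->query('ALTER TABLE ci_sessions ADD PRIMARY KEY (id, ip_address)');
    }

    public function down()
    {
        // Back to the single-column key used when sess_match_ip is FALSE.
        $this->db->query('ALTER TABLE ci_sessions DROP PRIMARY KEY');
        $this->db->query('ALTER TABLE ci_sessions ADD PRIMARY KEY (id)');
    }
}
```

Note that reverting to the single-column key can fail if the table already holds the same ID under more than one IP address, so clearing old session rows first may be needed.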


RE: Errors when bot supplies its own session ID? - skunkbad - 05-25-2016

Dang it. I am guilty. Now's when you scold me about reading the user guide.


RE: Errors when bot supplies its own session ID? - Narf - 05-25-2016

I'd be annoyed if you reported it as a bug via GitHub, but not here really ... not for this. It's an easy mistake to make and one that I anticipated.