Errors when bot supplies its own session ID?

#11
So even your curl utility doesn't cause the error any more?
"Frustration" is coding's middle name...

I would test this but I have no live websites running CI 3.

Is your blacklist still active? I'm fishing for anything here.
CI 3.1 Kubuntu 19.04 Apache 5.x  Mysql 5.x PHP 5.x PHP 7.x
Remember: Obfuscation is a bad thing.
Clarity is desirable over Brevity every time.

#12
Yep, blacklist is still active, and implemented like this in MY_Session.php:


PHP Code:
public function __construct(array $params = [])
{
    /**
     * If request comes with user agent or IP
     * address that is in the session blacklist,
     * abort session initialization.
     */
    $CI =& get_instance();
    $CI->config->load('session_blacklist');
    $blacklist = config_item('session_blacklist');

    foreach ($blacklist as $bot)
    {
        if (
            // Check the user agent
            stripos($CI->input->user_agent(), $bot) !== FALSE OR

            // Check the IP address
            $CI->input->ip_address() == $bot
        )
        {
            // Blacklisted: skip parent::__construct() so no session is started
            return;
        }
    }

    parent::__construct($params);
}


This is the actual blacklist array:


PHP Code:
$config['session_blacklist'] = [
    'baiduspider',
    'yandex',
    'fastcrawler',
    'majestic',
    'yahoo! slurp',
    'bingbot',
    'googlebot'
]; 

It for sure does what it's supposed to do.
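
For anyone who wants to try the matching logic outside of CodeIgniter, here is a minimal sketch of the same check as a plain function. The name is_session_blacklisted() and the sample addresses are hypothetical, and the NULL handling assumes that CI 3's user_agent() returns NULL when the request carries no User-Agent header.

```php
<?php
/**
 * Return TRUE when the request matches a blacklist entry.
 * Hypothetical helper for illustration, not part of CodeIgniter.
 */
function is_session_blacklisted($user_agent, $ip_address, array $blacklist)
{
    // Assumption: a request without a User-Agent header arrives as NULL;
    // cast it to a string so stripos() is never handed NULL.
    $user_agent = (string) $user_agent;

    foreach ($blacklist as $entry)
    {
        // Case-insensitive substring match against the user agent
        if ($user_agent !== '' && stripos($user_agent, $entry) !== FALSE)
        {
            return TRUE;
        }

        // Exact match against the IP address
        if ($ip_address === $entry)
        {
            return TRUE;
        }
    }

    return FALSE;
}

// Sample data: documentation-range IPs, not real crawler addresses
$blacklist = ['bingbot', 'googlebot', '203.0.113.7'];

var_dump(is_session_blacklisted('Mozilla/5.0 (compatible; Googlebot/2.1)', '198.51.100.4', $blacklist));
var_dump(is_session_blacklisted(NULL, '203.0.113.7', $blacklist));
var_dump(is_session_blacklisted(NULL, '198.51.100.4', $blacklist));
```

A request with neither a listed user agent nor a listed IP falls through to FALSE, which is the case where the session would still be started.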

#13
Umm... I meant: is that why you are not getting any more errors?

I'm assuming that the blacklist trap runs before your error reporting hook. I see that your trap is in the __construct function. Is this running before or after your hook?

#14
No, that's not the reason I'm not getting any more errors, because I'm not adding a user-agent header.

The error reporting hook would be pre_system, so it should be called before the session starts.

#15
Duh, sorry, not thinking today. I hate it when something works as it should, but doesn't give a clue why it changed.

If the robot visits your site, thus triggering a new session, it can read the session ID but still triggers a session test within seconds of the first. I don't know enough about how a typical robot is coded to know why that would happen. A question becomes: is it the robot that is triggering the second session check or is it done by CI?

#16
(05-24-2016, 05:43 PM)twpmarketing Wrote: Duh, sorry, not thinking today.  I hate it when something works as it should, but doesn't give a clue why it changed.

If the robot visits your site, thus triggering a new session, it can read the session ID but still triggers a session test within seconds of the first.  I don't know enough about how a typical robot is coded to know why that would happen.  A question becomes: is it the robot that is triggering the second session check or is it done by CI?

I'll look at the access log if it happens again. Another thing I forgot to mention is that I cleared out all of the session records in the database. I think it's more likely that that has something to do with why the errors stopped than anything else.

#17
Ok, it's been interesting. If you find a cause, I'd be interested to hear more. I like your robot trap code anyway, thanks for that.

#18
Do you have sess_match_ip enabled? If you do, and if the bot switches IP addresses while crawling ... that's most likely the problem.

CI does take this into account, but your database table needs to be set up properly as well. The DB error suggests that it only has the "id" column as the primary key, while it needs to be a composite of id, ip_address if you've enabled match_ip.
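
For reference, a minimal sketch of that key change. It assumes MySQL, a session table named ci_sessions (the name is an assumption, not taken from this thread), and an existing primary key on id alone. The helper session_primary_key_sql() is hypothetical; it only builds the statements so they can be reviewed before being run.

```php
<?php
/**
 * Build the MySQL statements that re-key the session table.
 * Hypothetical helper; the table name passed in is an assumption.
 */
function session_primary_key_sql($table, $match_ip)
{
    // With sess_match_ip on, the same session ID can show up from more
    // than one IP address, so the key has to cover both columns.
    $columns = $match_ip ? 'id, ip_address' : 'id';

    return [
        // Drop the existing key first (assumes the table already has one)
        "ALTER TABLE {$table} DROP PRIMARY KEY",
        "ALTER TABLE {$table} ADD PRIMARY KEY ({$columns})",
    ];
}

// Print the statements for a setup with sess_match_ip enabled
foreach (session_primary_key_sql('ci_sessions', TRUE) as $sql)
{
    echo $sql, ";\n";
}
```

Passing FALSE as the second argument gives the id-only key again, which is what you would want if sess_match_ip is later switched off.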

#19
(05-24-2016, 11:45 PM)Narf Wrote: Do you have sess_match_ip enabled? If you do, and if the bot switches IP addresses while crawling ... that's most likely the problem.

CI does take this into account, but your database table needs to be set up properly as well. The DB error suggests that it only has the "id" column as the primary key, while it needs to be a composite of id, ip_address if you've enabled match_ip.

Dang it. I am guilty. Now's when you scold me about reading the user guide.

#20
(05-25-2016, 12:49 AM)skunkbad Wrote:
(05-24-2016, 11:45 PM)Narf Wrote: Do you have sess_match_ip enabled? If you do, and if the bot switches IP addresses while crawling ... that's most likely the problem.

CI does take this into account, but your database table needs to be set up properly as well. The DB error suggests that it only has the "id" column as the primary key, while it needs to be a composite of id, ip_address if you've enabled match_ip.

Dang it. I am guilty. Now's when you scold me about reading the user guide.

I'd be annoyed if you reported it as a bug via GitHub, but not here, really. Not for this. It's an easy mistake to make and one that I anticipated.

