a bot accessed a hidden url
#1

I'm curious and perhaps somebody can help.
I was testing a new controller on the site yesterday using a URL which does not appear anywhere on any page: "ecoland.com/try/this". Lo and behold, shortly after midnight the same URL was requested by "bing.com/bingbot.htm". My question is: how could bingbot even be aware of the URL? I only implemented it yesterday for a few hours, and it never appears on any page, so a screen scraper cannot have stumbled across it.
Thanks, Bill
#2

It is amazing where and how the search engines collect data and combine it. I was at a customer's site the other day and was logged into my Gmail. We were talking about this stuff, so as a test I googled 'where am I' and Google Maps showed a marker exactly where I was. I was amazed at first, but I am guessing it recognized the IP address, associated that IP with the company, and was able to locate me exactly on the map. But how clever is that!

As for your URL, I am guessing that you tested it on a live site, and that you typed the URL into a browser, so the URL was sent out into the wild, where somewhere it was recorded and collected by Bing. Or perhaps you have ads on your site, so an ad was delivered to the URL, or perhaps you have Bing tracking on your site, so the page was pinged over to Bing when you visited it.

The power of online tracking by search engines is becoming quite scary if, like me, you fear the might of an unfriendly government and its powerful secret listening agencies. Yes, we have a 'friendly' government at the moment, but who knows what might happen.

I have recently started using https://duckduckgo.com/ and have had a play with the Tor Browser https://www.torproject.org/ which is very good. I'm not sure it will really make any difference, though.

Best wishes,

Paul.
#3

Hi Paul
Very interesting response!
1. It was a live site because that's the only way to really see if it works.
2. No ads, we don't use them.
So as you say, the url went off into the wild and only somebody like Edward Snowden knows what happened after that.
It really is starting to feel as if somebody is looking over my shoulder as I type.
I'll need to be ultra careful when testing the login facility.
Thanks for upsetting my day. :(
Bill
#4

Most of the newer web browsers and operating systems have built-in geolocation now.

Some, like Windows 10, will let you turn it off in Settings if you want.
What did you Try? What did you Get? What did you Expect?

Joined CodeIgniter Community 2009.  ( Skype: insitfx )
#5

If I had mistyped the URL or done a search for it in Internet Explorer, then I could understand Bing picking up on it. But this was a very specific URL and, up until now, I had assumed this would bring me directly to the location without anybody (NSA excepted) being aware of it. Ignorance is bliss!
Bill
#6

If you type into the Google Chrome URL bar, it also performs a live search as you type. If you typed in your website's address, the search engine may also have saved that search and the address. It's pretty normal for search engines to try to index websites that have been searched for previously.
Codeigniter is simply one of the tools you need to learn to be a successful developer. Always add more tools to your coding arsenal!
#7

Way 1 - The OS is sending information about what you are doing. In your case I would focus on Windows sending data samples. When you install Windows with the express settings, this is on by default.

Way 2 - Active browser plugins, or a "smart" mode switched on, which connect to Bing.

Way 3 - JS tracking added to the site (usually it's Google Analytics, but Bing is an option too).

... And many many other ways.

For cases like this it is good to have some kind of active protection enabled that hides any test pages on your production site.
Sometimes such pages have security issues and can compromise your production site.
(It is also good not to dev/test on production.)
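
One way to sketch that kind of protection is an address allowlist that is checked before a test page is served. This is only an illustration in Python, not CodeIgniter code: the "/try/" prefix is taken from the URL in this thread, while the function name and the network (a range reserved for documentation) are placeholders I made up.

```python
import ipaddress

# Hypothetical values: adjust the prefix and the networks to your own setup.
TEST_PREFIXES = ("/try/",)
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation range


def is_request_allowed(path: str, client_ip: str) -> bool:
    """Return True unless a test page is requested from an unknown address."""
    if not path.startswith(TEST_PREFIXES):
        return True  # ordinary pages stay public
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: refuse access to test pages
    return any(address in network for network in ALLOWED_NETWORKS)
```

A crawler that learns the URL would then get a refusal (for example a 404) instead of the test page. HTTP authentication on the test path would be another option.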
#8

I'm learning real fast from these snippets.
The page was not much more than a skeleton to check responses from the controller so no tracking scripts were involved.

I suspect it is internet explorer (on win 7) because, although google is the default search engine, bing is the only engine to have tried the url (every night since then!) and I ran multiple tests of the url in all the other browsers (on linux and apple) as well and none of their search engines tried the url.
However, another possibility which came to mind in the meantime is that the win 7 installation has mcafee antivirus with the safesite feature enabled and this of course would mean that all requests go via mcafee which makes for another possible leakage point.
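
To see which clients requested the URL and when, the access log can be grouped by user agent. Below is a rough Python sketch, assuming the server writes the common "combined" log format. The sample lines, addresses and timestamps are invented for illustration, and the function name is hypothetical.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def agents_for_path(lines, path):
    """Count user-agent strings for requests to the given path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == path:
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '198.51.100.4 - - [01/Mar/2016:14:02:11 +0000] "GET /try/this HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"',
    '192.0.2.15 - - [02/Mar/2016:00:07:43 +0000] "GET /try/this HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.15 - - [02/Mar/2016:00:07:44 +0000] "GET /robots.txt HTTP/1.1" '
    '404 0 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

for agent, hits in agents_for_path(SAMPLE_LOG, "/try/this").items():
    print(hits, agent)
```

Comparing the timestamps of the browser tests with the first crawler hit can help narrow down which machine or browser leaked the URL, although a user-agent string alone does not prove the request really came from Bing.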

I find it amazing that a url which was only used probably about 20 times in the space of a few hours could be seen by a search engine as "interesting". These search engines must lead extremely boring lives!
Well, fortunately I've learned my lesson before it's too late.
Many thanks for the tips.
Bill
#9

It's not necessarily that they find the URL 'interesting', but it costs search engines next to nothing to ping your URL a few times and index it, so they do it for most websites they can get their hands on.
#10

First, update the robots.txt file. If the problem still persists after that, it is harder to tackle, because some crawlers don't respect robots.txt.
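
As a rough sketch of what such a robots.txt rule could look like, and how to check it, here is a small Python example using the standard library's urllib.robotparser. The rule itself is an assumption: it blocks the whole "/try/" prefix rather than naming the exact URL, since robots.txt is publicly readable and listing a hidden URL there would advertise it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /try/ prefix is assumed from the URL in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /try/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a crawler identifying itself as "bingbot" may fetch each URL.
print(parser.can_fetch("bingbot", "https://ecoland.com/try/this"))
print(parser.can_fetch("bingbot", "https://ecoland.com/"))
```

This only tells well-behaved crawlers to stay away. It does nothing against crawlers that ignore the file, so it is not a substitute for access control on the test pages.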