Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - Printable Version

+- CodeIgniter Forums (https://forum.codeigniter.com)
+-- Forum: Archived Discussions (https://forum.codeigniter.com/forumdisplay.php?fid=20)
+--- Forum: Archived Libraries & Helpers (https://forum.codeigniter.com/forumdisplay.php?fid=22)
+--- Thread: Is it possible to intercept external garbage URLs to a search routine instead of 404 page? (/showthread.php?tid=40504)

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-10-2011

[eluser]John_Betong_002[/eluser]
Google Webmaster Tools is complaining about numerous links that return "An Error Was Encountered [400]".

Edit - start:
I just noticed that my original post was truncated.

What I would like to do is somehow trap the URL before it fails the routing tests, etc. I would like to use the following code:

Code:
$bad_chars = array('width','height','=',':','<','>','alt','//', 'etc');

Edit - end:

The following is an example that results in application/errors/error_general.php being displayed:

http://website.com/afiles/images/santa-email.jpg" width="100" height="50" alt="image"/></a> </div> <div class="c0 r"><a

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-10-2011

[eluser]WanWizard[/eluser]
Make sure your images exist?

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]John_Betong_002[/eluser]
The images all exist; the problem is the trailing junk.

Try appending the junk to the URL of a known image on your site and see what happens.

I have just tried appending the junk to your avatar URL, and the response I get is "Oops! This link appears to be broken."

http://ellislab.com/images/avatars/uploads/avatar_78055.jpg

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]InsiteFX[/eluser]
The problem comes from the href content, which starts with a single quote but is erroneously closed with a double quote, so it is not actually closed until another single quote is found further down.
So all of http://www.snapshotjourneys.com/uploads/images/BORNEO/borneo-kota-kinabalu-malaysia/borneo-kota-kinabalu-malaysia-3-university.jpg" width="81" height="50" alt="image"/></a> </div> <div class="c0 r"><a

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]John_Betong_002[/eluser]
I have just updated my original post to include the requirements, which were truncated.

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]WanWizard[/eluser]
I still don't see how you can get into this situation other than through invalid HTML or invalid links. That is your problem as a developer, and you should fix it, not work around it.

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]John_Betong_002[/eluser]
I got into this situation because other webmasters are using incorrect hotlinks.

I have no control over these other sites, but it appears I am being penalised by Google for not having corresponding landing pages for the bad URLs.

Here are the first two of the eighteen websites that Google Webmaster Tools reports as having invalid links:

http://ezentials.com/eqk-7-days-before-santa-rfc.html
http://fivestarsmarketplace.com/lov-diagram-santa-pictures-printables.html

Search the source code for "afiles/images" and you will see that the first part of each image URL is correct but the complete URL is invalid.

I was hoping to find a way to test the URL before CI routes it to an error page. This would also be ideal for filtering all the other hotlinked images.

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]WanWizard[/eluser]
Ok. So this is about other sites linking to your site?
Then instead of a standard CI 404, route to a 404 controller that returns a 200 status and displays a 404 page with links to important parts of your application.

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-11-2011

[eluser]John_Betong_002[/eluser]
[quote author="WanWizard" date="1302561152"]
Ok. So this is about other sites linking to your site?

Then instead of a standard CI 404, route to a 404 controller that returns a 200 status and displays a 404 page with links to important parts of your application.
[/quote]

Ah, the "penny has dropped". I was curious to know why my code in /application/errors/error_general.php was being ignored.

I will try commenting out the "header("HTTP/1.1 404 Not Found");" line and report back tomorrow... it is now way past my bedtime.

Many thanks.

Is it possible to intercept external garbage URLs to a search routine instead of 404 page? - El Forum - 04-12-2011

[eluser]John_Betong_002[/eluser]
Nearly there, but I cannot get both conditions to work together.

What I would like to do is somehow trap the external URL before it fails the routing tests, etc.

The following .htaccess in the images folder is supposed to:
1. accept image links from my own site
2. intercept all external links and divert them to ./images/index.php (where the URL is parsed and routed to a search routine).

.htaccess
Code:
RewriteEngine on

./images/index.php
Code:
<?php
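
For readers following the thread, the $bad_chars idea from the first post could look roughly like the sketch below. It is only an illustration in plain PHP, not the contents of the truncated ./images/index.php listing, and it does not use any CodeIgniter API. The function names split_garbage_uri() and uri_to_search_term() are hypothetical, and the marker list is an assumption: it adds the double quote to a subset of the characters from the original post, because the quote is the first junk character in the example URLs.

```php
<?php
/**
 * Hypothetical helper (not part of CodeIgniter): cut a requested URI at the
 * first "bad" marker and report whether any trailing junk was present.
 */
function split_garbage_uri($uri, $bad_chars = array('"', '<', '>', '='))
{
    // Browsers normally send quotes and spaces percent-encoded (%22, %20).
    $decoded = rawurldecode($uri);
    $cut     = strlen($decoded);

    // Find the earliest position of any marker.
    foreach ($bad_chars as $bad) {
        $pos = strpos($decoded, $bad);
        if ($pos !== false && $pos < $cut) {
            $cut = $pos;
        }
    }

    return array(
        'clean'    => rtrim(substr($decoded, 0, $cut)),
        'has_junk' => $cut < strlen($decoded),
    );
}

/**
 * Hypothetical helper: turn the cleaned path into words for a search routine,
 * e.g. the file name without its extension, with hyphens as spaces.
 */
function uri_to_search_term($clean_path)
{
    $name = pathinfo($clean_path, PATHINFO_FILENAME);
    return str_replace('-', ' ', $name);
}

// Usage with the junk from the example in the first post (percent-encoded).
$request = '/afiles/images/santa-email.jpg%22%20width=%22100%22%20height=%2250%22';
$result  = split_garbage_uri($request);

if ($result['has_junk']) {
    // A real handler would pass this term to the site's own search routine.
    echo uri_to_search_term($result['clean']), "\n";
}
```

Cutting at the first marker, rather than rejecting the request outright, keeps the valid leading part of the hotlinked URL, which is the part the thread says is correct.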