Regular Expression Help
#1

[eluser]Eric Barnes[/eluser]
Hi,

I am in need of a little help with regular expressions.

What I am trying to do is use a regular expression to take the http_host and return just the domain name and the extension.

For example, take these URLs:
http://www.mysite.co.uk/mydir/test/index.php
http://www.mywwwsite.co.uk/index.php
http://subdomain.testsite.com

I need to just return:
mysite.co.uk
mywwwsite.co.uk
testsite.com

Here is what I am currently trying but it doesn't seem to be working:
Code:
$host = $_SERVER['HTTP_HOST'];
preg_match("/^(http:\/\/)?([^\/]+)/i", $host, $matches);
$host = $matches[2];
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
$server = $matches[0];
echo $server;

As you can probably see, I am not good at all with regular expressions, and this is some code I found online.
#2

[eluser]eggshape[/eluser]
I think this will work for you:

Code:
$pattern = '|^http://[a-z]+\.([a-z]+\..+)/?|';
preg_match($pattern, $_SERVER['HTTP_HOST'], $matches);
// $matches[1] = your desired result ....everything up to the first fwd-slash (if any) and requires at least 1 dot

You don't have to use '/' as a delimiter. I used '|'. In this case, using something else makes it easier to read.

BTW, you can always substr() to remove the 'http://' and then explode() the string using '/' if you have trouble with regexp. You'll then have to search for your string with in_array()...this requires you to know beforehand what to search for.
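
For reference, the substr()/explode() route mentioned above could look roughly like this. It is only a sketch: host_from_url() is a made-up name, and it assumes the input either starts with 'http://' or is already a bare host. It returns the full host name, so the subdomain would still need to be stripped afterwards.

```php
<?php
// Hypothetical helper name; not from the thread.
function host_from_url($url)
{
    // Drop a leading 'http://' (7 characters) if it is there.
    if (strpos($url, 'http://') === 0) {
        $url = substr($url, 7);
    }
    // Everything before the first '/' is the host name.
    $parts = explode('/', $url);
    return $parts[0];
}

echo host_from_url('http://www.mysite.co.uk/mydir/test/index.php'), "\n"; // www.mysite.co.uk
echo host_from_url('subdomain.testsite.com'), "\n";                        // subdomain.testsite.com
```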
#3

[eluser]Eric Barnes[/eluser]
Alex,

Thanks for the tips. Unfortunately I couldn't get it to work; however, after posting I found this function, which seems to be working.
Code:
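function extract_domain($url){
    // strip an optional http:// or https:// and keep everything up to the first slash
    if (!preg_match('@^(?:https?://)?([^/]+)@i', $url, $matches)) {
        return null; // no host found
    }
    $host = $matches[1];

    // get last three segments of host name if country code TLD with sub domain, eg .co.uk
    preg_match('/[^.]+\.[^.]{2,3}\.[^.]{2}$/', $host, $matches);
    if (empty($matches)) {
        // get last two segments of host name if generic TLD
        preg_match('/[^.]+\.[^.]+$/', $host, $matches);
    }
    // single-label hosts such as "localhost" match neither pattern
    return empty($matches) ? null : $matches[0];
}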
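
A possible variation, in case the input can also be https or carry a port: let parse_url() pick out the host instead of the first regex. This is a sketch, not a drop-in replacement. The name extract_domain_parse_url() is made up, and the two segment patterns are the same heuristics as in the function above, so a host like www.abc.de would still come back whole because "abc" looks like a second-level label.

```php
<?php
// Hypothetical variant of extract_domain(); parse_url() replaces the first regex.
function extract_domain_parse_url($url)
{
    // parse_url() only recognises the host when a scheme is present, so add
    // one for bare values such as $_SERVER['HTTP_HOST'].
    if (!preg_match('@^[a-z][a-z0-9+.-]*://@i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // malformed URL or no host component
    }

    // Three segments for hosts like x.co.uk, otherwise the last two segments.
    if (preg_match('/[^.]+\.[^.]{2,3}\.[^.]{2}$/', $host, $m)
        || preg_match('/[^.]+\.[^.]+$/', $host, $m)) {
        return $m[0];
    }
    return null; // single-label host such as "localhost"
}

$examples = array(
    'http://www.mysite.co.uk/mydir/test/index.php',
    'http://www.mywwwsite.co.uk/index.php',
    'http://subdomain.testsite.com',
);
foreach ($examples as $example) {
    echo extract_domain_parse_url($example), "\n";
}
// Prints:
// mysite.co.uk
// mywwwsite.co.uk
// testsite.com
```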
#4

[eluser]eggshape[/eluser]
GET OUT! It didn't work? haha. I'm so rusty. Ok I tested this one using your examples. You might find it simpler:

Code:
$pattern = '|^http://(www\.)?(subdomain\.)?([^/]+)|';
preg_match($pattern, $string, $matches); // $string is one of your examples

// you'll want $matches[3]

or

Code:
$pattern = '|^http://(?:www\.)?(?:subdomain\.)?([^/]+)|';
preg_match($pattern, $string, $matches); // $string is one of your examples

// you'll want $matches[1]
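
One thing to keep in mind with this pattern: it only strips the literal prefixes "www." and "subdomain.", so any other subdomain stays in the result. A quick check against the three examples from the first post, plus one extra URL that is made up for illustration:

```php
<?php
$pattern = '|^http://(?:www\.)?(?:subdomain\.)?([^/]+)|';

$urls = array(
    'http://www.mysite.co.uk/mydir/test/index.php',
    'http://www.mywwwsite.co.uk/index.php',
    'http://subdomain.testsite.com',
    'http://blog.testsite.com/', // hypothetical extra case, not from the thread
);

$results = array();
foreach ($urls as $url) {
    // preg_match() returns 1 on a match; $matches[1] is the only capturing group.
    if (preg_match($pattern, $url, $matches)) {
        $results[$url] = $matches[1];
        echo $matches[1], "\n";
    }
}
// Prints:
// mysite.co.uk
// mywwwsite.co.uk
// testsite.com
// blog.testsite.com
```

The pattern also needs the 'http://' prefix, so a bare $_SERVER['HTTP_HOST'] value would not match as-is.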



