CodeIgniter Forums

Full Version: valid_email little bug

El Forum

[eluser]Mareshal[/eluser]
I have a problem with "valid_email": the function accepts "[email protected]" as a valid email, returning TRUE instead of FALSE.

To solve this problem I've added this code to config.php:

$config['tld'] = array('com', 'net', 'biz'); // etc.

Then, in valid_email, I check whether "comm" is in my array or not.
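
A minimal sketch of that whitelist check, assuming a hypothetical helper named tld_is_allowed() (it is not part of CodeIgniter) and assuming the list from $config['tld'] is passed in as a plain array:

```php
<?php
// Hypothetical helper, not part of CodeIgniter: returns TRUE when the
// text after the last dot of the address is in the allowed list.
function tld_is_allowed($email, $allowed_tlds)
{
    $at = strrpos($email, '@');
    $dot = strrpos($email, '.');
    // need an @ and a dot somewhere after it
    if ($at === FALSE OR $dot === FALSE OR $dot < $at)
    {
        return FALSE;
    }
    $tld = strtolower(substr($email, $dot + 1));
    return in_array($tld, $allowed_tlds, TRUE);
}

$allowed = array('com', 'net', 'biz'); // same list as $config['tld'] above
var_dump(tld_is_allowed('someone@example.com', $allowed));  // bool(true)
var_dump(tld_is_allowed('someone@example.comm', $allowed)); // bool(false)
```

This only compares the TLD against the list; the format of the rest of the address would still be left to valid_email.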

El Forum

[eluser]Dam1an[/eluser]
I'm not really sure this is a bug. I don't think that function was ever meant to check against valid TLDs, only to check that the format is correct, that there are no disallowed chars, etc.
Also, now that you restrict it to a predefined set of TLDs, chances are you've missed some, or you'll need to manually update it every time a new TLD appears.

And why not take it a step further and do a whois lookup to check the domain they entered is valid :P

El Forum

[eluser]Ihab Khattab[/eluser]
I like this approach using checkdnsrr

http://www.sitepoint.com/article/users-e...dress-php/

I hope to see it implemented in the next CI version
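
A rough sketch of what a checkdnsrr-based check could look like. This is not the code from the linked article; email_domain_has_mx() and its $lookup parameter are hypothetical names, and $lookup exists only so the DNS call can be swapped for a stub or a cached lookup:

```php
<?php
// Hypothetical helper: TRUE when the domain part of $email has an MX record.
// $lookup is an assumption added for illustration; it defaults to PHP's
// checkdnsrr(), which performs a live DNS query.
function email_domain_has_mx($email, $lookup = 'checkdnsrr')
{
    $at = strrpos($email, '@');
    // reject addresses with no @ or with nothing after it
    if ($at === FALSE OR $at === strlen($email) - 1)
    {
        return FALSE;
    }
    $domain = substr($email, $at + 1);
    return (bool) call_user_func($lookup, $domain, 'MX');
}

// Usage (needs network access, so the result depends on DNS at run time):
// var_dump(email_domain_has_mx('someone@example.com'));
```

Note that this says nothing about whether the mailbox itself exists, only whether the domain advertises a mail server.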

El Forum

[eluser]sophistry[/eluser]
dns check only tells you if the domain has an MX record - it doesn't check the mailbox part. that's a start. also, a DNS check for every email can be pretty resource intensive and it exposes you to network problems during email acquisition.

i wrote some code that fetches the IANA list of TLDs and sorts it according to popularity. this lets me prepare an optimized set of TLDs for a regex.
Code:
function tlds()
    {
        // fetch the IANA list (requires allow_url_fopen)
        $iana_tlds = file_get_contents('http://data.iana.org/TLD/tlds-alpha-by-domain.txt');
        if ($iana_tlds === FALSE)
        {
            return FALSE;
        }
        // one TLD per line - collect them into an array
        $iana_tlds_array = array();
        foreach (explode("\n", $iana_tlds) as $line)
        {
            $tld = strtoupper(trim($line));
            // skip blank lines, the first line (a "#" comment) and
            // the XN-- TLDs (these are encoded TLDs -
            // strip them to make regex cleaner)
            if ($tld === '' OR $tld[0] === '#' OR strpos($tld, 'XN--') === 0)
            {
                continue;
            }
            $iana_tlds_array[] = $tld;
        }
        // reorder domains for better regex
        // according to this article - the guy searched google
        // to get an ordered list of the most popular TLDs
        // http://www.seobythesea.com/?p=94
        $top_100_tlds = strtoupper('com org edu gov uk net ca de jp fr au us ru ch it nl se no es mil fi cn br be at info pl dk cz cl hu nz il ie za tw kr mx gr ar hk in pt sg tr sk ro tv lv biz ua ee th hr lt is nu vu lu my fm si co ni ph cc li bg ae yu md name pk ve ma pw ws eg mk ir id bz sa ba pe kz tc uz ec by cy st to vn ge tk ms am ac cr');
        // array instead of space separated
        $top_100_tlds_array = explode(' ', $top_100_tlds);
        // remove the TLDs that appear in the top 100
        // (compare whole entries: str_replace() on the string would
        // also cut CO out of COOP)
        $bottom_tlds_array = array_diff($iana_tlds_array, $top_100_tlds_array);
        // put the top 100 at the beginning to optimize regex
        $optimized_iana_tlds = array_merge($top_100_tlds_array, $bottom_tlds_array);
        // the TLDs have to be followed by a slash or an end of string
        // so add that to the regex recipe
        $regex_or_list = '(' . implode('|', $optimized_iana_tlds) . ')($|\/)';
        print_r($regex_or_list);
    }
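
One way the generated alternation might be used, sketched with a short hard-coded stand-in for the full list. ends_with_known_tld() is a hypothetical wrapper, and the leading dot and the case-insensitive modifier are my assumptions about how the pattern would be completed:

```php
<?php
// Short stand-in for the full list that tlds() prints (assumption: the real
// list would be generated once and cached, not hard-coded like this).
$regex_or_list = '(COM|ORG|NET|CO)($|\/)';

// Hypothetical wrapper: a dot, then one of the TLDs, then a slash or the
// end of the string. The i modifier lets the upper-case list match
// lower-case input.
function ends_with_known_tld($string, $regex_or_list)
{
    return preg_match('#\.' . $regex_or_list . '#i', $string) === 1;
}

var_dump(ends_with_known_tld('someone@example.com', $regex_or_list));  // bool(true)
var_dump(ends_with_known_tld('someone@example.comm', $regex_or_list)); // bool(false)
```

Because every alternative must be followed by a slash or the end of the string, having CO before or after COM in the list does not change the result, only how quickly the common cases match.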

El Forum

[eluser]Michael Wales[/eluser]
.comm is a valid TLD, as is .omgicannislettingpeoplepurchasestheirowntldstodowhatevertheywantwithsoon

Not a bug.

El Forum

[eluser]sophistry[/eluser]
not to nitpick over hypotheticals, but it's a slow morning... ;)

actually, .comm probably wouldn't be accepted:
Quote: Strings must not be confusingly similar to an existing top-level domain or a Reserved Name
http://losangeles2007.icann.org/files/lo...3oct07.pdf

also, the string in the example is 72 characters long - upper limit for TLDs is proposed to be 64 chars.

valid_email() is a misnomer. the function should be called "looks_like_it_might_be_an_email()"

cheers.

El Forum

[eluser]Michael Wales[/eluser]
haha, touché

My point was, valid TLDs aren't just limited to what you see on the frontpage of GoDaddy.com. Truly validating an email is hard and resource intensive.

I like sophistry's suggestion: when you see valid_email() think looks_like_it_might_be_an_email()
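
To make the distinction concrete, here is a small sketch. It uses PHP's built-in filter_var() rather than CodeIgniter's own valid_email(), an assumption made so the snippet runs without the framework. A format-only check accepts a well-formed address whatever its TLD:

```php
<?php
// Format-only check: says nothing about whether the TLD or the mailbox exists.
function looks_like_it_might_be_an_email($address)
{
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== FALSE;
}

var_dump(looks_like_it_might_be_an_email('someone@example.comm')); // bool(true)
var_dump(looks_like_it_might_be_an_email('someone@@example.com')); // bool(false)
```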

El Forum

[eluser]Mareshal[/eluser]
I took the major TLDs from Wikipedia, the ones I can accept on my site:
$config['tlds'] = array('com', 'net', 'org', 'biz', 'us', 'info', 'name', 'pro', 'aero', 'asia', 'edu', 'mobi', 'tel');

Checking the MX records of the domain is also a good solution.

El Forum

[eluser]jpi[/eluser]
What about .eu ? :p