UK Phone Number Validation (regexp trickle!)
#1

[eluser]Unknown[/eluser]
I just spent a couple of minutes writing a UK phone number validation callback for CI. I looked around and couldn't find one (there are some for the US dotted about -- good luck US people, those numbers look slippery!), so I adapted some JavaScript from here:

http://www.braemoor.co.uk/software/telnumbers.shtml

I've left the commented-out code in to help anyone who wants to extend the scope of the callback, though it's probably unnecessary because you can get CI to do most of the work for you.

So you'd have in your form controller something like:

Code:
//Set rules that require numeric only, min length of 9 and field as required.
$this->form_validation->set_rules('phone', 'Phone Number', 'required|numeric|min_length[9]|callback_uk_phone_check');



Then the callback function is:

Code:
/**
    * Super quick UK Phone number checker.
    * Adapted from javascript solution found here:
    * http://www.braemoor.co.uk/software/_private/jstelnumbers.js
    **/
    function uk_phone_check($number)
    {
        $errorarr = array(
            //0 => "Valid UK telephone number",
            //1 => "Telephone number not provided",
            //2 => "Please do not include the country code in the telephone number",
            //3 => "UK telephone numbers should contain 10 or 11 digits",
            4 => "The telephone number should start with a 0",
            5 => "The telephone number is either invalid or inappropriate"
        );
        
        //Set the error to false, our final bit of logic will check against this
        $telnum_error = false;
        
        // Don't allow country codes to be included (assumes a leading "+")
        //$leading = '/^(\+)[\s]*(.*)$/';
        
        // Check that the number starts with 0 and has 10 or 11 digits in total
        $first = '/^0[0-9]{9,10}$/';
        
        //Check that the telephone number is valid.
        $valid = '/^(01|02|03|05|070|071|072|073|074|075|07624|077|078|079)[0-9]+$/';
        
        // Array holds the regular expressions for the drama telephone numbers
          // see http://stakeholders.ofcom.org.uk/telecoms/numbering/guidance-tele-no/numbers-for-drama
        $drama = array();

        array_push($drama, '/^(0113|0114|0115|0116|0117|0118|0121|0131|0141|0151|0161)(4960)[0-9]{3}$/',
        '/^02079460[0-9]{3}$/', '/^01914980[0-9]{3}$/','/^02890180[0-9]{3}$/','/^02920180[0-9]{3}$/',
        '/^01632960[0-9]{3}$/','/^07700900[0-9]{3}$/','/^08081570[0-9]{3}$/','/^09098790[0-9]{3}$/',
        '/^03069990[0-9]{3}$/');

        //if (preg_match($leading, $number)) {
            //$telnum_error = 2;
        //} elseif (!preg_match($first, $number)) {
        if (!preg_match($first, $number)) {
            $telnum_error = 4;
        } elseif ( !preg_match($valid, $number) ) {
             $telnum_error = 5;
        } else{        
        
            //Check against the drama numbers, stopping at the first match
            foreach ($drama as $pattern) {
                if ( preg_match($pattern, $number) ) {
                    $telnum_error = 5;
                    break;
                }
            }
        }

        //See if we've got an error
        if($telnum_error){
            //echo "error >>>>>>>>>>>>>>> " . $errorarr[$telnum_error] . " <<<<<<<<<<<<<<<";
            $this->form_validation->set_message('uk_phone_check', 'The %s must be a valid UK telephone number<br/><u>' . htmlspecialchars($number) . '</u> --&gt; ' . $errorarr[$telnum_error]);
            return FALSE;
        }else    {
            return TRUE;
        }
    }

Not at all elegant in that it's just a nested if tree (vomit vomit), but it should be useful to extend existing functions. Feel free to comment on ways to improve it and I'll refactor and repost.
#2

[eluser]g1smd[/eluser]
Only very small parts of the 02 and 05 number ranges are in use, and only a little more of the 03 range. I'd use more specific patterns.
The 071, 072 and 073 ranges are not yet in use in the UK.


Optimising RegEx patterns can have an appreciable impact on performance.

Quote:(01|02|03|05|070|071|072|073|074|075|07624|077|078|079)
Having found the 0 once, why keep on finding it again and again?
Likewise for mobiles, having found the 07 once, why find it again and again?
Try:
Code:
(0([123]|5[056]|7([45789]|624)))
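
Note that this group is narrower than the quoted one, not a drop-in equivalent. A rough sketch for seeing which prefixes it stops accepting; the sample numbers and variable names are made up for illustration, and both patterns are anchored the same way as in the original callback:

Code:
```php
<?php
// Alternation from post #1 and the factored group suggested above,
// wrapped in the same anchors so the two can be compared directly.
$original = '/^(01|02|03|05|070|071|072|073|074|075|07624|077|078|079)[0-9]+$/';
$factored = '/^(0([123]|5[056]|7([45789]|624)))[0-9]+$/';

// Illustrative sample numbers only (a prefix padded out with digits).
$samples = array(
    '01632960123', '02079460123', '05012345678', '05112345678',
    '07012345678', '07112345678', '07412345678', '07624123456',
    '07700900123',
);

$dropped = array(); // accepted by the original, rejected by the factored group
$added   = array(); // accepted by the factored group only

foreach ($samples as $n) {
    $a = preg_match($original, $n) === 1;
    $b = preg_match($factored, $n) === 1;
    if ($a && !$b) {
        $dropped[] = $n;
    } elseif ($b && !$a) {
        $added[] = $n;
    }
}
```

On these samples $dropped should hold the 051, 070 and 071 numbers and $added should stay empty, so it is worth deciding deliberately which ranges you want to reject before swapping the pattern in.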


Likewise,
Quote:(0113|0114|0115|0116|0117|0118|0121|0131|0141|0151|0161)
having found the 011 for 0113, why have to find it again for 0114 onwards?
Having found the 01 for 0121, why have to find it again for 0131 onwards?
Try:
Code:
(01(1[3-8]|[2-6]1))
or
Code:
(01(1[3-8]|[2-69]1))
including 0191.
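
If you want to convince yourself that a factored group matches exactly the codes you meant, one option is to enumerate candidates and compare. A small sketch, assuming only four-digit 01xx codes are of interest (variable names are illustrative):

Code:
```php
<?php
// Area codes from the quoted alternation, plus 0191.
$codes = array(
    '0113', '0114', '0115', '0116', '0117', '0118',
    '0121', '0131', '0141', '0151', '0161', '0191',
);

// The second factored group from above, anchored so it must match the whole code.
$factored = '/^(01(1[3-8]|[2-69]1))$/';

// Collect every code from 0100 to 0199 that the factored group accepts.
$accepted = array();
for ($i = 100; $i <= 199; $i++) {
    $code = '0' . $i;
    if (preg_match($factored, $code) === 1) {
        $accepted[] = $code;
    }
}

// True when the factored group accepts the listed codes and nothing else.
$same = ($accepted === $codes);
```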



Users should be allowed to enter numbers in any format they want to, either in international format with country code or in national format but with an ISO 3166 country identifier for clarification.

The data should then be cleaned:
- remove any access code (+, 00, 011, etc), if present,
- remove the country code, if present, and store it for later use,
- generate the country code from the entered ISO 3166 country code, if present,
- use a default country code for national numbers where no country is specified,
- remove and separately store any extension number details,
- remove all spaces and non-digits from whatever is left,
- remove any leading 0 if present (except for Italy and one or two other countries where it should be left in place),
- check the number length is valid for the initial digit(s), and
- identify the number type from the initial digit(s).
Finally, format the number according to the formatting rules for the identified number type and then display it. International format is always preferred.

Separating the validation routines from the formatting routines makes for code that is much easier to maintain.
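
As a rough illustration of the cleaning steps, here is a minimal sketch for GB numbers only. The function name, the extension markers it recognises and the 9 or 10 digit length check are assumptions for the example. It only treats "+" and "00" as access codes (a leading "011" is ambiguous with UK area codes such as 0113), and it does not handle the ISO 3166 country identifier, per-range length rules or formatting:

Code:
```php
<?php
/**
 * Hypothetical helper: reduce user input to country code, NSN and extension.
 * Returns null when the input does not look like a GB number.
 */
function normalise_gb_number($input)
{
    $input = trim($input);

    // Split off a trailing extension such as "ext 45", "x45" or "#45".
    $ext = '';
    if (preg_match('/\s*(?:ext\.?|x|#)\s*([0-9]+)$/i', $input, $m)) {
        $ext   = $m[1];
        $input = substr($input, 0, -strlen($m[0]));
    }

    // Remove spaces and all other non-digits.
    $digits = preg_replace('/[^0-9]/', '', $input);

    // Detect and remove the access code ("+" or "00" only in this sketch).
    $international = false;
    if (strpos(ltrim($input), '+') === 0) {
        $international = true;
    } elseif (strpos($digits, '00') === 0) {
        $international = true;
        $digits = substr($digits, 2);
    }

    // Remove the country code; anything other than 44 is out of scope here.
    if ($international) {
        if (strpos($digits, '44') !== 0) {
            return null;
        }
        $digits = substr($digits, 2);
    }

    // Remove one leading 0, which also covers the "(0)" in "+44 (0)20 ...".
    if (strpos($digits, '0') === 0) {
        $digits = substr($digits, 1);
    }

    // Simplified length check: assume the NSN has 9 or 10 digits.
    $len = strlen((string) $digits);
    if ($len < 9 || $len > 10) {
        return null;
    }

    // National input falls back to the default country code, 44.
    return array('country' => '44', 'nsn' => $digits, 'ext' => $ext);
}
```

With this sketch, '+44 (0)20 7946 0123' and '020 7946 0123' both reduce to the same NSN, which can then be passed to separate validation and formatting routines.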



I have generated a complete set of selection, validation and formatting RegEx patterns for every UK number range and number type. The set is too long to reproduce here, and multiple copies would be difficult to maintain. It can be seen at:
http://www.aa-asterisk.org.uk/index.php/...ne_Numbers

The first section of the page shows patterns that can be used to check that the number is likely a GB phone number with a variety of dial prefixes, and to extract the NSN part of the number.
The second section of the page shows patterns that can be used to identify whether the number is valid and the number type. The patterns work with the previously extracted NSN part.
The third section shows the various patterns used for formatting each number range.

The code works to the formats listed at:
http://www.aa-asterisk.org.uk/index.php/Number_format
noting the complexities detailed in:
http://www.aa-asterisk.org.uk/index.php/01_numbers
http://www.aa-asterisk.org.uk/index.php/02_numbers
http://www.aa-asterisk.org.uk/index.php/Mixed_areas
and other pages.


I'm also the UK metadata editor for the Google libphonenumber project over at:
http://code.google.com/p/libphonenumber/
The xml metadata file there has number length, validation and formatting information for every country.


Additionally, if you're interested in selecting the right length and initial digits for local numbers, details for every UK area code are here:
http://www.aa-asterisk.org.uk/index.php/...ne_numbers




