Email addresses with a "+" are not parsed correctly
[eluser]sophistry[/eluser]
Ha ha! Wow, thanks for the link to the "impenetrable RFC 2822 email address parser." So that's how you'd do it if you wanted to be completely impractical and gosh-darned complex! ;-)

The set of atext "visible chars" I suggested above could become the CI-approved email address standard, which would cover 99% of the email addresses anyone would ever really use. The gargantuan email address parser is mostly dealing with edge cases and quoted literals (which would be a real bugbear in an autolink function).

Maybe the accepted email chars should go into a config setting, like the permitted URI chars setting? In fact, shouldn't the whole "email detector" regex go into its own function to make it more adaptable/extendable?

BTW, there is another problem with the current email detector: it does detect an address whose domain includes a hostname (e.g., [email protected]), but it puts the hostname in $matches[2] and the domain and TLD together in $matches[3]. With an address that has no hostname, $matches[2] holds the domain alone and $matches[3] holds the TLD alone. I know you can easily explode('@', $matches[0]); to get the data, but the function should standardize what it captures in the sub-patterns so that $matches always contains a standard set of captured data.

Here's some new test code that standardizes $matches:

Code: <?php
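A minimal sketch of what standardized captures could look like. This is not the code from the original post or anything in CodeIgniter. The function name `find_emails()`, the returned array keys, and the simplified character sets are all assumptions made for illustration. The idea is that the sub-patterns always mean the same thing: group 1 is the local part (with "+" allowed), group 2 is everything after the "@", and group 3 is the TLD, whether or not the address has a hostname.

```php
<?php
// Hypothetical helper (not a CodeIgniter function): finds email-like strings
// and returns the same set of captured fields for every match.
function find_emails($str)
{
    // Simplified local-part character set (an assumption, not the full
    // RFC 2822 atext grammar). The "+" is included; "-" sits last in the
    // class so it is treated as a literal.
    $local = '[a-zA-Z0-9_.+-]+';

    // Group 1: local part.
    // Group 2: full domain = one or more dot-terminated labels plus the TLD.
    // Group 3: the TLD alone (last label, letters only).
    $pattern = '/(' . $local . ')@((?:[a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))/';

    $found = array();
    if (preg_match_all($pattern, $str, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $found[] = array(
                'address' => $m[0], // whole match
                'local'   => $m[1], // before the "@"
                'domain'  => $m[2], // after the "@", hostname included if present
                'tld'     => $m[3], // final label only
            );
        }
    }
    return $found;
}

$found = find_emails('Mail joe+tag@example.com or ann@mail.example.org today.');
// $found[0]: local "joe+tag", domain "example.com",      tld "com"
// $found[1]: local "ann",     domain "mail.example.org", tld "org"
```

Because the hostname labels are matched by a repeated non-capturing group, an extra label changes only the contents of the 'domain' field, never which field the TLD lands in.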