PHP: grab http or www links from text and return the full URL

#1
[eluser]new_igniter[/eluser]
I feel pretty dumb on this one. Can anyone help me with a PHP function to grab a URL out of a text blob?
Example: "I am going to be attending the seminar at www.functionplace.com?id=1324354 on wed"
The function would grab the FULL URL and return "www.functionplace.com?id=1324354".

Thanks so much!

#2
[eluser]Randy Casburn[/eluser]
Interesting. Is this free text uncontrolled? My point is this: you need to decide which rules you'll use as delimiters for your search, whether that's via regex, strstr(), or whatever.

For example I can easily grab your example string:

www.functionplace.com?id=1324354

but miss your requirement totally for something like:

http://functionplace.com?id=1324354&rabbit=jumpedOutOfAHat

Just a thought,

Randy

#3
[eluser]crumpet[/eluser]
Maybe start by searching the blob for a string with no spaces that starts with www. or http://. For more aggressive recognition you could search for .com, .org, or .net anywhere in the string. I would probably do more validation on the results you get from those searches; it depends on what you are going to do with the URLs once you find them.
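A minimal sketch of the approach described above, assuming the text can be split on whitespace and each piece checked on its own. The function name `find_url_candidates` and the `$aggressive` flag are made up for illustration, and the results are only candidates that would still need validating.

```php
<?php
// Hypothetical helper: split on whitespace and keep tokens that look like URLs.
function find_url_candidates(string $text, bool $aggressive = false): array
{
    $candidates = [];
    // Split on any run of whitespace, dropping empty pieces.
    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $token) {
        // Basic rule: the token starts with www., http:// or https://.
        $startsLikeUrl = stripos($token, 'www.') === 0
            || stripos($token, 'http://') === 0
            || stripos($token, 'https://') === 0;
        // Aggressive rule (assumption): .com, .org or .net anywhere in the token.
        $hasKnownTld = $aggressive
            && preg_match('/\.(?:com|org|net)\b/i', $token) === 1;
        if ($startsLikeUrl || $hasKnownTld) {
            $candidates[] = $token;
        }
    }
    return $candidates;
}

$found = find_url_candidates('I am going to be attending the seminar at www.functionplace.com?id=1324354 on wed');
print_r($found);
```

The aggressive rule will also pick up things like e-mail addresses, which is one reason the extra validation matters.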

#4
[eluser]new_igniter[/eluser]
Do you have a regex on hand that you have used?

#5
[eluser]new_igniter[/eluser]
Just something that starts when it sees http or www and ends when it gets to a space.
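One way that rule could be written, as a sketch rather than a tested production pattern: start at http://, https:// or www. and take everything up to the next whitespace. The function name `grab_first_url` is hypothetical.

```php
<?php
// Hypothetical helper: return the first URL-like run, or null if there is none.
function grab_first_url(string $text): ?string
{
    // Start at http://, https:// or www. and consume up to the next whitespace.
    $pattern = '~\b(?:https?://|www\.)\S+~i';
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

echo grab_first_url('I am going to be attending the seminar at www.functionplace.com?id=1324354 on wed'), "\n";
```

Using `~` as the pattern delimiter avoids having to escape the slashes in `http://`.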

#6
[eluser]Randy Casburn[/eluser]
Here you go..

Code:
([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?

This is from this web site: http://weblogs.asp.net/farazshahkhan/arc...-link.aspx

Hope it works for you,

Randy

#7
[eluser]new_igniter[/eluser]
Would you mind showing me how I would call that? I am very new to this, and I appreciate your time. Do I use preg_match or preg_match_all?

#8
[eluser]Randy Casburn[/eluser]
How about this, new_igniter: just like the rest of us, go toddle off and try one, the other, or both.

Then come back and tell us how things worked out. If it didn't work, tell us what went wrong. We'll be in a much better position to assist then.

After 67 posts you should kinda know the best way to get help here is to help yourself first.

Randy

#9
[eluser]new_igniter[/eluser]
Thanks! good point

ok, here is my code

Code:
$rawText = 'this is text http://www.something.com/ag1287 it';

preg_match_all("/([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?/", $rawText,$matches,PREG_PATTERN_ORDER);

print_r($matches);

this gives me something like:
Code:
Array
(
    [0] => Array
        (
            [0] => this
            [1] => is
            [2] => text
            [3] => http://www.something.com/ag1287
            [4] => it
        )

and so on...

What I would really love is something that returns just http://www.something.com/ag1287
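A likely reason every word comes back: the pattern above opens with a bracketed character class, so a run of ordinary word characters is enough to satisfy it. A sketch of one alternative, assuming every URL of interest starts with http://, https:// or www. (that prefix rule is an assumption, not part of the original pattern):

```php
<?php
$rawText = 'this is text http://www.something.com/ag1287 it';

// Require an http://, https:// or www. prefix so ordinary words cannot match.
preg_match_all('~\b(?:https?://|www\.)\S+~i', $rawText, $matches);

// $matches[0] holds the full matches; this pattern has no capture groups.
$urls = $matches[0];
print_r($urls);
```

preg_match_all collects every match in the text, while preg_match stops after the first one.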

#10
[eluser]Randy Casburn[/eluser]
Right....this was my earlier point to you...

if you take off the http:// from your string it will return only:

www.something.com/ag1287

if you put www.something.com/ag1287?id=214365 it will return

www.something.com/ag1287?id=214365

So my initial point stands. The delimiters in free text are important and will determine what you catch and what you don't.
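To illustrate the delimiter point, here is a sketch of one possible rule: stop at whitespace, then strip punctuation that more likely ends the sentence than the URL. The function name `grab_urls_trimmed` and the exact list of trimmed characters are assumptions; a different text source may need different rules.

```php
<?php
// Hypothetical helper: grab URL-like runs, then trim sentence punctuation from the end.
function grab_urls_trimmed(string $text): array
{
    preg_match_all('~\b(?:https?://|www\.)\S+~i', $text, $matches);
    // Assumption: these trailing characters belong to the sentence, not the URL.
    return array_map(function ($url) {
        return rtrim($url, '.,;:!?)');
    }, $matches[0]);
}

print_r(grab_urls_trimmed('See www.something.com/ag1287?id=214365, or (http://functionplace.com).'));
```

The trade-off is that a URL which really does end in one of those characters gets shortened.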

Randy

