regular expression help

I have a ton of static link pages I want to put in a database, and I'm trying to figure out regular expressions to do this.

Each link has 3 parts: the title of the link, the URL, and the description.

It's formatted in the page like this for each link:

<div align="Left"> <b> <font face="Verdana" size="3"> <a href="http://www.air-quality-eng.com" target="new"> Air
Cleaners-Air Quality Engineering </a> </font> </b> <font face="Verdana" size="1">
<p style="margin-left: 0; margin-top: 0; margin-bottom: 10"> <font color="#000000" face="Verdana" size="2"> Manufactures
and distributes air cleaners and filtration systems for commercial, home and
industrial use.</font> </p>

So I suppose I need to use preg_match_all, but I'm having no luck.

Can someone who is a wizard give me a hand? It would save me some time I don't really have.

How about this? It captures the URL from each href attribute:
preg_match_all("/<a href=\"(.*?)\"/", $html, $matches);

To get the link, title and description, try something like this:
$pattern = "/<a href=\"(.*?)\".*?>(.*?)</a>.*?<p.*?>(.*?)</font>/i";
$html = str_replace("\r\n","", $html);
preg_match_all($pattern, $html, $matches);
I'm a bit rusty, so I couldn't remember how to make it multi-line (since . won't match line breaks), so I just removed all the line endings and made all the expressions non-greedy.

It should work, but I have only tested it on text in the e editor.
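For what it's worth, PCRE has an s modifier (PCRE_DOTALL) that makes . match line breaks too, so stripping the line endings may not be needed. A minimal sketch of the difference, using a made-up two-line snippet rather than the real pages:

```php
<?php
// Hypothetical sample: the text we want spans a line break.
$html = "<p>first line\nsecond line</p>";

// Without the s modifier, "." stops at the newline, so nothing matches.
$withoutS = preg_match('!<p>(.*?)</p>!', $html, $m1);

// With the s modifier (PCRE_DOTALL), "." also matches newlines.
$withS = preg_match('!<p>(.*?)</p>!s', $html, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```

One upside of this approach is that it also covers files that use plain "\n" line endings, which str_replace("\r\n", ...) would leave in place.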

Thanks to both of you.


I'm getting this error from your code.

Warning: preg_match_all(): Unknown modifier 'a'

The pattern has to be changed from this:
$pattern = "/<a href=\"(.*?)\".*?>(.*?)</a>.*?<p.*?>(.*?)</font>/i";
to this:
$pattern = "!<a href=\"(.*?)\".*?>(.*?)</a>.*?<p.*?>(.*?)</font>!i";

I replaced the pattern delimiter, the forward slash, with !. The delimiter can be almost any punctuation character, but the forward slash was confusing the PCRE parser into thinking the pattern was over when it hit the </a> tag embedded in there, so it read the "a" that follows as a modifier.
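To illustrate the delimiter point, here is a small sketch of two ways around the error: escaping the slash inside the pattern, or picking a delimiter that doesn't appear in it. The $html string is just the anchor tag from the first post, joined onto one line for the example.

```php
<?php
// The anchor tag from the sample markup, on a single line for simplicity.
$html = '<a href="http://www.air-quality-eng.com" target="new"> Air Cleaners-Air Quality Engineering </a>';

// Option 1: keep "/" as the delimiter and escape the slash inside </a>.
$escaped = '/<a href="(.*?)".*?>(.*?)<\/a>/i';

// Option 2: use a delimiter that does not occur in the pattern.
$alternate = '!<a href="(.*?)".*?>(.*?)</a>!i';

preg_match($escaped, $html, $a);
preg_match($alternate, $html, $b);

// Both patterns capture the same URL and title.
echo $a[1], "\n";        // http://www.air-quality-eng.com
echo trim($b[2]), "\n";  // Air Cleaners-Air Quality Engineering
```

Single-quoting the pattern also avoids having to backslash every double quote in the HTML attributes.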


Well, they both only get the links and not the description. Plus, you have to remove the </font> tag or it doesn't grab anything.
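One possible reason the description goes missing is that the title and description wrap across lines in the sample, and the description sits inside a second font tag after the p tag. Below is a rough sketch that assumes every link on the page follows exactly the layout from the first post; the $links array and its keys are names made up for the example. It uses the s modifier so . can cross line breaks, and steps over the inner font tag before capturing.

```php
<?php
// Sample markup copied from the first post, line breaks included.
$html = <<<'HTML'
<div align="Left"> <b> <font face="Verdana" size="3"> <a href="http://www.air-quality-eng.com" target="new"> Air
Cleaners-Air Quality Engineering </a> </font> </b> <font face="Verdana" size="1">
<p style="margin-left: 0; margin-top: 0; margin-bottom: 10"> <font color="#000000" face="Verdana" size="2"> Manufactures
and distributes air cleaners and filtration systems for commercial, home and
industrial use.</font> </p>
HTML;

// Group 1: URL, group 2: title, group 3: description.
// i = case-insensitive, s = "." also matches newlines.
$pattern = '!<a href="(.*?)".*?>(.*?)</a>.*?<p.*?>\s*<font.*?>(.*?)</font>!is';
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$links = array();
foreach ($matches as $m) {
    $links[] = array(
        'url'         => $m[1],
        // Collapse the wrapped whitespace into single spaces.
        'title'       => trim(preg_replace('/\s+/', ' ', $m[2])),
        'description' => trim(preg_replace('/\s+/', ' ', $m[3])),
    );
}

print_r($links);
```

PREG_SET_ORDER groups the results per link, which is handy for building one database row at a time. If some links have no description, a pattern like this could pair a title with the next link's description, so it is worth spot-checking the output.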

I think I have a better understanding though, and will work with what you guys have helped me with.

I appreciate it and will keep you posted.

This forum rocks for help.

Looking for help to get the link description that sits between font tags. There are many links per page, so I am doing this. It obviously doesn't work. Can someone please help? All I am getting is the very last instance found and not all of them.

// HTML to search for:
// <font color="#000000" face="Verdana" size="2"> link descriptions go here</font>
preg_match_all("|size=\"2\">(.*?)</font>|", $var, $matches);

$matches = $matches[0];
$list = array();

foreach ($matches as $var)

There are other font tags in the pages, so I need to search for just this specific one.
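A possible way to target only that font tag is to match its full attribute list and read the captures from $matches[1] ($matches[0] holds the whole match, tags included). This is only a sketch: $page is a made-up string standing in for a real page, and it assumes the attributes always appear in this exact order and spelling.

```php
<?php
// Hypothetical page fragment with several font tags; only two are descriptions.
$page = '<font face="Verdana" size="3">Title one</font>'
      . '<font color="#000000" face="Verdana" size="2"> First description</font>'
      . '<font face="Verdana" size="1">other</font>'
      . "<font color=\"#000000\" face=\"Verdana\" size=\"2\"> Second\ndescription</font>";

// Match the complete opening tag so other font tags are skipped.
// s lets "." cross line breaks inside a description.
$pattern = '|<font color="#000000" face="Verdana" size="2">(.*?)</font>|is';
preg_match_all($pattern, $page, $matches);

$descriptions = array();
foreach ($matches[1] as $text) {
    // Append each capture; reusing one variable would keep only the last one.
    $descriptions[] = trim(preg_replace('/\s+/', ' ', $text));
}

print_r($descriptions);
```

Appending with $descriptions[] inside the loop is what keeps every instance rather than just the final one.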

I'm having so many issues wrapping my head around this stuff.
