Regex to remove html tags in a block
CodeIgniter Forums (https://forum.codeigniter.com), Archived Development & Programming, thread /showthread.php?tid=27485

Regex to remove html tags in a block - El Forum - 02-12-2010
bugboy:
Hi all,

I'm trying to remove blocks of HTML from strings of text. Once they are removed I place a marker where they were, then run some code before adding them back in.

This regex does it for most HTML tags apart from image tags (<img />):

Code:
"|<[^>]+>(.*)</[^>]+>|U"

I also want to remove whole blocks of HTML from a string. For example, say I have a string that looks like this. It contains links, images and YouTube.

Code:
Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus. This is <a href="http://example.com/" title="Optional Title Here">title for this link reference-style link. This is a blockquote with two paragraphs. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aliquam hendrerit mi posuere lectus. This is <a href="http://example.com/" title="Title">an example</a> inline link. Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus.

I would like it to be output like this:

Code:
Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus. This is {*0} title for this link reference-style link. This is a blockquote with two paragraphs. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aliquam hendrerit mi posuere lectus. This is {*1} inline link. Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus.

I can't seem to figure it out. I get so far with it and then it breaks. Any help would be greatly appreciated.

Thanks for your time.

Regex to remove html tags in a block - El Forum - 02-12-2010
Sbioko:
Try using this:

Code:
/\<(.*)\>(.*)\<(.*)\/\>/u

Regex to remove html tags in a block - El Forum - 02-12-2010
bugboy:
Cheers for that. That didn't seem to work. However, this is doing the trick. Can this be optimised?

Code:
$regex = "(<[^>]+>.+?</[^>]+>|<[^>]+/>)si";

Regex to remove html tags in a block - El Forum - 02-12-2010
Sbioko:
What do you mean?
What do you need to optimize here? I'm not actually a regex master, but I know something about it, and I think that s and i should not be here.

Code:
/<[^>]+>.+?<\/[^>]+>|<[^>]+\/>/u

Regex to remove html tags in a block - El Forum - 02-12-2010
bugboy:
Ahh, you're right about the (i). I added the s so that the string is treated as a single line, newlines included, and it does the trick on the YouTube code.

I say optimised as someone may have a more efficient way of doing it that's less CPU heavy. Not to worry though, for the time being it's working.

Regex to remove html tags in a block - El Forum - 02-12-2010
Phil Sturgeon:
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Quote: 2587 votes

Regex to remove html tags in a block - El Forum - 02-12-2010
Sbioko:
I don't know if this is for you, but try Googling "htmlSQL".

Regex to remove html tags in a block - El Forum - 02-12-2010
bugboy:
Well, I'm not parsing HTML as such, I'm just removing tags. That code works, so I'm happy.

Regex to remove html tags in a block - El Forum - 02-12-2010
Sbioko:
What code? :-) htmlSQL? If you just want to remove tags and that's all, just use:

Code:
strip_tags($html);

Regex to remove html tags in a block - El Forum - 02-12-2010
bugboy:
Yeah, I needed to do more than that. I needed to get the HTML tags and store them. Otherwise strip_tags would have been the way to go.
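
For anyone landing on this thread later, here is a minimal sketch of the "swap tags for markers, process, then put them back" idea bugboy describes. It assumes PHP 7.1 or later and uses preg_replace_callback with bugboy's pattern, rewritten with "~" as the delimiter and without the i modifier. The function names extract_html_blocks and restore_html_blocks are made up for illustration and do not come from the thread or from CodeIgniter. The pattern keeps the limits of the original: it does not understand nesting, and a self-closing tag that appears before a later closing tag can be swallowed together with the text in between.

```php
<?php
// Hypothetical helpers, not part of CodeIgniter or of the thread.

/**
 * Replace each matched HTML chunk with a {*n} marker and collect the chunks.
 * Returns [text with markers, list of stored chunks].
 */
function extract_html_blocks(string $text): array
{
    $stored = [];
    // Either an opening tag, some content (lazy) and a closing tag,
    // or a single self-closing tag. "s" lets "." match newlines.
    $pattern = '~<[^>]+>.+?</[^>]+>|<[^>]+/>~s';

    $marked = preg_replace_callback(
        $pattern,
        function (array $m) use (&$stored): string {
            $stored[] = $m[0];                      // keep the original HTML
            return '{*' . (count($stored) - 1) . '}'; // marker carries its index
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    if ($marked === null) {
        return [$text, []];
    }

    return [$marked, $stored];
}

/** Put the stored chunks back in place of their {*n} markers. */
function restore_html_blocks(string $text, array $stored): string
{
    $restored = preg_replace_callback(
        '~\{\*(\d+)\}~',
        function (array $m) use ($stored): string {
            // Markers with no stored chunk are left untouched.
            return $stored[(int) $m[1]] ?? $m[0];
        },
        $text
    );

    return $restored ?? $text;
}

$input = 'This is <a href="http://example.com/" title="Title">an example</a>'
       . ' inline link with <img src="a.png" />.';

[$marked, $stored] = extract_html_blocks($input);
// ... run whatever processing is needed on $marked here ...
$output = restore_html_blocks($marked, $stored);
```

With this input, $marked is 'This is {*0} inline link with {*1}.' and $output is identical to $input. Using the callback keeps the extraction to a single pass over the string, and storing the chunks by index means the restore step needs no second copy of the HTML pattern.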