Shorten HTML content?
#1

[eluser]dmorin[/eluser]
I have a site where each record has an HTML description. Currently, when I want a shorter version, such as the first 45 words, I use a library to convert it to something markdown-like, then shorten it. This has a lot of disadvantages though.

Does anyone know of a library that will parse the HTML and generate a shortened version? The goal is to get a shorter version that can be displayed as a summary while maintaining formatting (inline styles). It's really only going to be paragraph and span tags, as it's being generated by a rich-text editor. Any thoughts would be appreciated.
#2

[eluser]bretticus[/eluser]
You could use something like the PHP Selector library to get the data you want and generate your own shortened version.

EDIT: I was actually thinking of the phpQuery library when I went looking, but the former is cool too.
#3

[eluser]dmorin[/eluser]
Yeah, that's similar to how I was planning on doing it manually. Loop over the top level elements and quit when I had enough content. But before I did that, I wanted to make sure someone else hadn't written something already since there are a lot of special situations that need to be accounted for, like stopping in the middle of an ordered list.
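
For illustration, here is a minimal sketch of that manual approach: walk the markup, copy it through until a word budget is used up, then close whatever tags are still open so a cut in the middle of a list stays well-formed. It is written in Python with the standard `html.parser` module only to keep it short. The names `WordTruncator`, `truncate_html`, and `VOID_TAGS` are made up for this example, and it assumes reasonably well-formed editor output.

```python
from html import escape
from html.parser import HTMLParser

# Tags that never take a closing tag (a small, hypothetical subset).
VOID_TAGS = {"br", "hr", "img"}


class WordTruncator(HTMLParser):
    """Copies HTML through until max_words words of text have been emitted."""

    def __init__(self, max_words):
        super().__init__(convert_charrefs=True)
        self.remaining = max_words
        self.out = []        # output fragments, joined at the end
        self.open_tags = []  # stack of tags that still need a closing tag
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        # Rebuild the start tag, keeping attributes such as inline styles.
        attr_text = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{attr_text}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close everything back to (and including) the matching open tag.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        words = data.split()
        if len(words) >= self.remaining:
            # Budget used up: keep only the words that fit, then stop.
            lead = " " if data[:1].isspace() else ""
            data = lead + " ".join(words[: self.remaining])
            self.done = True
        self.remaining -= len(words)
        self.out.append(escape(data, quote=False))


def truncate_html(html, max_words):
    parser = WordTruncator(max_words)
    parser.feed(html)
    parser.close()
    # Close any tags left open at the cut point, innermost first.
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


print(truncate_html("<ol><li>one two</li><li>three four five</li></ol>", 3))
```

The open-tag stack is what deals with the ordered-list case: when the budget runs out inside an `<li>`, both the `<li>` and the surrounding `<ol>` are closed before the result is returned.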
#4

[eluser]bretticus[/eluser]
[quote author="dmorin" date="1254005363"]Yeah, that's similar to how I was planning on doing it manually. Loop over the top level elements and quit when I had enough content. But before I did that, I wanted to make sure someone else hadn't written something already since there are a lot of special situations that need to be accounted for, like stopping in the middle of an ordered list.[/quote]

Maybe instead of stopping at x number of chars, your logic could look ahead and say, "this will be too many characters, so stop after the next block element." Of course, this would require that you parse the container block element. I know this is just theory, but I really doubt someone has developed a library to do exactly this (they probably just use strip_tags and get plain text).
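
A rough sketch of that idea, in Python with the standard `html.parser` module purely for illustration: count text characters while copying the markup through, and once the budget has been reached, stop at the end of the current top-level block rather than mid-element. `BlockSummarizer`, `summarize_blocks`, and `VOID_TAGS` are hypothetical names for this example, and it assumes balanced, well-formed markup like a rich-text editor would produce.

```python
from html import escape
from html.parser import HTMLParser

# Tags that never take a closing tag (a small, hypothetical subset).
VOID_TAGS = {"br", "hr", "img"}


class BlockSummarizer(HTMLParser):
    """Keeps whole top-level blocks until max_chars of text has been seen."""

    def __init__(self, max_chars):
        super().__init__(convert_charrefs=True)
        self.max_chars = max_chars
        self.seen_chars = 0  # text characters copied so far
        self.depth = 0       # nesting depth; 0 means between top-level blocks
        self.out = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        # Copy the start tag exactly as written, attributes included.
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS or self.depth == 0:
            return
        self.out.append(f"</{tag}>")
        self.depth -= 1
        # Only stop on a top-level block boundary, never inside a block.
        if self.depth == 0 and self.seen_chars >= self.max_chars:
            self.done = True

    def handle_data(self, data):
        if self.done:
            return
        self.seen_chars += len(data.strip())
        self.out.append(escape(data, quote=False))


def summarize_blocks(html, max_chars):
    parser = BlockSummarizer(max_chars)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = "<p>First paragraph.</p><ol><li>one</li><li>two</li></ol><p>Last.</p>"
print(summarize_blocks(sample, 20))
```

With a budget of 20 characters, the first paragraph alone is not enough, so the whole list is kept and the summary ends after `</ol>`. The trade-off is that the result can overshoot the budget by up to one block.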

Good luck!



