I could use help with RSS/XML, character sets/encodings, and how to deal with copy/paste from Word. Not CI-related.
#1

[eluser]Kinsbane[/eluser]
So, I've had this problem for quite a while now, and after extensively searching Google I haven't found anyone else with this problem who has posted a solution.

I'm trying to make a valid RSS feed for my company's different types of press releases. When I look at the raw RSS feed in Firefox, the press releases don't have any line breaks, unlike how they appear on the normal webpage.

I also ran the RSS feed through the validator, and it reports numerous errors, most of which pertain to illegal characters or entities, like this:
Quote: 'utf8' codec can't decode byte 0x84 in position 25415: unexpected code byte (maybe a high-bit character?)
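
For what it's worth, the wording of that message matches what Python's UTF-8 decoder reports for a byte that cannot start a UTF-8 sequence. Byte 0x84 is the low double quotation mark in Windows-1252, the code page that text pasted from Word often arrives in, so one plausible reading is that the stored text is Windows-1252 rather than UTF-8. Below is a minimal sketch of a fallback decode under that assumption. It is in Python purely for illustration, and the function names `decode_feed_bytes` and `to_xml_text` are hypothetical, not part of any library:

```python
from xml.sax.saxutils import escape


def decode_feed_bytes(raw: bytes) -> str:
    """Decode stored text as UTF-8, falling back to Windows-1252.

    Assumption: anything that is not valid UTF-8 was pasted from Word
    and saved as Windows-1252 (cp1252). That is a guess about the data,
    not something the validator error proves.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 leaves a few bytes (e.g. 0x81) undefined; errors="replace"
        # turns those into U+FFFD instead of raising.
        return raw.decode("cp1252", errors="replace")


def to_xml_text(raw: bytes) -> str:
    """Return text that is safe to place inside an RSS element.

    escape() converts &, < and > to entities; the result should then be
    written out as UTF-8 to match an encoding="utf-8" XML declaration.
    """
    return escape(decode_feed_bytes(raw))
```

Calling `to_xml_text(row_bytes).encode("utf-8")` on each stored press release before writing it into the feed would be one way to apply this. The same idea carries over to other languages: try UTF-8 first, fall back to Windows-1252, escape the markup characters, and emit the feed as UTF-8.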

What functions are available to me to fix this? Keep in mind, these PRs are copy/pasted directly from Word files into the webpage form and then saved to the database. I have asked and asked and asked and asked that our PR firm NOT do this when posting PRs to the website, but my requests get ignored - I need to be able to do this automatically. What encoding should the database table fields use so that character encoding stays consistent at every level?

Thus far I have not been able to find anything on the web that explains how to deal with text copy/pasted from Word, or how to go about making sure my feed validates. It's as if everyone's got inside knowledge of this except for me, and I honestly don't know where to begin looking for answers.

What kind of solutions has everyone else developed? How have you handled character sets and encodings? Do you use UTF-8? ISO-8859-1? Thanks in advance for any advice.
