Character codes
#1

[eluser]Unknown[/eluser]
Hello. I posted this question in the thread for the RSSParser library, but it hasn't gotten any responses, so I'll try it here.

The RSS XML that I am reading contains ampersand codes (numeric character references), like & #8217; (without the space) for an apostrophe. When I check the data read by the RSSParser, though, that code has been converted to a string of other characters, like: ’. I don't know if this is a common problem or not. The RSSParser uses PHP's SimpleXMLElement object to parse the XML.

I've tried running the parsed data through htmlspecialchars() and different CI functions, like $this->db->escape(). Unfortunately, the screwed-up characters still persist throughout my application.

I know that this is probably something I should already know, but I’ve been messing with this for days now and it’s driving me nuts. Any advice on how to deal with this?
#2

[eluser]Derek Allard[/eluser]
Firstly, ensure that you're outputting to UTF-8. Just a thought, but does htmlspecialchars_decode help in this case?
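For anyone hitting the same thing, here is a rough sketch of why the output encoding matters. SimpleXMLElement hands back UTF-8 strings, so the numeric reference for the right single quote becomes the three bytes E2 80 99; if the page is then read as Windows-1252 or Latin-1, those bytes show up as ’. The one-item feed below is made up for illustration (the real feed is not shown in this thread), and whether the header() call fits depends on where your application sends its output.

```php
<?php
// Hypothetical one-item feed; the actual feed from the question is not shown in the thread.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<rss><channel><item><title>It&#8217;s here</title></item></channel></rss>';

$feed  = new SimpleXMLElement($xml);
$title = (string) $feed->channel->item->title;

// SimpleXML returns UTF-8, so the reference has been decoded
// into the three-byte sequence E2 80 99 (U+2019).
$hasCurlyQuote = strpos($title, "\xE2\x80\x99") !== false;

// Declare the page as UTF-8 before any output, so the browser
// reads those three bytes as a single character.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Escape for HTML while telling PHP the string is UTF-8.
echo htmlspecialchars($title, ENT_QUOTES, 'UTF-8'), "\n";
```

If the raw bytes are correct but the page still shows ’, the charset declared by the page (HTTP header or meta tag) is the first thing to check.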
#3

[eluser]Unknown[/eluser]
I was able to solve my issue by using htmlentities(), which encodes characters that htmlspecialchars() doesn't, like the left and right single quote, etc.
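A minimal sketch of the difference, assuming the string coming out of the parser is UTF-8 (which is what SimpleXMLElement returns). The sample text is made up, and passing 'UTF-8' explicitly is my assumption about what makes this work reliably: PHP versions before 5.4 defaulted to ISO-8859-1 for these functions, which would treat each byte of the quote as a separate character.

```php
<?php
// Made-up sample: "It’s here" with a UTF-8 right single quote (bytes E2 80 99).
$text = "It\xE2\x80\x99s here";

// htmlspecialchars() only converts &, <, > and the ASCII quotes,
// so the curly quote passes through as raw bytes.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() converts every character that has a named HTML entity,
// so the curly quote becomes a plain-ASCII entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $entities, "\n"; // It&rsquo;s here
```

Because the result is plain ASCII, it displays correctly whatever charset the page declares, which is probably why this worked where htmlspecialchars() did not.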



