Wikipedia Parsing Library
[eluser]jimtomas[/eluser]
I'm creating a small library to use on my site. I'd like to import a small bit of information (with proper attribution) from Wikipedia based on the US state the visitor is looking to get a job in. I've cobbled together parsers from different sources to make a decent library, but I'm still having trouble filtering out odd bits of formatting from Wikipedia. Can anyone recommend a good open source parser for Wikipedia or point me in the right direction? Thanks, Jim
[eluser]jimtomas[/eluser]
To answer my own question a little: I dug into the API and found a way to have Wikipedia parse its own pages. It's a bit bizarre and incredibly slow, but combined with caching, this could be my solution until a better (faster) parser is discovered. It goes something like this: http://en.wikipedia.org/w/api.php?action...APAGE&text={{: WIKIPEDIAPAGE}}&format=xml&prop=text For example: http://en.wikipedia.org/w/api.php?action...&prop=text I'll work this out and hopefully contribute some code for others to use.
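For anyone following along, here is a rough sketch of how the self-parse call plus a file cache might fit together. It is not the helper from this thread. The function names (`wiki_parse_url`, `wiki_extract_text`, `wiki_get_cached`), the cache layout, and the User-Agent string are all made up for illustration, and it assumes the elided part of the URL above is `action=parse` with a `title` parameter, which is a guess. It also assumes the XML response wraps the rendered HTML in `<api><parse><text>`.

```php
<?php
// Hypothetical helper sketch; names and cache layout are illustrative only.

// Build the API URL that asks Wikipedia to render a page by transcluding it.
function wiki_parse_url($page)
{
    // Assumption: the truncated URL in the post used action=parse and title.
    $params = array(
        'action' => 'parse',
        'title'  => $page,
        'text'   => '{{:' . $page . '}}',
        'format' => 'xml',
        'prop'   => 'text',
    );
    return 'http://en.wikipedia.org/w/api.php?' . http_build_query($params, '', '&');
}

// Pull the rendered HTML out of the XML response and reduce it to plain text.
// Returns false if the response does not look like the assumed structure.
function wiki_extract_text($xml_string)
{
    $xml = @simplexml_load_string($xml_string);
    if ($xml === false || !isset($xml->parse->text)) {
        return false;
    }
    $html = (string) $xml->parse->text;
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Return cached text if it is fresh enough; otherwise fetch, parse, and cache.
function wiki_get_cached($page, $cache_dir, $ttl = 86400)
{
    $file = rtrim($cache_dir, '/') . '/' . md5($page) . '.txt';
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }
    // Needs allow_url_fopen; the User-Agent value here is a placeholder.
    $context = stream_context_create(array(
        'http' => array('user_agent' => 'ExampleJobSite/0.1 (placeholder contact)'),
    ));
    $response = @file_get_contents(wiki_parse_url($page), false, $context);
    $text = ($response === false) ? false : wiki_extract_text($response);
    if ($text !== false) {
        file_put_contents($file, $text);
    }
    return $text;
}
```

Usage would be something like `echo wiki_get_cached('Texas', '/path/to/cache');`. Since the API call is slow, the one-day default lifetime keeps repeat visitors from hitting Wikipedia at all; `strip_tags` is a blunt tool, so leftover bits such as reference markers may still need extra filtering.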
[eluser]jimtomas[/eluser]
OK, here is the helper I created. Let me know if you have any questions, and feel free to add to it and make it better if you can. Thanks! Code: <?PHP