CodeIgniter Forums
Text extraction from docx, doc, excel, ppt, pdf, etc formats


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-09-2012

[eluser]ethereal1m[/eluser]
Dear all,
Is there any library that supports text extraction from docx, doc, Excel, PDF, and similar formats, the way Apache POI does in Java?

Or should I port the Apache POI classes to CodeIgniter code?

best regards,
ethereal1m


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-10-2012

[eluser]weboap[/eluser]
I've never tried this, but take a look:
http://davidwalsh.name/read-pdf-doc-file-php


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-10-2012

[eluser]ethereal1m[/eluser]
@weboap,
Unfortunately, the app doesn't read the docx format.


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-10-2012

[eluser]CroNiX[/eluser]
There's a docx library for Zend Framework, which you can use in CI.


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-10-2012

[eluser]Samus[/eluser]
[quote author="ethereal1m" date="1336707421"]@weboap,
unfortunately the app doesn't read docx format....[/quote]
It's not a CodeIgniter limitation. PHP can't parse formats such as .docx by default, so you will always need a third-party library or something similar.
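
For docx specifically, a rough workaround may be possible without a full library: a .docx file is a ZIP archive, and the main body text normally sits in word/document.xml. The sketch below is only an illustration under that assumption. It needs PHP's zip extension (ZipArchive) to be enabled, the function names docx_xml_to_text() and extract_docx_text() are made up for this example, and it ignores headers, footers, footnotes, and formatting. It does nothing for doc, xls, ppt, or pdf files.

```php
<?php
// Hypothetical helper: turn the raw XML of word/document.xml into plain text.
function docx_xml_to_text(string $xml): string
{
    // Each paragraph ends with </w:p>; turn that into a newline.
    $xml = str_replace('</w:p>', "\n", $xml);
    // Map explicit tab and line-break elements to their text equivalents.
    $xml = preg_replace('/<w:tab\b[^>]*\/>/', "\t", $xml);
    $xml = preg_replace('/<w:br\b[^>]*\/>/', "\n", $xml);
    // Drop all remaining tags, then decode XML entities such as &amp;.
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');
    return trim($text);
}

// Hypothetical helper: returns the body text of a .docx file, or null on failure.
function extract_docx_text(string $path): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable ZIP archive
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    return $xml === false ? null : docx_xml_to_text($xml);
}
```

In a CodeIgniter app this could live in a helper file and be called as extract_docx_text($uploaded_path), checking for null before using the result. For anything beyond plain body text, a dedicated third-party library is still the safer route.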


Text extraction from docx, doc, excel, ppt, pdf, etc formats - El Forum - 05-11-2012

[eluser]ethereal1m[/eluser]
@Samus,
Yes, I'm looking for a third-party library like that. Can anybody recommend one?