Generating Thumbnails from PDFs

I'm looking to code my first CI app to handle documents that have been published in PDF format. They need to be easily viewable, so I'd like to have thumbnails (.jpg files) for easy scanning and selecting the desired PDF. What I'd like to do is upload a directory of individual PDFs or one multipage PDF and have the app process them/it to generate the thumbnails in .jpg format for quick viewing. Does anyone have any thoughts on how to accomplish this within CI? All ideas welcome.

on mac osx i use /usr/bin/sips to convert single page PDFs.
imagemagick may be able to do it: http://www.imagemagick.org/Usage/formats/#vector
also, FPDI from setasign to split multipage PDFs into single pages.

[quote author="sophistry" date="1183767989"]on mac osx i use /usr/bin/sips to convert single page PDFs.[/quote]
I'm on a mac but that looks like command line access which I'm not all that familiar with. Could this do batch conversions?

[quote author="sophistry" date="1183767989"]imagemagick may be able to do it: http://www.imagemagick.org/Usage/formats/#vector[/quote]
Hmmm... i'll have to look at this.

[quote author="sophistry" date="1183767989"]also, FPDI from setasign to split multipage PDFs into single pages.[/quote]
This looks very interesting since I'd prefer to take a 50-60 page PDF, upload it to the server, parse the pages into their own separate files, each with a custom name, then create thumbnails from those PDF files in .jpg format for displaying as the preview. Does ImageMagick handle quality settings as well?

i haven't actually done PDF conversion with imagemagick so no advice there.

as long as you are not in safemode or otherwise constrained, commandline apps can be called using PHP's exec() command (there are others too).

e.g., echo exec('ls'); // exec() returns the last line of the command's output
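
A slightly fuller sketch of the same idea, for anyone new to calling command-line tools from PHP: exec() can also hand back every output line and the command's exit status through its second and third arguments. The `echo hello` command is only a stand-in for a real converter, and the variable names are made up for illustration.

```php
<?php
// exec() returns the last line of output; the optional second and third
// arguments collect all output lines and the command's exit status.
$output = array();
$status = null;
$last_line = exec('echo hello', $output, $status);

if ($status !== 0) {
    // a non-zero exit status usually means the command failed
    echo "command failed with status $status\n";
} else {
    echo implode("\n", $output) . "\n";
}
```

Checking the exit status is worth the extra two arguments, since a converter that fails silently just leaves you without a thumbnail.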

try this for sips:
$cmd = "/usr/bin/sips -s format png $pdf_file -s formatOptions high -s dpiHeight 72.0 -s dpiWidth 72.0 --out $png_file";
$exec_state = shell_exec($cmd);
it will take the first page of a PDF file and dump a png.
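
On the batch question asked earlier: one way to do it is to loop over a directory and build one sips command per file. This is a rough sketch, not a tested tool. The function names and the `pdfs` / `thumbs` directory names are hypothetical, and it only builds the command strings so they can be inspected before anything is run.

```php
<?php
// Build a sips command for one PDF. Paths go through escapeshellarg() so
// spaces or quotes in filenames cannot break the command line.
function build_sips_command($pdf_file, $png_file)
{
    return '/usr/bin/sips -s format png ' . escapeshellarg($pdf_file)
         . ' --out ' . escapeshellarg($png_file);
}

// Build one command per *.pdf file found in $pdf_dir.
// $pdf_dir and $png_dir are hypothetical directory names.
function build_batch_commands($pdf_dir, $png_dir)
{
    $commands = array();
    $pdf_files = glob(rtrim($pdf_dir, '/') . '/*.pdf');
    foreach ($pdf_files ?: array() as $pdf_file) {
        $png_file = rtrim($png_dir, '/') . '/' . basename($pdf_file, '.pdf') . '.png';
        $commands[] = build_sips_command($pdf_file, $png_file);
    }
    return $commands;
}

// usage (macOS only, since sips ships with the OS):
// foreach (build_batch_commands('pdfs', 'thumbs') as $cmd) {
//     shell_exec($cmd);
// }
```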

fpdi can extract one page at a time (using template feature) and create a new PDF file.

this should get you started. it's not beautiful but it works. if you improve it or expand it post it up here. also, you'll have to modify (or wrap) your FPDF class to accept CI-style array parameter as a constructor argument. with 60 pages, you might start to get memory errors - it may need a buffer flush in the loop where FPDI is instanced over and over.

finally, this code fragment was taken out of a controller so it assumes a few class properties that aren't specifically created here...

// sorry for the fragment, and the nasty coding style
$pdf_id = 12345;
$extension = 'pdf';
$pdf_file = $pdf_id . ".$extension";
$pdf_dir = "directoryname/$pdf_id";
// loop over the pages and suck them out
$page_num = 1;
// this var is set here since we can only get its true value inside the loop
$page_count = 1;
// set the orientation, units, and dimensions
$params_array = array('P','mm',array(180,270));
// load the FPDI library and pass an array of params
// class has been modified to support passing CI-style params
while ($page_num<=$page_count) {
    // automatically given $this->fpdi by CI,
    // but you can instantiate your own
    $this->{"fpdi".$page_num} = new fpdi($params_array);
    // relative to root dir
    // talk to FPDI to get the source file and single page
    // this doesn't actually seem to return page count
    // it looks like it is asking the PDF file for its own page count
    // but it doesn't have it in the "dictionary"
    // so we have to do a workaround and test for errors from FPDF
    $page_count = $this->{"fpdi".$page_num}->setSourceFile("$pdf_dir/$pdf_file");
    // get a filename for the new pdf
    // CI forum stripped the percent sign, sorry!
    $page_num_padded = sprintf('%04d', $page_num);
    $tplidx = $this->{"fpdi".$page_num}->ImportPage($page_num);
    // now talk to FPDF to add a page
    // tell FPDI to use the single page as template
    // output the new single page as a file
    // see FPDF docs for more info on switches here
    // I = inline to browser
    // F = save to file
    // move on to the next page
    $page_num++;
}
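
The last steps described in the comments above (add a page, use the imported page as a template, write the file) have no code in the post. Here is a minimal sketch of the whole split loop under some loud assumptions: the FPDF and FPDI classes are already loaded, the class is named `FPDI` as in FPDI 1.x, `setSourceFile()` returns the page count (the poster reports it did not for their file), and the `12345_0001.pdf` naming scheme is invented for illustration. It also uses the default page size rather than the custom dimensions in the fragment.

```php
<?php
// Assumes the FPDF and FPDI classes are already loaded (for example with
// require_once) and that the class is named FPDI, as in FPDI 1.x.

// hypothetical naming scheme: 12345_0001.pdf, 12345_0002.pdf, ...
function page_filename($pdf_id, $page_num)
{
    return sprintf('%s_%04d.pdf', $pdf_id, $page_num);
}

function split_pdf_pages($source_file, $out_dir, $pdf_id)
{
    // open the source once just to learn the page count (assumed return value)
    $probe = new FPDI();
    $page_count = $probe->setSourceFile($source_file);
    unset($probe);

    $written = array();
    for ($page_num = 1; $page_num <= $page_count; $page_num++) {
        // a fresh instance per page keeps each output file to a single page
        $pdf = new FPDI();
        $pdf->setSourceFile($source_file);
        $tplidx = $pdf->importPage($page_num);
        $pdf->AddPage();
        $pdf->useTemplate($tplidx);
        $target = rtrim($out_dir, '/') . '/' . page_filename($pdf_id, $page_num);
        $pdf->Output($target, 'F'); // F = save to file
        $written[] = $target;
        unset($pdf); // release the instance before the next page
    }
    return $written;
}
```

Dropping each instance at the end of the iteration is one way to address the memory concern mentioned above for 50-60 page files, though whether it is enough will depend on the PDFs.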

Sophistry, Thanks for pointing me in the right direction, i hope. (given your name) ;-)

Sophistry - (def.) "the use of fallacious arguments, esp. with the intention of deceiving."

Anyway, do you happen to know anything about Ghostscript and how to use it? I find the docs confusing and am not sure how to get it installed on my host server or whether it requires a server-wide install or if I'm able to install it on just my account.

:cheese: no deception intended. my name's just a friendly warning for the well-read that i may be not as smart as i make myself sound.

i know next to nothing about ghostscript.

[eluser]Jim OHalloran[/eluser]
Last time I did this I used FPDI to split the PDF into individual pages, then used ImageMagick's convert program to convert those to JPEGs. Worked well, but PDFs that use CMYK colours will be converted to JPEGs which most browsers can't render (though Photoshop and most other tools will open the JPEGs fine). The workaround for this was to use the PHP GD functions to resize the JPEG generated by convert, and GD will take care of converting the colours for you.
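
A rough sketch of that convert-then-GD pipeline, offered as an illustration rather than the code Jim actually used. The function names, the 72 dpi density, the quality of 85 and the 150 pixel thumbnail box are all assumptions. `-density`, `-quality` and the `file.pdf[0]` page selector are standard ImageMagick features (newer ImageMagick versions prefer the `magick` command over `convert`), and `resize_jpeg()` needs the GD extension.

```php
<?php
// Build an ImageMagick command that renders one PDF page to a JPEG.
// "file.pdf[0]" selects the first page; -density sets the render resolution
// and -quality the JPEG compression level.
function build_convert_command($pdf_file, $jpg_file, $page_index = 0, $quality = 85)
{
    return 'convert -density 72 '
         . escapeshellarg($pdf_file . '[' . (int) $page_index . ']')
         . ' -quality ' . (int) $quality . ' '
         . escapeshellarg($jpg_file);
}

// Scale (width, height) to fit inside a $max x $max box, keeping proportions
// and never enlarging the image.
function thumbnail_size($width, $height, $max)
{
    $scale = min($max / $width, $max / $height, 1);
    return array(max(1, (int) round($width * $scale)),
                 max(1, (int) round($height * $scale)));
}

// Re-save a JPEG through GD at thumbnail size (requires the GD extension).
function resize_jpeg($src_file, $dst_file, $max = 150, $quality = 85)
{
    $src = imagecreatefromjpeg($src_file);
    if ($src === false) {
        return false;
    }
    list($w, $h) = thumbnail_size(imagesx($src), imagesy($src), $max);
    $dst = imagecreatetruecolor($w, $h);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $w, $h, imagesx($src), imagesy($src));
    return imagejpeg($dst, $dst_file, $quality);
}

// usage:
// shell_exec(build_convert_command('page_0001.pdf', 'page_0001.jpg'));
// resize_jpeg('page_0001.jpg', 'thumb_0001.jpg');
```

The `$quality` parameters are where the earlier question about quality settings comes in: both convert and GD's imagejpeg() accept a JPEG quality value.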


[eluser]Lonely Angel[/eluser]
Can I resize a picture generated from a PDF?

Many hosts have GhostScript and ImageMagick installed. Ask your host about it. You might find this thread interesting.


[eluser]Lonely Angel[/eluser]
thank you esra!
