General Tips & Tricks

Extract All Images from a PDF

Install Xpdf for your system. For Windows, you can find the download here: ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl4-win32.zip

Copy the files to a directory of your choice and add that directory to your PATH, so the commands can be run from anywhere.

Open the command line and enter

pdfimages input.pdf output

This will extract ALL images from input.pdf and save them as PPM files named output-001.ppm, output-002.ppm, etc. PPM is a highly inefficient format (see the Wikipedia entry), so if you want smaller files, you can use either

pdfimages -j input.pdf output

and get JPEG files for any images stored as JPEG in the PDF (the rest are still written as PPM), or use ImageMagick to convert the PPM images to PNG with

mogrify -format png output-*.ppm
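
If you do this often, the two steps can be wrapped in a small shell function. The following is only a sketch for a POSIX shell (on Windows you would need something like Cygwin or MSYS); the name extract_images is made up for illustration, and removing the PPM files after conversion is my own addition, not something pdfimages or mogrify does for you.

```shell
# Sketch: extract all images from a PDF and convert the PPM output to PNG.
# extract_images is a hypothetical helper name, not part of Xpdf or ImageMagick.
extract_images() {
    pdf=$1    # input PDF, e.g. input.pdf
    root=$2   # prefix for the extracted files, e.g. output

    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi

    # Step 1: write the images as <root>-NNN.ppm
    pdfimages "$pdf" "$root" || return 1

    # Step 2: convert each PPM to PNG (mogrify -format keeps the original file)
    for f in "$root"-*.ppm; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mogrify -format png "$f" || return 1
        # Assumption: the PPM is no longer wanted once its PNG exists
        if [ -f "${f%.ppm}.png" ]; then
            rm -- "$f"
        fi
    done
}
```

Calling extract_images input.pdf output should then leave output-001.png, output-002.png, etc. in the current directory.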

You can guess for yourself what the command

pdftotext input.pdf output.txt

does…
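
For a whole folder of PDFs, the same command can be put in a loop. Again just a sketch for a POSIX shell; pdfs_to_text is a made-up name, and writing each .txt file next to its PDF is an assumption about what you want.

```shell
# Sketch: run pdftotext on every PDF in a directory.
# pdfs_to_text is a hypothetical helper name, not part of Xpdf.
pdfs_to_text() {
    dir=${1:-.}   # directory to scan, defaults to the current one
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue        # the glob matched nothing
        # report.pdf becomes report.txt in the same directory
        pdftotext "$pdf" "${pdf%.pdf}.txt" || return 1
    done
}
```

Run it as pdfs_to_text some/directory, or without an argument for the current directory.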

adapted from Stefaan Lippens