Command-line snippets for processing scans

I originally wrote this in 2009 as a cheatsheet for processing black-and-white scanned sheet music. It’s a collection of tips for converting and processing scanned images, with the goal of producing a high-quality PDF suitable for archiving.

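The programs convert and mogrify are part of the ImageMagick suite (ImageMagick 7 also offers them as magick and magick mogrify). The program pdfimages is part of the Xpdf suite and also ships with Poppler. The programs tiff2pdf and tiffcp come with libtiff.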

  • To extract all images from BestSongEver.pdf to bse-000.ppm, bse-001.ppm, bse-002.ppm, … (color images end with ppm, b/w images end with pbm)
    $ pdfimages BestSongEver.pdf bse
  • To convert every *.pbm image to TIFF using Group4 compression with a nominal resolution of 300 pixels/inch
    $ mogrify -format tiff -compress Group4 -density 300 *.pbm
  • To convert a series of TIFF images, in alphabetical order, into one PDF document
    $ convert *.tiff MyNewDocument.pdf
  • To convert a series of TIFF images into one PDF document, applying Group4 compression and a fixed resolution of 300 pixels/inch
    $ convert *.tiff -compress Group4 -density 300 MyNewDocument.pdf
  • To convert all TIFF images in-place to GRAYSCALE
    $ mogrify -type Grayscale *.tiff
  • To convert all TIFF images in-place to BLACK and WHITE
    $ mogrify -type Bilevel *.tiff
  • To convert a multi-page TIFF document into a multi-page PDF
    $ tiff2pdf -o MyDocument.pdf MyDocument.tiff
  • To concatenate a series of TIFF images into a single multi-page TIFF
    $ tiffcp page*.tiff MyNewDocument.tiff
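  • To split a two-page scan original.tiff down the middle into page-0.tiff and page-1.tiff (the %d in the output name writes one file per half instead of a single two-page TIFF; +repage discards the crop offset)
    $ convert original.tiff -crop 50%x100% +repage page-%d.tiff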
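
The first few snippets can be chained into one small script. What follows is a minimal sketch rather than a tested workflow: the function names run and scan2pdf and the DRY_RUN switch are made up for illustration, and it assumes the PDF contains only black-and-white pages, so that pdfimages writes .pbm files.

```shell
#!/bin/sh
# Sketch: turn a scanned PDF into a Group4-compressed, 300 pixels/inch PDF.
# run, scan2pdf and DRY_RUN are illustrative names, not part of any tool.

# Print a command, and execute it unless DRY_RUN is set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

scan2pdf() {
    in=$1    # source PDF, e.g. BestSongEver.pdf
    root=$2  # prefix for the extracted images, e.g. bse
    out=$3   # resulting PDF, e.g. MyNewDocument.pdf

    # Extract the page images as $root-000.pbm, $root-001.pbm, ...
    run pdfimages "$in" "$root"

    # Write a Group4-compressed TIFF next to every extracted .pbm file.
    # The glob is expanded here, after the extraction step has run.
    run mogrify -format tiff -compress Group4 -density 300 "$root"-*.pbm

    # Combine the TIFFs, in glob (alphabetical) order, into one PDF.
    run convert "$root"-*.tiff -compress Group4 -density 300 "$out"
}
```

With DRY_RUN=1 set, calling scan2pdf BestSongEver.pdf bse MyNewDocument.pdf only prints the three commands, which is a cheap way to check the file names before touching any files. In a dry run the globs match nothing yet, so they are printed unexpanded.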
