Scanners have been around for a long time, and today’s scanners are cheap and fast and produce high-quality output. Still, many people haven’t figured out how to get good scans from their scanners (just take a look at scanned scores, manga, etc.).
The steps below make the difference between a poor scan and a good one.
Get a good scanner
Get a scanner that produces good output, comes with good software, and scans relatively fast. You don’t want to wait around forever when you’re scanning many pages. Even better, get one with a relatively large scanning bed. Many formats are larger than letter paper, and you don’t want to cut off part of the image or have to stitch images together.
Use good scan settings
- In your scanner software, set a high resolution: at least 300 ppi is best for documents with detail, though 200 ppi might suffice for simpler documents.
- For document type, choose either color or grayscale. It’s often better to convert the images to black and white later than at the scanning stage, because that conversion is a one-way street: detail thrown away during scanning cannot be recovered. You might have to experiment a little with this: for one scanner I had, scanning in color and then converting to grayscale was better than scanning in grayscale to begin with.
- Choose a lossless file format. If your software supports it, choose PNG. Otherwise, choose TIFF or BMP. Avoid JPEG when dealing with documents: scanning software tends to produce poor-quality JPEGs, and it’s easy to accidentally save a grayscale image as a color JPEG. Not using JPEG will also avoid those oh-so-attractive blocky, blurry fuzzies around everything.
- Don’t use auto-crop. Manually set the scan size to the actual size of the paper before you scan. This will help produce consistently sized images.
- Make sure you know where the coordinate origin is. One corner of the flatbed will have an arrow indicating the upper-left boundary. Place the paper so that it perfectly fits into that corner. When scanning thick books, make sure you pay close attention to where the scanned page is, since the cover will move around.
- Many scanner lids are detachable. If you’re scanning a thick book, detach the lid.
- Hold down the book! Either put the lid on the book and press down on the lid, or just press down on the book with your hand. This helps with print near the spine and with pages that are bent or creased. Keeping the entire page flat is always better than having a large ugly gradient on one side.
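When you set the scan size by hand, it helps to know how many pixels to expect: the pixel dimensions are simply the paper size in inches multiplied by the ppi. Here is a rough sketch of that arithmetic. The function name scan_pixels and the hundredths-of-an-inch convention are my own, chosen only so that plain integer shell arithmetic is enough.

```shell
#!/bin/sh
# scan_pixels PPI WIDTH HEIGHT
# WIDTH and HEIGHT are in hundredths of an inch (8.5 in -> 850), because
# POSIX shell arithmetic only handles integers.
scan_pixels() {
    ppi=$1
    width=$2
    height=$3
    echo "$(( width * ppi / 100 ))x$(( height * ppi / 100 ))"
}

scan_pixels 300 850 1100   # US letter at 300 ppi: 2550x3300
scan_pixels 200 850 1100   # US letter at 200 ppi: 1700x2200
```

If every page of a book comes out at the same pixel size, your scan size setting and paper placement are consistent.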
Processing the files
- Keep all your source images! If you mess something up in post-processing, it is more convenient to simply start again with your source file, rather than re-scanning the page.
- Use consistent naming! Name your pages something like page01.png, page02.png, page03.png, etc. This will save much time.
- Convert all of your images to grayscale PNGs if they are not so already. Smart people can run
mogrify -format PNG -type Grayscale page*.png
if the pages are named page01.png, page02.png, etc. Since mogrify can overwrite the files it is given, run it on copies of your source images.
- Rotate the image if needed.
- Adjust your levels. You want the background to be solid white, not gray, and the black text and lines to be perfectly black, without destroying the image. Use the histogram in the Levels dialog as your guide:
The histogram shows how many pixels the image has at each brightness level. Move the black slider to the peak that represents the majority of the black (or slightly to the right of it). Move the white slider to the left, past the large peak that represents the white background, so that almost all of that peak is cut off.
- If you are using Adobe Photoshop, you can save the level adjustment by using the “Save” button in the Levels dialog. If you are using GIMP, your last used levels adjustment is saved and dated in the drop-down menu labeled “Presets.” Since you’re (hopefully) producing consistent scans, this will consistently and easily adjust the levels every time.
[Figure: GIMP Levels dialog]
- Even better, if you’re using Adobe Photoshop, you can automate these actions. Open an unprocessed image. In the Actions window, make a new action. Photoshop will begin recording your actions. Do all your processing normally, and then hit the stop button in the Actions window. Now, close that file and open up all of your unprocessed files. Go to File→Automate→Batch…, choose your action, set the source to Opened Files, and let it get to work. Make sure it’s saving the files in the right place.
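If you don’t have Photoshop, the same batch idea can be sketched with ImageMagick, whose -level operator takes a black point and a white point. This is only a sketch under my own assumptions: the function name adjust_levels, the directory names, and the 10% and 85% values are placeholders, and you should read the real values off your own histogram. It works on copies so that the source images stay untouched.

```shell
#!/bin/sh
# adjust_levels SRC_DIR OUT_DIR BLACK WHITE
# Copies page*.png from SRC_DIR to OUT_DIR, then converts the copies to
# grayscale and applies a levels adjustment to them. The originals in
# SRC_DIR are never passed to mogrify.
adjust_levels() {
    src=$1
    out=$2
    black=$3   # black point, for example 10%
    white=$4   # white point, for example 85%

    mkdir -p "$out" || return 1
    cp "$src"/page*.png "$out"/ || return 1
    # mogrify works on the files it is given, so only hand it the copies.
    mogrify -type Grayscale -level "$black,$white" "$out"/page*.png
}

# Example with placeholder values (not run automatically):
# adjust_levels source processed 10% 85%
```

Try the values on a single page first and look at the result before running them over a whole book.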
Compiling the images
- If you’re smart, you’ll already have ImageMagick (try running “convert -version” to check). This suite of command-line tools (including the aforementioned “mogrify”) makes image processing easy. If not, you should download and install ImageMagick. Teaching you how to use your computer is beyond the scope of this article.
- This is the ideal scenario: You have a book scanned and processed as a series of 8-bit grayscale PNGs named book00.png, book01.png, book02.png, and so on. To create a nice PDF of this, simply run “convert book*.png book.pdf”. You are now a winner. (Note: older versions of ImageMagick produce broken PDFs. If you are unable to open your PDF in Adobe Reader, upgrade ImageMagick.)
- I haven’t tried this, but it is theoretically possible to install the PDFCreator printer driver and then use Windows Picture and Fax Viewer to print out all the images (at once) as a “Full page fax print” to PDFCreator. However, this will introduce extra margins and ignore your actual image size.
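One thing to watch with “convert book*.png book.pdf”: the shell expands book*.png in lexical order, so book10.png sorts before book2.png unless the numbers are zero-padded. If your files were not named consistently from the start, something like the following sketch can fix them before compiling. The function name pad_pages and the three-digit width are my own choices for illustration.

```shell
#!/bin/sh
# pad_pages DIR PREFIX
# Renames PREFIX<number>.png files in DIR so that every number has three
# digits (book7.png -> book007.png). Existing files are never overwritten.
pad_pages() {
    dir=$1
    prefix=$2
    for f in "$dir/$prefix"*.png; do
        [ -e "$f" ] || continue
        n=${f##*/}          # strip the directory
        n=${n#"$prefix"}    # strip the prefix
        n=${n%.png}         # strip the extension
        case $n in
            ''|*[!0-9]*) continue ;;   # skip names that are not PREFIX<digits>.png
        esac
        # Drop leading zeros so printf does not read "08" as octal.
        while [ "${n#0}" != "$n" ] && [ "${#n}" -gt 1 ]; do
            n=${n#0}
        done
        new=$(printf '%s/%s%03d.png' "$dir" "$prefix" "$n")
        if [ "$f" != "$new" ] && [ ! -e "$new" ]; then
            mv -- "$f" "$new"
        fi
    done
}

# Example (not run automatically):
# pad_pages . book && convert book*.png book.pdf
```

After renaming, “ls book*.png” should list the pages in reading order, which is the order they will appear in the PDF.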