One of our loyal readers has asked where he can find a good system for Optical Character Recognition. He’s a publisher, and wants to be able to receive manuscripts and then transfer them easily to his computer.
Scanners are used to transfer hard copy onto your computer. Whether it’s a photo or a document, a scanner uses a bright light and a sensor to collect an image from a flat surface. Before digital photography became the standard, scanners were the best way to get an image onto a computer. Now that digital photography has almost completely eliminated film photography, it can be hard to find a scanner if you just want to scan documents. They are still available, but the selection has largely dried up unless you go to a specialty store. Most computer stores will only stock one or two low-end models.
Quality
One of the most important questions you want to ask is about the quality of the scans. The detail that a scanner is able to produce is measured in DPI, or Dots Per Inch. Just like your printer can produce more detailed documents if it has a higher DPI rating, a scanner can produce more detailed scans if it has a higher DPI.
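To make the DPI figure concrete, here is a rough sketch of how it translates into image size. The function name and the letter-size page (8.5 by 11 inches) are assumptions chosen for illustration, not something a particular scanner reports.

```python
def scan_pixels(width_in, height_in, dpi):
    """Return (width_px, height_px) for a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed page size: US letter, 8.5 x 11 inches.
for dpi in (150, 300, 600):
    width_px, height_px = scan_pixels(8.5, 11, dpi)
    print(f"{dpi} DPI: {width_px} x {height_px} pixels")
```

Doubling the DPI doubles the dots in each direction, so the image holds four times as many dots overall.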
Most scanners list both an Optical DPI and an Interpolated DPI. The optical figure is what the physical mechanism can actually detect. The interpolated figure comes from the software that ships with the scanner, which makes some educated guesses: it predicts what dots should fall between the dots the sensor actually sees and adds them to the image. This can make the image look a little smoother, but it does not capture any extra detail from the page, so the optical figure is the one to compare when you shop.
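The "educated guess" idea can be sketched in a few lines. This is only an illustration using simple midpoint (linear) interpolation on one row of brightness values. The function name is hypothetical, and real scanner software most likely uses more sophisticated methods.

```python
def interpolate_row(samples):
    """Insert one estimated value between each pair of measured values."""
    if not samples:
        return []
    result = []
    for left, right in zip(samples, samples[1:]):
        result.append(left)
        # The guess: a dot halfway between two measured dots is
        # assumed to have a brightness halfway between them.
        result.append((left + right) / 2)
    result.append(samples[-1])
    return result


measured = [10, 20, 40, 40]       # what the sensor actually saw
print(interpolate_row(measured))  # [10, 15.0, 20, 30.0, 40, 40.0, 40]
```

The row now has nearly twice as many dots, but every new value is derived from its neighbours. Nothing new was read from the page.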
All-in-One Printers
As standalone scanners became less popular and thus less expensive, printer manufacturers saw a golden opportunity. With a scanner integrated into a printer, you can suddenly fax, scan and photocopy as well as print. Initially this was a way to cram more features into high-end printers to make them more attractive. Lately, though, you can find All-in-Ones (also known as MFPs, or Multi-Function Printers) at lower prices than some standalone scanners. This is due to the printer makers’ tactic of selling inkjet printers at rock-bottom prices to get you hooked on their ink.
So, the best option for a document scanner may be an All-in-One printer. Not only does it have a scanner, but many of these units include a sheet-feed mechanism. This means that you’ll be able to plop a multi-page document into the sheet feed and let your computer churn away converting the entire document into text while you’re off enjoying a refreshing beverage and reading a book on the sofa. Beats typing it out by hand.
OCR Software
Once you have a scanner that can convert your document into a computer image, you also need software that will turn that image into a text document. A good OCR program will do this all in one step, converting as the document scans.
When you’re buying a scanner or an All-in-One printer, be sure that it comes with OCR (Optical Character Recognition) software. Virtually every standalone scanner comes with a trial or ‘lite’ version of one of the major OCR packages, which will be fine for occasional use. Not all printers that include scanning functions come with OCR software, though, so check that the printer you buy includes it or you’ll have to buy it separately. You may want to do that eventually, but it’s better to try the free one first and see if it fits your needs.
If you plan on converting documents regularly, you may want to spend money on the full version of an OCR program. These run around $200, which isn’t cheap, but if you consider that one of these programs can turn a 20-page document into text for you in less than 5 minutes, you’ll save time and money in the long run. How many hours would it take you to type out 20 pages? How much would you have to pay someone to do it for you?

Let’s say you needed to re-type five 20-page documents. That would probably take you 12-15 hours of typing. Unless you can find a typist who is 99% accurate for $10 an hour, $200 starts to seem like a wise expenditure.
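The arithmetic behind that comparison can be worked through as a quick sketch. The figures are the ones used above ($200 for the software, 12-15 hours of typing, a $10-an-hour typist). The function names are made up for illustration.

```python
SOFTWARE_COST = 200             # full OCR package, in dollars
HOURS_LOW, HOURS_HIGH = 12, 15  # estimated typing time for five 20-page documents
TYPIST_RATE = 10                # dollars per hour


def typing_cost(hours, rate):
    """Cost of paying a typist for the given number of hours."""
    return hours * rate


def break_even_rate(software_cost, hours):
    """Hourly rate at which typing costs the same as the software."""
    return software_cost / hours


print(typing_cost(HOURS_LOW, TYPIST_RATE), typing_cost(HOURS_HIGH, TYPIST_RATE))
# 120 150
print(round(break_even_rate(SOFTWARE_COST, HOURS_HIGH), 2),
      round(break_even_rate(SOFTWARE_COST, HOURS_LOW), 2))
# 13.33 16.67
```

So for this one batch, a $10-an-hour typist would cost $120 to $150, and the software pays for itself once the going rate is above roughly $13 to $17 an hour. Since the software can be reused, every later batch tilts the comparison further in its favor.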