Advice on scanning a load of documents


Moderator: Mr_Noodle

Advice on scanning a load of documents Mon Jan 25, 2016 12:08 am • by Chuggett
Hi everyone

I have been really slack over the last couple of years with my filing, but it's all going to change this year.

I have a pretty big box of documents which I need to file away, but before I do, I would like to scan every document. I have both single-page and multi-page documents.

I have a scanner with a document feeder. I have PDFPen Pro for my OCRing and also have a Creative Cloud account with access to Acrobat DC if I need it. Is it better to scan a bunch of documents into one large file and then split that file up into the individual documents, or is it better to just bite the bullet and scan each document separately?

I have also carefully set up my Hazel rules for scanning, moving, renaming and filing.

Any advice before I attempt this behemoth of a job is greatly appreciated.

Thanks a lot
Chris
Chuggett
 
Posts: 30
Joined: Wed Mar 16, 2011 9:17 pm

Re: Advice on scanning a load of documents Mon Jan 25, 2016 11:37 am • by Mr_Noodle
I don't really have any direct advice for the scanning aspect but I do suggest searching the forums here. Also, check the Buzz for articles that may help. Many of the sites linked there may have additional resources to help you.
Mr_Noodle
Site Admin
 
Posts: 11872
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Re: Advice on scanning a load of documents Tue Jan 26, 2016 12:29 pm • by MacOCD
Is your scanner a ScanSnap? One thing I'd suggest is comparing different OCR software. I have PDFPen Pro, but I find that ABBYY FineReader for ScanSnap (came bundled with my scanner) does a much more accurate job at OCR than PDFPen Pro, especially with smaller print.

I've also bought FineReader Pro to try with a Hazel script just today, but I'm having issues getting the OCR layer to appear like it does with the free bundled version, and it's far slower. In my tests, both ABBYY versions create smaller files than PDFPen does.

I'd definitely scan each document to a separate file, with heavy use of contents matching to auto-generate names. Start each file name with the document date in the format yyyy-mm-dd. To get the date, use a "contents match <custom date>" condition to extract a relevant date from the document, and use "edit Date Pattern" to get that format. This makes it much easier to sort the files chronologically later.
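To show why the yyyy-mm-dd prefix helps, here is a rough Python sketch of the same idea done outside Hazel. It is not how Hazel does its matching; the function names, the label argument and the two date formats it looks for (ISO and dd/mm/yyyy) are all assumptions for illustration, so adjust them to whatever your documents actually contain.

```python
import re
from datetime import date

# Assumed date formats, for illustration only: ISO (yyyy-mm-dd) and dd/mm/yyyy.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
DMY_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def find_document_date(text):
    """Return the first valid date found in the OCR text, or None."""
    for match in ISO_DATE.finditer(text):
        year, month, day = map(int, match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            continue  # looked like a date but is not a real one
    for match in DMY_DATE.finditer(text):
        day, month, year = map(int, match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            continue
    return None


def dated_name(text, label, ext=".pdf"):
    """Build a 'yyyy-mm-dd label.ext' file name, or None if no date is found."""
    found = find_document_date(text)
    if found is None:
        return None
    return f"{found.isoformat()} {label}{ext}"


if __name__ == "__main__":
    names = [
        dated_name("Statement date 03/11/2015", "Bank statement"),
        dated_name("Invoice date: 25/01/2016 Total due", "Electricity bill"),
        dated_name("Issued 2014-06-30", "Insurance renewal"),
    ]
    # A plain alphabetical sort is also a chronological sort with this prefix.
    for name in sorted(names):
        print(name)
```

Because the year comes first and every part is zero-padded, sorting the file names alphabetically in Finder gives you date order for free.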

If you do have a ScanSnap:
My personal strategy has been to use 2 profiles: 1 for a pile of single-sheet documents (both single- and 2-sided, as detected blank pages can be automatically deleted) and a second for multi-sheet documents.

Using the single sheet profile you can pile up the single sheet documents and let it get on with it.

For the multi-sheet profile I would load all the pages of each multi-page document in the scanner together and scan one multi-page document at a time. Once the final page has finished scanning, I'd load in the next multi-page document. I wouldn't bother waiting for the OCR to finish before loading and scanning the next one.

Good luck!
MacOCD
 
Posts: 44
Joined: Fri Sep 26, 2014 11:02 am

