To OCR or Not (already done with ScanSnap)


Moderator: Mr_Noodle

To OCR or Not (already done with ScanSnap) Thu Dec 29, 2022 10:11 pm • by elizabethp
Hello everyone, I am hoping to find a solution or workaround for this issue so I can set up an easy workflow (I have already searched the forum).

Sometimes I scan my paperwork using ScanSnap (so it is already OCR'd), and other documents are saved by download. Next, I want to use Hazel to file the PDFs/docs.

Inbox -> HazelOcr -> Move To Folder

I found that if a PDF is already readable, Hazel displays an error and the document loops. Is there a script that will detect whether a document is already OCR'd or still needs OCR? I would like to have one folder to drop all documents into. If a document needs OCR, it is sent to the folder where Hazel uses a script to turn it into a readable PDF; if it was already OCR'd by ScanSnap, it is sent to the folders I want the PDF filed into.

Inbox
If already OCR'd -> file the PDF in my filing system.
If it needs OCR -> send to my HazelOCR folder, where the PDF is made readable -> then file the PDF in my filing system.
elizabethp
 
Posts: 2
Joined: Thu Dec 29, 2022 5:03 pm

Try doing a search on the forums. I believe someone did come up with a script to detect whether the PDF has font directives in it as a heuristic to determine if a file was OCRed.

Otherwise, does the OCR software have some option to add a tag to the file when it's done?
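
For illustration, the font-directive heuristic mentioned above could be sketched in Python roughly as follows. This is not the script from the other forum thread, just a minimal example of the idea: a PDF with a text layer normally references a `/Font` resource, while an image-only scan usually does not. The function name `has_font_directive` is hypothetical. The command-line part assumes it is used in a Hazel "Passes shell script" condition that receives the file path as its first argument and treats exit status 0 as a match; check the Hazel documentation before relying on that.

```python
import re
import sys
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def has_font_directive(data: bytes) -> bool:
    """Heuristic: return True if the PDF bytes reference a /Font resource."""
    # Simple case: the dictionaries are stored uncompressed.
    if b"/Font" in data:
        return True
    # Newer PDFs may keep dictionaries inside Flate-compressed object
    # streams, so try to inflate each stream and search the result too.
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj ignores trailing bytes such as the newline
            # that precedes "endstream".
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image stream)
        if b"/Font" in inflated:
            return True
    return False


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as handle:
        found = has_font_directive(handle.read())
    # Exit 0 when fonts were found (assumed to mean "already OCR'd").
    sys.exit(0 if found else 1)
```

Because this is only a heuristic, it can misfire: a scan with a typed cover page or an embedded font but no real text layer would count as OCR'd, and encrypted files or streams using other filters are not inspected. It is worth testing on a few ScanSnap outputs and a few downloads first.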
Mr_Noodle
Site Admin
 
Posts: 11255
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote: Try doing a search on the forums. I believe someone did come up with a script to detect whether the PDF has font directives in it as a heuristic to determine if a file was OCRed.

Otherwise, does the OCR software have some option to add a tag to the file when it's done?


I am trying to organize things so that all docs, PDFs, etc. are dumped into one folder no matter where they originate and are then sorted from there. My ScanSnap is set up to OCR on output. I don't know if I can turn the automatic OCR off, which I would hate to do, but that might be a workaround if I have to. I'd prefer to leave it as is and add a script to Hazel that detects the OCR status and sorts each file into a folder accordingly (OCR'd, NonOCR).
elizabethp
 
Posts: 2
Joined: Thu Dec 29, 2022 5:03 pm

Re: To OCR or Not (already done with ScanSnap) Fri Jan 06, 2023 5:39 am • by MikeP
I have a similar system and implement conditions by using multiple folders that a file travels through, with Hazel rules on each folder that do the next step and move the file on.

So your folders could be something like:
1. OCR
2. Rename and tag

The Hazel rule on folder 1 does the OCR and moves the file to folder 2, and sources that don't need OCR (ScanSnap, Download folder rules) just drop their output files straight into folder 2.
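
If a single drop folder is still wanted in front of those two, the routing step could be sketched in Python as below. This is only an illustration of the idea, not something Hazel requires (a Hazel rule with a script condition plus a move action would do the same job). The function name `route`, the `already_ocred` callback, and the folder arguments are all hypothetical; any check that says whether a file already has a text layer can be passed in.

```python
import shutil
from pathlib import Path
from typing import Callable


def route(pdf: Path, already_ocred: Callable[[Path], bool],
          ocr_dir: Path, tag_dir: Path) -> Path:
    """Move pdf to tag_dir if it already has text, otherwise to ocr_dir."""
    # Files that still need OCR go to folder 1; the rest skip ahead to 2.
    target_dir = tag_dir if already_ocred(pdf) else ocr_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / pdf.name
    shutil.move(str(pdf), str(destination))
    return destination
```

With the folders from this post, a call might look like `route(Path("Inbox/scan.pdf"), my_check, Path("1. OCR"), Path("2. Rename and tag"))`, where `my_check` is whatever detection function is chosen. Note that a file with the same name already in the target folder may be overwritten, so a real setup would want to handle name clashes.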
MikeP
 
Posts: 6
Joined: Sat Feb 28, 2015 10:00 am

