Determine if a PDF needs to be OCR'd & Automate FineReader

There is an error in your script. I suggest debugging it outside of Hazel to get it working.
Mr_Noodle
Site Admin
 
Posts: 11195
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote:There is an error in your script. I suggest debugging it outside of Hazel to get it working.


I have no idea how to do that. I just copied and pasted the code from the first post (the second box of code). Here is the code I have:

Code:
#!/bin/bash
if ! grep Font "$1"
then
    # Open the file in ABBYY's FineReader
    open -a 'Scan to Searchable PDF.app' "$1"
    # Wait until the file is done being processed by testing whether a file named "yourfile processed by FineReader" exists.
    while [ ! -e "${1%.pdf} processed by FineReader.pdf" ]; do
        sleep 5
    done
    sleep 5
    # Overwrite the original, un-OCR'd file with the new, OCR'd file
    mv -f "${1%.pdf} processed by FineReader.pdf" "$1"
    # Use AppleScript to tell the system to hide ABBYY's FineReader
    osascript -e 'tell application "System Events" to set visible of process "Scan to Searchable PDF" to false'
fi
Phantom27
 
Posts: 13
Joined: Mon Sep 09, 2013 11:08 pm

Turn on debug mode (as described in the sticky in the support forum) and post the output.
Mr_Noodle
Site Admin
 
Posts: 11195
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote:Turn on debug mode (as described in the sticky in the support forum) and post the output.


Ok, this is huge, but here you go. (I only went back to when I turned on the Debug Mode.)
I was working with a file called "Alamosa 2013 Officers.pdf", if that helps.

Code:
2013-09-11 15:32:17.762 hazelworker[8718] DEBUG: Program is licensed.
2013-09-11 15:32:17.804 hazelworker[8718] DEBUG: No description for disk with path: /home
2013-09-11 15:32:17.805 hazelworker[8718] DEBUG: No description for disk with path: /net
2013-09-11 15:32:17.927 hazelworker[8718] DEBUG: Unexpected type for Mail download URL: (null)
2013-09-11 15:32:17.969 hazelworker[8718] DEBUG: Could not scan past browser.download.dir preference
2013-09-11 15:32:18.081 hazelworker[8718] DEBUG: Could not find entry for default_directory in Chrome preference file.
2013-09-11 15:32:18.094 hazelworker[8718] Processing folder Action
2013-09-11 15:32:18.094 hazelworker[8718] DEBUG: Initialized
2013-09-11 15:32:18.094 hazelworker[8718] DEBUG: Pausing to wait for things to settle down.
2013-09-11 15:32:20.095 hazelworker[8718] DEBUG: Processing directories: (
    "/Users/phantom27/Google Drive/Action "
)
2013-09-11 15:32:20.097 hazelworker[8718] DEBUG: About to process directory /Users/phantom27/Google Drive/Action
2013-09-11 15:32:20.101 hazelworker[8718] DEBUG: Received file event: {
    date = "2013-09-11 21:32:17 +0000";
    path = "<ComNoodlesoft_NoodlePathSet: 0x7fcdaac30eb0> - (\n    \"/Users/phantom27/Google Drive/Action /Auto OCR Files\"\n)";
}
2013-09-11 15:32:20.105 hazelworker[8718] DEBUG: .DS_Store: File is hidden/invisible. Skipping.
2013-09-11 15:32:20.106 hazelworker[8718] DEBUG: .pdf: File is hidden/invisible. Skipping.
2013-09-11 15:32:20.107 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /[PAUL IS CHECKING TO SEE IF INSURANCE WILL PAY] 2013-08-12 - Maintenance - Steffens Plumbing - Inv_7324_2632.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:20.107 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Adobe EchoSign Text Tag Documentation.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:20.206 hazelworker[8718] DEBUG: Auto OCR Files: Did not match any rules.
2013-09-11 15:32:20.209 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Fire Sprinkler Repair - M&M Fire.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:20.210 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Insurance - 608 Berkley.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:20.210 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Keep. File/folder not part of change set/file event.
2013-09-11 15:32:20.211 hazelworker[8718] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Management Agreement - Dubay.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:20.211 hazelworker[8718] DEBUG: Writing out DB file: /Users/phantom27/Google Drive/Action
2013-09-11 15:32:20.213 hazelworker[8718] DEBUG: Directory /Users/phantom27/Google Drive/Action  processed in 0.116196 seconds
2013-09-11 15:32:20.214 hazelworker[8718] DEBUG: Sending metrics to scheduler. Next scheduled run: 5828963-12-19 17:00:00.000
2013-09-11 15:32:20.216 hazelworker[8718] Done processing folder Action
2013-09-11 15:32:39.817 hazelworker[8725] DEBUG: Program is licensed.
2013-09-11 15:32:39.837 hazelworker[8725] DEBUG: No description for disk with path: /home
2013-09-11 15:32:39.837 hazelworker[8725] DEBUG: No description for disk with path: /net
2013-09-11 15:32:39.940 hazelworker[8725] DEBUG: Unexpected type for Mail download URL: (null)
2013-09-11 15:32:39.980 hazelworker[8725] DEBUG: Could not scan past browser.download.dir preference
2013-09-11 15:32:40.091 hazelworker[8725] DEBUG: Could not find entry for default_directory in Chrome preference file.
2013-09-11 15:32:40.105 hazelworker[8725] Processing folder Action
2013-09-11 15:32:40.105 hazelworker[8725] DEBUG: Initialized
2013-09-11 15:32:40.105 hazelworker[8725] DEBUG: Pausing to wait for things to settle down.
2013-09-11 15:32:42.105 hazelworker[8725] DEBUG: Processing directories: (
    "/Users/phantom27/Google Drive/Action "
)
2013-09-11 15:32:42.107 hazelworker[8725] DEBUG: About to process directory /Users/phantom27/Google Drive/Action
2013-09-11 15:32:42.111 hazelworker[8725] DEBUG: Received file event: {
    date = "2013-09-11 21:32:39 +0000";
    path = "<ComNoodlesoft_NoodlePathSet: 0x7f8a70c4a0b0> - (\n    \"/Users/phantom27/Google Drive/Action /Auto OCR Files\"\n)";
}
2013-09-11 15:32:42.114 hazelworker[8725] DEBUG: .DS_Store: File is hidden/invisible. Skipping.
2013-09-11 15:32:42.114 hazelworker[8725] DEBUG: .pdf: File is hidden/invisible. Skipping.
2013-09-11 15:32:42.115 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /[PAUL IS CHECKING TO SEE IF INSURANCE WILL PAY] 2013-08-12 - Maintenance - Steffens Plumbing - Inv_7324_2632.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:42.116 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Adobe EchoSign Text Tag Documentation.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:42.217 hazelworker[8725] DEBUG: Auto OCR Files: Did not match any rules.
2013-09-11 15:32:42.220 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Fire Sprinkler Repair - M&M Fire.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:42.221 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Insurance - 608 Berkley.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:42.221 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Keep. File/folder not part of change set/file event.
2013-09-11 15:32:42.222 hazelworker[8725] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Management Agreement - Dubay.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:42.222 hazelworker[8725] DEBUG: Writing out DB file: /Users/phantom27/Google Drive/Action
2013-09-11 15:32:42.244 hazelworker[8725] DEBUG: Directory /Users/phantom27/Google Drive/Action  processed in 0.136041 seconds
2013-09-11 15:32:42.245 hazelworker[8725] DEBUG: Sending metrics to scheduler. Next scheduled run: 5828963-12-19 17:00:00.000
2013-09-11 15:32:42.246 hazelworker[8725] Done processing folder Action
2013-09-11 15:32:45.566 hazelworker[8731] DEBUG: Program is licensed.
2013-09-11 15:32:45.580 hazelworker[8731] DEBUG: No description for disk with path: /home
2013-09-11 15:32:45.580 hazelworker[8731] DEBUG: No description for disk with path: /net
2013-09-11 15:32:45.641 hazelworker[8731] DEBUG: Unexpected type for Mail download URL: (null)
2013-09-11 15:32:45.643 hazelworker[8731] DEBUG: Could not scan past browser.download.dir preference
2013-09-11 15:32:45.766 hazelworker[8731] DEBUG: Could not find entry for default_directory in Chrome preference file.
2013-09-11 15:32:45.777 hazelworker[8731] Processing folder Action
2013-09-11 15:32:45.777 hazelworker[8731] DEBUG: Initialized
2013-09-11 15:32:45.777 hazelworker[8731] DEBUG: Pausing to wait for things to settle down.
2013-09-11 15:32:46.856 hazelworker[8693] [Error] Shell script failed: Error processing shell script on file /Users/phantom27/Google Drive/Action /Auto OCR Files/Alamosa 2013 Officers.pdf.
2013-09-11 15:32:46.857 hazelworker[8693] Shellscript exited with non-successful status code: 1
2013-09-11 15:32:47.778 hazelworker[8731] DEBUG: Processing directories: (
    "/Users/phantom27/Google Drive/Action "
)
2013-09-11 15:32:47.779 hazelworker[8731] DEBUG: About to process directory /Users/phantom27/Google Drive/Action
2013-09-11 15:32:47.782 hazelworker[8731] DEBUG: Received file event: {
    date = "2013-09-11 21:32:46 +0000";
    path = "<ComNoodlesoft_NoodlePathSet: 0x7ffe3922df90> - (\n    \"/Users/phantom27/Google Drive/Action /Auto OCR Files\"\n)";
}
2013-09-11 15:32:47.785 hazelworker[8731] DEBUG: .DS_Store: File is hidden/invisible. Skipping.
2013-09-11 15:32:47.785 hazelworker[8731] DEBUG: .pdf: File is hidden/invisible. Skipping.
2013-09-11 15:32:47.786 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /[PAUL IS CHECKING TO SEE IF INSURANCE WILL PAY] 2013-08-12 - Maintenance - Steffens Plumbing - Inv_7324_2632.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:47.787 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Adobe EchoSign Text Tag Documentation.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:47.888 hazelworker[8731] DEBUG: Auto OCR Files: Did not match any rules.
2013-09-11 15:32:47.891 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Fire Sprinkler Repair - M&M Fire.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:47.892 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Insurance - 608 Berkley.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:47.892 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Keep. File/folder not part of change set/file event.
2013-09-11 15:32:47.893 hazelworker[8731] DEBUG: Skipped /Users/phantom27/Google Drive/Action /Management Agreement - Dubay.pdf. File/folder not part of change set/file event.
2013-09-11 15:32:47.893 hazelworker[8731] DEBUG: Writing out DB file: /Users/phantom27/Google Drive/Action
2013-09-11 15:32:47.896 hazelworker[8731] DEBUG: Directory /Users/phantom27/Google Drive/Action  processed in 0.116489 seconds
2013-09-11 15:32:47.897 hazelworker[8731] DEBUG: Sending metrics to scheduler. Next scheduled run: 5828963-12-19 17:00:00.000
2013-09-11 15:32:47.898 hazelworker[8731] Done processing folder Action
2013-09-11 15:32:48.872 hazelworker[8693] Alamosa 2013 Officers.pdf: Rule OCR any Non-OCR File matched.
2013-09-11 15:32:49.296 hazelworker[8693] [File Event] File moved: Alamosa 2013 Officers.pdf moved from folder /Users/phantom27/Google Drive/Action /Auto OCR Files to folder /Users/phantom27/Google Drive/Action .
2013-09-11 15:32:49.336 hazelworker[8693] Done processing folder Auto OCR Files
2013-09-11 15:32:49.387 hazelworker[8740] DEBUG: Program is licensed.
2013-09-11 15:32:49.403 hazelworker[8740] DEBUG: No description for disk with path: /home
2013-09-11 15:32:49.403 hazelworker[8740] DEBUG: No description for disk with path: /net
2013-09-11 15:32:49.473 hazelworker[8740] DEBUG: Unexpected type for Mail download URL: (null)
2013-09-11 15:32:49.476 hazelworker[8740] DEBUG: Could not scan past browser.download.dir preference
2013-09-11 15:32:49.592 hazelworker[8740] DEBUG: Could not find entry for default_directory in Chrome preference file.
2013-09-11 15:32:49.605 hazelworker[8740] Processing folder Action
2013-09-11 15:32:49.605 hazelworker[8740] DEBUG: Initialized
2013-09-11 15:32:49.605 hazelworker[8740] DEBUG: Pausing to wait for things to settle down.
2013-09-11 15:32:50.462 hazelworker[8741] DEBUG: Program is licensed.
2013-09-11 15:32:50.475 hazelworker[8741] DEBUG: No description for disk with path: /home
2013-09-11 15:32:50.476 hazelworker[8741] DEBUG: No description for disk with path: /net
2013-09-11 15:32:50.860 hazelworker[8741] DEBUG: Unexpected type for Mail download URL: (null)
2013-09-11 15:32:50.862 hazelworker[8741] DEBUG: Could not scan past browser.download.dir preference
2013-09-11 15:32:50.971 hazelworker[8741] DEBUG: Could not find entry for default_directory in Chrome preference file.
2013-09-11 15:32:50.986 hazelworker[8741] Processing folder Auto OCR Files
2013-09-11 15:32:50.986 hazelworker[8741] DEBUG: Initialized
2013-09-11 15:32:50.987 hazelworker[8741] DEBUG: Pausing to wait for things to settle down.
2013-09-11 15:32:51.606 hazelworker[8740] DEBUG: Processing directories: (
    "/Users/phantom27/Google Drive/Action "
)
2013-09-11 15:32:51.607 hazelworker[8740] DEBUG: About to process directory /Users/phantom27/Google Drive/Action
2013-09-11 15:32:51.610 hazelworker[8740] DEBUG: Received file event: {
    date = "2013-09-11 21:32:49 +0000";
    path = "<ComNoodlesoft_NoodlePathSet: 0x7fe7e2c3f9f0> - (\n    \"/Users/phantom27/Google Drive/Action \",\n    \"/Users/phantom27/Google Drive/Action /Auto OCR Files\"\n)";
}
2013-09-11 15:32:51.613 hazelworker[8740] DEBUG: .DS_Store: File is hidden/invisible. Skipping.
2013-09-11 15:32:51.614 hazelworker[8740] DEBUG: .pdf: File is hidden/invisible. Skipping.
2013-09-11 15:32:51.617 hazelworker[8740] DEBUG: [PAUL IS CHECKING TO SEE IF INSURANCE WILL PAY] 2013-08-12 - Maintenance - Steffens Plumbing - Inv_7324_2632.pdf: Rule signature matched for rule Label - Green. Not executing actions.
2013-09-11 15:32:52.044 hazelworker[8740] DEBUG: Adobe EchoSign Text Tag Documentation.pdf: Did not match any rules.
2013-09-11 15:32:52.084 hazelworker[8740] DEBUG: Alamosa 2013 Officers.pdf: Did not match any rules.
2013-09-11 15:32:52.157 hazelworker[8740] DEBUG: Auto OCR Files: Did not match any rules.
2013-09-11 15:32:52.197 hazelworker[8740] DEBUG: Fire Sprinkler Repair - M&M Fire.pdf: Did not match any rules.
2013-09-11 15:32:52.235 hazelworker[8740] DEBUG: Insurance - 608 Berkley.pdf: Did not match any rules.
2013-09-11 15:32:52.310 hazelworker[8740] DEBUG: Keep: Did not match any rules.
2013-09-11 15:32:52.348 hazelworker[8740] DEBUG: Management Agreement - Dubay.pdf: Did not match any rules.
2013-09-11 15:32:52.352 hazelworker[8740] DEBUG: Writing out DB file: /Users/phantom27/Google Drive/Action
2013-09-11 15:32:52.374 hazelworker[8740] DEBUG: Directory /Users/phantom27/Google Drive/Action  processed in 0.767457 seconds
2013-09-11 15:32:52.376 hazelworker[8740] DEBUG: Sending metrics to scheduler. Next scheduled run: 5828963-12-19 17:00:00.000
2013-09-11 15:32:52.377 hazelworker[8740] Done processing folder Action
2013-09-11 15:32:52.987 hazelworker[8741] DEBUG: Processing directories: (
    "/Users/phantom27/Google Drive/Action /Auto OCR Files"
)
2013-09-11 15:32:52.988 hazelworker[8741] DEBUG: About to process directory /Users/phantom27/Google Drive/Action /Auto OCR Files
2013-09-11 15:32:52.992 hazelworker[8741] DEBUG: Received file event: {
    date = "2013-09-11 21:32:50 +0000";
    path = "<ComNoodlesoft_NoodlePathSet: 0x7ffb49c3a610> - (\n    \"/Users/phantom27/Google Drive/Action /Auto OCR Files\"\n)";
}
2013-09-11 15:32:52.993 hazelworker[8741] DEBUG: Writing out DB file: /Users/phantom27/Google Drive/Action /Auto OCR Files
2013-09-11 15:32:52.995 hazelworker[8741] DEBUG: Directory /Users/phantom27/Google Drive/Action /Auto OCR Files processed in 0.006308 seconds
2013-09-11 15:32:52.996 hazelworker[8741] DEBUG: Sending metrics to scheduler. Next scheduled run: 5828963-12-19 17:00:00.000
2013-09-11 15:32:52.998 hazelworker[8741] Done processing folder Auto OCR Files
Last edited by Phantom27 on Wed Sep 11, 2013 11:04 pm, edited 1 time in total.
Phantom27
 
Posts: 13
Joined: Mon Sep 09, 2013 11:08 pm

Phantom27 wrote:… this is huge …

Putting long logs, scripts, etc. inside BBCode CODE blocks is helpful. :)
sjk
 
Posts: 332
Joined: Thu Aug 02, 2007 5:43 pm
Location: Eugene

BBCode CODE block example:

[code]…stuff…[/code]
sjk
 
Posts: 332
Joined: Thu Aug 02, 2007 5:43 pm
Location: Eugene

sjk wrote:
Phantom27 wrote:… this is huge …

Putting long logs, scripts, etc. inside BBCode CODE blocks is helpful. :)


Fixed. Thanks for the help/input.
Phantom27
 
Posts: 13
Joined: Mon Sep 09, 2013 11:08 pm

Hmm, I'm not seeing any output from the script, but it does look like the file it fails on is matched successfully later. I'm wondering whether the file is still being written out when Hazel processes it. If so, you might want to add a condition like "Date modified is not in the last 3 minutes" or something like that.
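
If adding a Hazel condition is not an option, the same idea could be approximated inside the script itself. The following is only a rough sketch: `file_size` and `wait_until_stable` are made-up helper names, not part of Hazel or FineReader, and the interval and retry count are arbitrary defaults.

```shell
#!/bin/bash
# Sketch: wait until a file's size stops changing before processing it.
# file_size and wait_until_stable are illustrative names, not Hazel features.

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# wait_until_stable FILE [INTERVAL_SECONDS] [MAX_CHECKS]
# Returns 0 once two consecutive size readings match.
# Returns 1 if the file is missing or never settles within MAX_CHECKS.
wait_until_stable() {
    local file=$1 interval=${2:-2} max_checks=${3:-30}
    local previous current i
    [ -f "$file" ] || return 1
    previous=$(file_size "$file")
    for ((i = 0; i < max_checks; i++)); do
        sleep "$interval"
        current=$(file_size "$file")
        if [ "$current" = "$previous" ]; then
            return 0
        fi
        previous=$current
    done
    return 1
}

# Only act when a file path was passed in (Hazel passes it as $1).
if [ -n "${1:-}" ]; then
    wait_until_stable "$1" || { echo "File never settled: $1" >&2; exit 1; }
    echo "File is stable: $1"
fi
```

A size check like this is a heuristic: a writer that pauses for longer than the interval would still fool it, so a longer interval is safer for files arriving over a sync service.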
Mr_Noodle
Site Admin
 
Posts: 11195
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote:Hmm, I'm not seeing any output from the script, but it does look like the file it fails on is matched successfully later. I'm wondering whether the file is still being written out when Hazel processes it. If so, you might want to add a condition like "Date modified is not in the last 3 minutes" or something like that.


That's not it. I removed every condition from the Hazel rule so that it runs on any file and goes straight to the workflow. It still causes the error.
Phantom27
 
Posts: 13
Joined: Mon Sep 09, 2013 11:08 pm

Unfortunately, unless someone else comes along, I think you'll have to become a bit more proficient with shell scripting, at least to the point where you can run and debug scripts outside of Hazel. Or, at least to the point where you can add print statements to isolate where it's going south.
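
For anyone landing here later, this is roughly what "add print statements" could look like for the script from this thread. It is a sketch, not a tested fix: `log`, `needs_ocr`, and the file name `ocr_debug.sh` are made up for illustration, while the app name and the "processed by FineReader" suffix are taken from the script posted above.

```shell
#!/bin/bash
# Instrumented sketch of the OCR script, for running by hand in Terminal:
#   bash -x ./ocr_debug.sh "/path/to/file.pdf"
# (ocr_debug.sh is a hypothetical file name; bash -x traces each command.)

# Write a timestamped message to stderr so every step is visible.
log() {
    echo "[$(date '+%H:%M:%S')] $*" >&2
}

# Return 0 when the PDF does NOT contain the string "Font",
# which is the heuristic this thread uses for "needs OCR".
needs_ocr() {
    ! grep -q Font "$1"
}

main() {
    local pdf=$1
    local processed="${pdf%.pdf} processed by FineReader.pdf"
    log "checking: $pdf"
    if needs_ocr "$pdf"; then
        log "no Font entry found, opening FineReader"
        open -a 'Scan to Searchable PDF.app' "$pdf" || { log "open failed"; exit 1; }
        while [ ! -e "$processed" ]; do
            log "waiting for: $processed"
            sleep 5
        done
        mv -f "$processed" "$pdf" || { log "mv failed"; exit 1; }
        log "replaced original with the OCR'd copy"
    else
        log "Font entry found, nothing to do"
    fi
}

# Only run when a file path was passed in.
if [ -n "${1:-}" ]; then
    main "$1"
fi
```

The last log line printed before the script stops shows which step returned the non-zero status that Hazel reports.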
Mr_Noodle
Site Admin
 
Posts: 11195
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote:Unfortunately, unless someone else comes along, I think you'll have to become a bit more proficient with shell scripting, at least to the point where you can run and debug scripts outside of Hazel. Or, at least to the point where you can add print statements to isolate where it's going south.


I agree. For now I'm not going to mess with it. It wasn't that vital anyway. I did figure out a way to run the shell script on an individual file using Automator, so I think I'm OK.

Thank you for your help though. I learned a lot.
Phantom27
 
Posts: 13
Joined: Mon Sep 09, 2013 11:08 pm

Hi, I know it's an old post, but I'm wondering if somebody can help: is it possible to modify the shell script so that it applies one color label if the PDF contains text and another label if it doesn't?
Something like:
#!/bin/bash
if ! grep Font "$1"
then
    LABEL 1
else
    LABEL 2
fi

BTW: I would use this script to move files to certain folders and then pass them (I still don't know how) to DEVONthink Pro for OCR. Does anybody know if there's a more direct way?
Thanks
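
One possible shape for the label idea above, as a sketch only. It reuses the thread's `grep Font` heuristic, and the function names (`has_text_layer`, `label_for`, `apply_label`) are invented for illustration. The label numbers are assumptions about Finder's `label index` values (6 taken to be green, 2 taken to be red), so check them on your own system before relying on them.

```shell
#!/bin/bash
# Sketch: pick a Finder label depending on whether a PDF appears to contain text.
# The numbers below are ASSUMED Finder label index values; verify before use.
LABEL_HAS_TEXT=6   # assumed: green
LABEL_NO_TEXT=2    # assumed: red

# Return 0 if the PDF contains the string "Font" (the thread's heuristic).
has_text_layer() {
    grep -q Font "$1"
}

# Print the label index that should be applied to the given PDF.
label_for() {
    if has_text_layer "$1"; then
        echo "$LABEL_HAS_TEXT"
    else
        echo "$LABEL_NO_TEXT"
    fi
}

# Ask Finder, via AppleScript, to set the label on the file (macOS only).
apply_label() {
    osascript - "$1" "$2" <<'EOF'
on run argv
    set f to POSIX file (item 1 of argv) as alias
    tell application "Finder" to set label index of f to ((item 2 of argv) as integer)
end run
EOF
}

# Only act when a file path was passed in (Hazel passes it as $1).
if [ -n "${1:-}" ]; then
    apply_label "$1" "$(label_for "$1")"
fi
```

An alternative that avoids AppleScript, assuming your Hazel version offers a shell-script condition and a color label action, would be to keep only the `grep -q Font "$1"` test as the condition in two separate rules and let Hazel apply the label itself.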
Nestorito
Nestorito
 
Posts: 6
Joined: Thu Sep 26, 2013 10:31 am

Hi Mr_Noodle and hi all!

I started using Hazel recently, and I am also interested in the answer to the question in Nestorito's previous post (Thu Sep 26, 2013 11:07 am).
Does anyone have an idea?

Thanks in advance.
Wonderful Forum !!!
Loyd
 
Posts: 9
Joined: Fri Nov 15, 2013 5:42 pm

Is nobody able to answer my question?
Loyd
 
Posts: 9
Joined: Fri Nov 15, 2013 5:42 pm

Bump !
Loyd
 
Posts: 9
Joined: Fri Nov 15, 2013 5:42 pm
