OCR via ocrmypdf

Posted: Sat Mar 21, 2020 9:57 am
by frid
Hi,

I am trying to use ocrmypdf to OCR my PDFs with Hazel's "Run shell script (embedded script)" action. Whenever I run the Hazel rule, I get the error message "shell script failed" and the rule stops. When I check the file, the OCR has been applied anyway, but none of the actions after the shell script are executed.

I am using this command in the shell script: "ocrmypdf $1 $1". It works fine when I run it in a bash shell (although there I use the actual name of the file I want to OCR rather than $1).

I would be very grateful if anybody could help me. If ocrmypdf is not the solution, is there any way to do OCR on PDF files without buying an expensive piece of software?

Thanks
Frank

Re: OCR via ocrmypdf

Posted: Mon Mar 23, 2020 11:14 am
by Mr_Noodle
Try using double quotes around each $1 ("$1"). If that doesn't fix it, then turn on debug mode as described here: https://www.noodlesoft.com/kb/hazel-debug-mode/

After that, check the logs and look for any errors relating to the script.
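
To illustrate why the quotes can matter, here is a minimal sketch assuming the file name contains spaces. The helper count_args and the file name are made up for the demonstration; only the final commented line corresponds to the command from the original post.

```shell
#!/bin/bash
# Illustration only: count_args is a hypothetical helper that reports
# how many separate arguments a command would receive.
count_args() {
    echo "$#"
}

file="My Scan 2020.pdf"   # hypothetical file name containing spaces

# Unquoted: bash splits the name at each space.
count_args $file      # prints 3

# Quoted: the name is passed as a single argument.
count_args "$file"    # prints 1

# In the Hazel embedded script, the quoted form would therefore be:
#   ocrmypdf "$1" "$1"
```

If the file names never contain spaces, quoting will not change anything, and the debug log is the next place to look.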

Re: OCR via ocrmypdf

Posted: Fri Apr 03, 2020 11:57 pm
by gvantass
frid wrote:
Hi,

I am using this command in the shell script: "ocrmypdf $1 $1". It works fine when I run it in a bash shell (although there I use the actual name of the file I want to OCR rather than $1).

I would be very grateful if anybody could help me. If ocrmypdf is not the solution, is there any way to do OCR on PDF files without buying an expensive piece of software?


Hi, Frank. I've been playing with OCRmyPDF for about the last hour, having arrived at it by way of Zotero > Zotero-OCR > Tesseract. You might be better served by the "watcher.py" add-on for OCRmyPDF. Here is the direct link to that part of the documentation: https://ocrmypdf.readthedocs.io/en/late ... ed-folders

I'm thinking that I could use Hazel to sort or move the PDF files to the watched folder, then let watcher.py process them and deliver them to the desired output folder.
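
A rough sketch of the Hazel side of that idea, in case it is done from an embedded shell script. The folder path and the helper name queue_for_ocr are assumptions for illustration; the folder would need to match whatever directory watcher.py is set up to monitor.

```shell
#!/bin/bash
# Hypothetical watched folder; point this at the directory watcher.py monitors.
WATCH_DIR="${WATCH_DIR:-$HOME/ocr-inbox}"

# queue_for_ocr is a hypothetical helper: it moves one PDF into the watched folder.
queue_for_ocr() {
    mkdir -p "$WATCH_DIR" && mv -- "$1" "$WATCH_DIR/"
}

# Hazel passes the matched file as $1; do nothing when run without an argument.
if [ -n "${1:-}" ]; then
    queue_for_ocr "$1"
fi
```

Hazel's built-in Move action should accomplish the same thing without any script, so this only makes sense if other shell steps are needed anyway.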