Issues with Hazel and PDFPen

I am currently evaluating ways to automate OCR for PDFs and I'm running into some quirks that I could use some help with. Currently I'm basing my workflow off the following link: http://katiefloyd.com/blog/automaticall ... and-pdfpen

My rules are quite simple:

Extension is PDF
Size is greater than 10KB (to skip over corrupt PDFs)
Tags does not contain "ocr"

If all match, we do the following:

Run AppleScript (the one from the link above)
Add tags "ocr"
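As a rough illustration of how those three conditions combine, here is a minimal Python sketch. It is not how Hazel evaluates rules internally. The function name `matches_rule` and the 1 KB = 1000 bytes conversion are assumptions made for the example, and the tags are passed in as a plain list rather than read from the file.

```python
import os

# Assumption: 1 KB is treated as 1000 bytes; Hazel may count differently.
MIN_SIZE_BYTES = 10 * 1000


def matches_rule(filename, size_bytes, tags):
    """Return True when a file meets all three conditions of the rule."""
    # Condition 1: extension is PDF (compared case-insensitively here).
    is_pdf = os.path.splitext(filename)[1].lower() == ".pdf"
    # Condition 2: size is greater than 10 KB, to skip corrupt PDFs.
    big_enough = size_bytes > MIN_SIZE_BYTES
    # Condition 3: tags do not contain "ocr", so finished files are skipped.
    not_tagged = "ocr" not in tags
    return is_pdf and big_enough and not_tagged
```

Because the "ocr" tag is only added after the script succeeds, a file whose script run failed would still match on the next pass.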

Currently the workflow is working great and goes to work immediately when I drop about 20 files in. The problem I'm running into is that it skips certain PDFs that it shouldn't, giving me a Hazel error saying it cannot execute the AppleScript. What's strange is that once it finishes going through the 20 or so files I'm testing with, I can rerun the rules and Hazel will pick up the remaining items it skipped over. On my last run I had to run Hazel manually two more times for it to finish the folder. At no time did it start back up on its own and run the rules on the skipped files. I grabbed the log from Hazel, and below is what it says:

2016-11-11 11:53:26.905 hazelworker[82849] OSAScript error: {
NSLocalizedDescription = "PDFpen got an error: Connection is invalid.";
NSLocalizedFailureReason = "Connection is invalid.";
OSAScriptErrorAppAddressKey = "<NSAppleEventDescriptor: [0x0,105c05b \"PDFpen\"]>";
OSAScriptErrorAppNameKey = PDFpen;
OSAScriptErrorBriefMessageKey = "Connection is invalid.";
OSAScriptErrorMessageKey = "PDFpen got an error: Connection is invalid.";
OSAScriptErrorNumberKey = "-609";
OSAScriptErrorRangeKey = "NSRange: {0, 0}";

I'm not really sure what to make of it, and it's likely not a Hazel problem but an issue with PDFPen or the AppleScript. The only thing that comes to mind is that PDFPen is working too fast or data is coming in too fast. Does anyone have any ideas about where I should look?
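If the failure really is a timing problem, one generic workaround is to retry the failing step after a short pause. The sketch below shows that pattern in Python, not the AppleScript from the linked article. The names `run_with_retry`, `attempts`, and `delay_seconds` are made up for the example, and whether a retry actually clears error -609 is untested.

```python
import time


def run_with_retry(action, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call action(); on failure, wait and try again, up to `attempts` times.

    `action` is any zero-argument callable, for example a hypothetical
    wrapper that asks PDFPen to OCR one file. The wait grows with each
    failed attempt (delay, 2 * delay, ...).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # e.g. "Connection is invalid."
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds * attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

On macOS, `action` could wrap a call to the `osascript` command-line tool. It is left as a plain callable here so the retry logic stays independent of how the script is launched.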

After doing some more research, since the script works when I rerun the rules, I'm going to shift my focus away from why the script errors out, and learn more about Hazel.

If Hazel skips a file due to an AppleScript error, but the file still meets the criteria, what would need to happen for Hazel to run again and pick up any missing files? As I said earlier, if I manually rerun the rules, it finishes up the last few files and we're good to go. Since Hazel seems to run when there are files that meet the criteria, what exactly do you do when it doesn't run? Does it need 5 or 10 minutes to realize it needs to run again?

It will recheck them after an interval. That said, I think the original error is the result of an issue with PDFpen. You might want to get in touch with them.

I know it's been roughly a year since I posted this, but this project got shelved and I recently dusted it off for the client. Everything is working, but Hazel seems to take a long time before it circles back around to grab missing files. If a new file gets added, it kicks into gear, but I'm unsure how long it takes to kick back in otherwise. I added a bunch of test files yesterday; it ran through the folder but didn't get all of them. I left it alone and checked 24 hours later, and there was still no movement. If I manually run the rules over and over, it eventually grabs all of them and completes the tasks.

With that said, is there a way to have Hazel check on the folder more often, regardless of whether a file has been added to the watched folder?

Can you email in the logs showing this?

Sorry for the delay. I restarted the process with a new batch of PDFs and the same thing is happening. It runs through the folder, processing a good chunk of the PDFs but skipping some, then never circles back around to finish the others unless there is movement or a change in that folder. If I run Hazel manually over and over, it eventually finishes the documents.

Let me know what you find out or suggest.

https://www.dropbox.com/s/rfvfw0ypbyiem ... g.rtf?dl=0

I see a couple of issues. One is that many of the PDFs are busy. Secondly, there's an error with the AppleScript which seems to be coming from the PDFPen side. You might want to consult their support site to see if it's the type of thing they've seen before.

Mr_Noodle wrote: I see a couple of issues. One is that many of the PDFs are busy. Secondly, there's an error with the AppleScript which seems to be coming from the PDFPen side. You might want to consult their support site to see if it's the type of thing they've seen before.

I understand all of that, but the script works just fine as long as Hazel periodically checks in and runs again. Is there a way to have Hazel check the folder periodically instead of only when there are changes to the folder? That's going to be much easier for me to implement than tweaking a script I didn't write.

I'm not sure how Hazel checking periodically and doing nothing will affect PDFPen in this case. Also, are you running the updated script (there's a link at the top of the article you linked)?

Mr_Noodle wrote: I'm not sure how Hazel checking periodically and doing nothing will affect PDFPen in this case. Also, are you running the updated script (there's a link at the top of the article you linked)?

I am running the latest version of the script, as I rebuilt this from the ground up before posting here again.

I don't see why Hazel periodically checking wouldn't help the issue. If I go in and manually run Hazel over and over, it eventually executes on all PDFs. I'm not sure why it skips over some PDFs, but all I know is it isn't a permanent skip and will usually execute on the next run.

I know it's a weird issue, but I'm trying to take the path of least resistance, and having Hazel run over and over until it completes all the PDFs seems to get the job done. I just need to automate it rather than doing it manually.
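To make the "run it over and over until everything is done" idea concrete, here is a minimal Python sketch of that loop. It only illustrates the approach and is not a Hazel feature or setting. The names `sweep_until_done`, `process`, and `max_passes` are made up for the example.

```python
def sweep_until_done(files, process, max_passes=5):
    """Re-run `process` on the files that failed until none remain.

    `process` is any callable that handles one file and raises an
    exception on failure. Returns a tuple of (files still failing,
    number of passes used). `max_passes` caps the loop so a file that
    fails every time cannot keep it running forever.
    """
    remaining = list(files)
    passes = 0
    while remaining and passes < max_passes:
        passes += 1
        still_failing = []
        for name in remaining:
            try:
                process(name)
            except Exception:
                # Keep the file for the next pass instead of giving up.
                still_failing.append(name)
        remaining = still_failing
    return remaining, passes
```

This matches what I see when rerunning the rules by hand: each pass only touches the files the previous pass skipped, and it usually takes two or three passes to clear the folder.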

The point is that Hazel "skips" files because PDFPen is failing. The failure is probably somewhat random, so while running Hazel more frequently may end up successfully processing those files, it does not address the base problem of PDFPen not working. Fixing that issue will save you a lot of time and effort spent running things manually to make up for it.

I will track down the original script writer and see what I can find out. In the event that I hit a dead end with that, what are my other options? I've seen you mention a hidden feature in v4 that checks in on a folder periodically, but that it shouldn't be used without discussing it with you first. Would that feature be an option for me?

I would check with the developers of PDFPen, not the script writer, in this case.

You can enable the periodic checking but note that that will incur more processing. Again, I think it's better to fix the root cause.

Sounds good, I have already started my conversation with PDFPen, I'll report back with my findings.

bayon wrote: Sounds good, I have already started my conversation with PDFPen, I'll report back with my findings.

Did you ever figure out the problem? I'm trying to accomplish the same thing but using another OCR program. I would switch to PDFPen if you have it solved. Thanks.