Fuzzy / less strict matching on poorly-formed PDFs

Hi everyone,
I'm trying to use Hazel to batch process ~6000 PDFs into folders. These PDFs were created from email newsletters (using a clunky batch processing tool in Windows).
The main things I need to do are:
- Sort emails into a subfolder based on topic, e.g. 'Engineering and Maintenance', 'ABC Project Update'
- Rename the PDF using a date match
The problem is that the text in the PDFs is badly formatted, probably because of the creation tool, and Hazel is having trouble matching patterns. For example, if I use the condition:
Contents contain "This message has been sent to everyone in Engineering and Maintenance"
It doesn't work because the characters in the PDF are actually:
This m essage has been sent to everyone in Engineering and M aintenance.
The same problem is messing up date matches: for example, Hazel thinks an email was sent in January 2020 because the first date in the PDF contents reads "January 20 23". I haven't tested many documents, but I suspect the formatting patterns are unpredictable.
Is there a way to do a fuzzier match in Hazel, something like "match this string, even if there are spaces between some of the characters"?
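To make the kind of matching I mean concrete, here is a rough Python sketch of the idea outside Hazel. I don't know whether Hazel can express this natively. The names `loose_pattern` and `find_month_year` are made up for illustration, and the date helper assumes the month name is followed directly by a four-digit year, which may not hold for every newsletter.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def loose_pattern(phrase):
    """Compile a case-insensitive regex that matches `phrase` even when
    stray whitespace has been inserted between any of its characters."""
    # Drop the phrase's own whitespace; the pattern instead allows
    # optional whitespace between every remaining character.
    chars = [re.escape(c) for c in phrase if not c.isspace()]
    return re.compile(r"\s*".join(chars), re.IGNORECASE)

def find_month_year(text):
    """Return (month, year) for the first month in MONTHS that is followed
    by four digits, tolerating stray spaces; None if nothing matches.
    Assumption: no day number sits between the month and the year."""
    for month in MONTHS:
        pattern = loose_pattern(month).pattern + r"\s*((?:\d\s*){4})"
        m = re.search(pattern, text, re.IGNORECASE)
        if m:
            # Remove the stray spaces from the captured digits.
            return month, int(re.sub(r"\s+", "", m.group(1)))
    return None

text = "This m essage has been sent to everyone in Engineering and M aintenance."
topic = loose_pattern("sent to everyone in Engineering and Maintenance")
print(topic.search(text) is not None)
print(find_month_year("January 20 23"))
```

The obvious catch is that ignoring all whitespace can produce false positives, for example a short month name like "May" matching inside other words, so this would probably need tightening before being trusted on all ~6000 files.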
Thanks in advance for your help.