Noodlesoft Forums

Posted: **Sun Dec 27, 2020 9:42 pm**

Hi all --

I have been scanning a large number of paper financial statements that I'd like Hazel to rename including the statement date or date range. The PDFs have been OCRed but it is an imperfect process, leaving me with documents with dates that don't quite match up cleanly with a simple date pattern. The greatest error culprit by far is that the OCR misses spaces, giving text like:

January 1, 2007 - January 31, 2007
January 1,2007 - January 31, 2007
January 1, 2007 - January 31,2007
January 1, 2007 -January 31, 2007
January 1,2007- January 31, 2007
January 1,2007-January 31,2007

I know that Hazel's space-matching will match any number of spaces. However, it looks like I need the equivalent of the regex ? operator, i.e. "match zero or one spaces". Is there a way to do this in Hazel? If not, can someone recommend a way to handle this parsing situation without painfully enumerating all possible spacing configurations as separate rules?

Thanks in advance!

Posted: **Mon Dec 28, 2020 10:13 am**

No good way to do that now. Does the auto detection work in this case? Might be worth trying.

Posted: **Thu Dec 31, 2020 10:22 pm**

Unfortunately, no. I appreciate how the Hazel UI simplifies pattern matching, but is there a way to actually plug in a real regular expression? I can write one that resolves the issue, but it would need full regex syntax.

Thanks,

Ramon

Posted: **Mon Jan 04, 2021 12:22 pm**

Use the "Run shell script" action. You can then use regexes in the language of your choice.
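For illustration, here is a minimal sketch of what such a script's matching logic might look like in Python, using the sample lines from the first post. The names `find_date_range` and `normalize` are made up for this example, and it assumes English month names spelled out in full. How the OCRed text gets from the PDF into the script is left out, since that depends on your setup.

```python
import re

# Assumption: months are spelled out in full, as in the samples above.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# One date, with every space made optional via \s* (zero or more whitespace).
DATE = rf"(?:{MONTHS})\s*\d{{1,2}}\s*,\s*\d{{4}}"

# Two dates separated by a hyphen, again with optional spaces around it.
DATE_RANGE = re.compile(rf"({DATE})\s*-\s*({DATE})")

# Same shape as DATE, but with capture groups for month, day, and year.
DATE_PARTS = re.compile(rf"({MONTHS})\s*(\d{{1,2}})\s*,\s*(\d{{4}})")


def normalize(date_text):
    """Rewrite one matched date with consistent spacing."""
    month, day, year = DATE_PARTS.fullmatch(date_text).groups()
    return f"{month} {int(day)}, {year}"


def find_date_range(text):
    """Return (start, end) for the first date range in text, or None."""
    match = DATE_RANGE.search(text)
    if match is None:
        return None
    return normalize(match.group(1)), normalize(match.group(2))


if __name__ == "__main__":
    print(find_date_range("January 1,2007-January 31,2007"))
```

Using `\s*` rather than a literal optional space also tolerates tabs or doubled spaces, which OCR output sometimes contains.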

Posted: **Thu Oct 12, 2023 4:42 am**

I think I have a similar problem. My customer number was suddenly changed on my phone provider's bill. A space was simply inserted in a different place. Otherwise, the number has remained the same.
Is there a rule in Hazel that can flexibly work around such spaces?
For me it would also be OK to enter a regex, but that doesn't seem to be supported.

Posted: **Thu Oct 12, 2023 9:00 am**

If you want to use regex, you can use a "Run shell script" action and the language/regex dialect of your choice.
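As a rough sketch of that idea for the customer-number case, the Python below builds a pattern that allows optional whitespace between every character of the number, so it matches no matter where the space ends up. The function names and the number `1234 567890` are made up for the example, and it assumes the number is mostly digits.

```python
import re


def flexible_space_pattern(number):
    """Build a regex matching `number` with whitespace allowed anywhere inside it."""
    # Drop whatever spaces the number was typed with.
    chars = re.sub(r"\s+", "", number)
    # Allow zero or more whitespace characters between each pair of characters.
    body = r"\s*".join(re.escape(c) for c in chars)
    # The lookarounds keep the match from starting or ending inside a longer digit run.
    return re.compile(r"(?<!\d)" + body + r"(?!\d)")


def contains_number(text, number):
    """True if `text` contains `number`, ignoring where the spaces fall."""
    return flexible_space_pattern(number).search(text) is not None


if __name__ == "__main__":
    # Hypothetical customer number and bill text.
    print(contains_number("Customer no. 12 34567890", "1234 567890"))
```

The script can then report a match or no match back to the rule however your setup expects.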

Flexible date matching (optional spaces) for scanned docs?
