Noodlesoft Forums

Posted: **Tue Nov 09, 2021 6:32 pm**

Hi there,
my Fujitsu ScanSnap tries to generate accurate filenames from a PDF's contents. However, this often leads to gibberish filenames like these:

Mainzer_Айев_17.19 -> should be Mainzer_Allee
aNلمηοη_ -> should be Union
2021-11-05_سم__ -> whatever this should be, it tried to OCR a drawing of my 3 year old daughter

I would very much like Hazel to remove these non-Latin characters.
Is there any way to match them?

Thanks a bunch!
Chris

Posted: **Wed Nov 10, 2021 2:47 pm**

There's no good way at the moment. Probably the best option would be a shell script using a scripting language with regular expression support. Not sure whether any of this makes sense to you.
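
For anyone who wants to try the script route, here is a rough sketch in Python. It is not an official Hazel recipe, just one way it might be done. It assumes Hazel hands the matched file's path to the script as its first argument, and the names `is_latin` and `strip_non_latin` are made up for this example. Since Python's built-in `re` module has no script classes such as `\p{Latin}`, the sketch checks each character's Unicode name instead and only uses a regular expression to tidy up leftover underscores.

```python
import re
import sys
import unicodedata
from pathlib import Path

# Punctuation that is allowed to stay in a filename (an arbitrary choice).
ALLOWED_PUNCT = set("_-. ")


def is_latin(ch: str) -> bool:
    """Return True for ASCII letters/digits, allowed punctuation, and Latin-script letters."""
    if ch.isascii():
        return ch.isalnum() or ch in ALLOWED_PUNCT
    # Non-ASCII: keep only letters whose Unicode name starts with "LATIN" (e.g. ü, ß).
    return ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN")


def strip_non_latin(stem: str) -> str:
    """Drop non-Latin characters from a filename stem and clean up underscores."""
    # macOS may store "ü" as "u" + combining mark; compose it first so it survives.
    stem = unicodedata.normalize("NFC", stem)
    kept = "".join(ch for ch in stem if is_latin(ch))
    kept = re.sub(r"_{2,}", "_", kept)  # collapse runs of underscores
    return kept.strip("_ ")            # trim leading/trailing underscores and spaces


def main() -> None:
    if len(sys.argv) != 2:
        return
    path = Path(sys.argv[1])
    if not path.exists():
        return
    new_stem = strip_non_latin(path.stem)
    # Skip the rename if nothing changed, nothing is left, or the target name is taken.
    if new_stem and new_stem != path.stem:
        target = path.with_name(new_stem + path.suffix)
        if not target.exists():
            path.rename(target)


if __name__ == "__main__":
    main()
```

If you save this somewhere (say as `strip_non_latin.py`, a placeholder name), a "Run shell script" action could call it with the file path, assuming a `python3` is installed on your Mac. Note that a name made up entirely of non-Latin characters is left alone rather than renamed to nothing.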

Remove non-latin characters
