
An interesting tool for checking PDF text content

Posted: Wed Nov 06, 2019 6:48 am
by chazzo
I've just discovered a free PDF viewer that I find useful for checking the text content of PDFs when creating Hazel rules to match text. It's named Podofyllin and it was created as a PDF debugging tool by Howard Oakley of the Eclectic Light Company.

When trying to track down that elusive "fourth date from the end" I often copy PDF content and paste it into a text editor. The alternative of using Hazel's Preview feature and clicking the "Rule matches" icon is welcome but can be a bit clunky, not least because the resulting pop-up window is rather small.
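
For anyone taking the paste-into-a-text-editor route, the "fourth date from the end" check can also be scripted. This is only a rough sketch for inspecting pasted text outside Hazel, not how Hazel itself matches dates. The function name `nth_date_from_end`, the sample text, and the assumption that dates look like 06/11/2019 are all made up for illustration.

```python
import re

# Assumption: dates are written like 06/11/2019. Adjust the pattern
# if your PDFs use another layout (e.g. "6 Nov 2019").
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def nth_date_from_end(text, n):
    """Return the n-th date counting back from the end of text, or None."""
    dates = DATE_PATTERN.findall(text)
    if n < 1 or n > len(dates):
        return None
    return dates[-n]


# Hypothetical text, standing in for content copied out of a PDF.
sample = (
    "Statement date 01/11/2019\n"
    "Opened 03/02/2015  Period 01/10/2019 to 31/10/2019\n"
    "Payment due 15/11/2019\n"
)

print(nth_date_from_end(sample, 4))  # prints 03/02/2015
```

Counting from the end tends to be steadier than counting from the start when a header carries a variable number of dates, which is presumably why the rule ends up phrased that way.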

Opening a PDF in Podofyllin yields an instant preview of the text content alongside the rendered view. There's also a "View Source" option to show either the raw PostScript-style code or a "flattened" version that makes a bit more sense to non-experts like me.

I discovered this when trying to find out why PDF statements downloaded from a well-known UK bank are not searchable by Spotlight. On my Mac the text content is visible in Acrobat, but shows up as blank in any tool based on PDFKit, which includes Spotlight, DEVONthink and of course Hazel. Podofyllin doesn't fix the problem but it may be handy for anyone facing PDFs that are badly made or have an unusual structure.

Re: An interesting tool for checking PDF text content

Posted: Wed Nov 06, 2019 10:25 am
by Mr_Noodle
Thanks for the tip. BTW, Hazel's preview window is resizable, unless you are on Catalina, which broke it.

Re: An interesting tool for checking PDF text content

Posted: Wed Nov 06, 2019 11:46 am
by chazzo
Mr_Noodle wrote: Thanks for the tip. BTW, Hazel's preview window is resizable, unless you are on Catalina, which broke it.

Well, golly gee: not only is it resizable, but once detached it stays open while you fine-tune your rules. I'm sorry for not reading the fine manual.

That pretty much obviates the need for other tools, except possibly if you want to view invisibles (I was reduced to counting spaces the other day -- not a very robust solution, I fear).
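
As an alternative to counting spaces by eye, pasted text can be run through a small filter that makes invisibles visible. A minimal sketch, assuming the text has already been copied out of the PDF. The helper name `show_invisibles` and the marker characters are arbitrary choices, not anything Hazel or Podofyllin provides.

```python
# Hypothetical mapping from invisible characters to visible markers.
MARKERS = {
    " ": "·",        # ordinary space
    "\u00a0": "⍽",   # non-breaking space, easy to mistake for a space
    "\t": "→",       # tab
    "\r": "␍",       # carriage return
    "\n": "¶\n",     # mark the line end but keep the line break
}


def show_invisibles(text):
    """Return text with whitespace characters replaced by visible markers."""
    return "".join(MARKERS.get(ch, ch) for ch in text)


print(show_invisibles("Total:  42\u00a0GBP\tpaid\n"))
# prints Total:··42⍽GBP→paid¶
```

Two ordinary spaces then show up as two dots, and a non-breaking space stands out from the rest, which is the sort of difference that can quietly stop a text match from working.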