Occurrence From The End ISSUE

Posted: Wed Sep 07, 2016 8:08 am
by benhazel
Hi All,

Hopefully you can help! I have set up a custom text token to read the final invoice total [from an OCR'd PDF] using the following contents match:

£[number].[number] 1st Occurrence from End.

However, most of the time it picks the second from last (aaagh!), which is the tax due.

The amount is the only figure in bold if that helps!

I then use this token, along with others, to rename the document [INVOICE-(date)-(job number)-(total invoice).pdf] for other purposes.

Re: Occurrence From The End ISSUE

Posted: Wed Sep 07, 2016 12:11 pm
by Mr_Noodle
Use the preview function, where you can also see the text as Hazel sees it. Chances are, the text is not in the order you think it is.
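To illustrate why text order matters, here is a minimal Python sketch. It is not how Hazel implements its match; it only mimics "£[number].[number], 1st occurrence from end" with a regular expression. The sample text, the `AMOUNT` pattern, and the `nth_from_end` helper are all assumptions made up for this example.

```python
import re

# Hypothetical text layer: the OCR emitted the tax line AFTER the total,
# even though the total is printed last on the visible page.
extracted_text = (
    "Subtotal £100.00\n"
    "Total £120.00\n"
    "VAT £20.00\n"
)

# Rough stand-in for "£[number].[number]".
# Assumes amounts have no thousands separators (e.g. not £1,200.00).
AMOUNT = re.compile(r"£\d+\.\d+")


def nth_from_end(text, n=1):
    """Return the n-th amount counting back from the end of the text, or None."""
    matches = AMOUNT.findall(text)
    return matches[-n] if len(matches) >= n else None


# Counts back through the extracted text, not the visual page layout.
print(nth_from_end(extracted_text))
```

With this sample, the first occurrence from the end is the tax figure rather than the total, because the match follows the order of the extracted text.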

Re: Occurrence From The End ISSUE

Posted: Wed Sep 07, 2016 12:19 pm
by benhazel
Thanks for your quick response. How do you view the text of the document with preview? When I use the preview function, it only shows me that the rule matches, without showing any of the text.

Re: Occurrence From The End ISSUE

Posted: Wed Sep 07, 2016 12:36 pm
by benhazel
Is there any way for the contents match to look only at bold text? I think this would solve the issue I have.

Re: Occurrence From The End ISSUE

Posted: Thu Sep 08, 2016 10:27 am
by Mr_Noodle
When a PDF is OCRed, the OCR process doesn't care about the text formatting. Even if it is a PDF generated with text from the get-go, Hazel doesn't see the formatting instructions. You'll have to rely on the raw text.
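One way to rely on the raw text alone is to anchor on a nearby label instead of on position or formatting. The Python sketch below is only an illustration of that idea, not a Hazel feature: the "Total" label, the sample text, and the `find_total` helper are hypothetical, and a real invoice may word or order things differently.

```python
import re

# Hypothetical raw text with no formatting information at all.
raw_text = (
    "Subtotal £100.00\n"
    "Total £120.00\n"
    "VAT £20.00\n"
)

# Anchor on a line that starts with the assumed label "Total".
# The ^ with re.MULTILINE keeps "Subtotal" from matching.
TOTAL_LINE = re.compile(r"^Total\s*(£\d+\.\d+)", re.MULTILINE)


def find_total(text):
    """Return the amount that follows the 'Total' label, or None if absent."""
    match = TOTAL_LINE.search(text)
    return match.group(1) if match else None


print(find_total(raw_text))
```

Because the label travels with the amount in the text layer, this kind of match does not depend on where the OCR places the line relative to the tax figure.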