Contains Match Occurrence Sequence

I am using a custom match to pull my credit card balance from an OCR'd PDF into the file name, using the format '£(number)' and matching on the 4th occurrence (the previous balance, payments received and new activity amounts come first on the page, top to bottom).

However, Hazel appears to be updating the file name with the 2nd value on the page, not the 4th. I have manually searched the PDF for '£': there are no matches higher up on the 1st page, but there are other matches further down the page and on subsequent pages.

Does anyone know in what order pages are OCR'd? I had assumed top to bottom and left to right, but I may be wrong on that. The credit card statement has 3 columns with text in multiple sizes, so defining an order for the text may be a much more complex subject than I have assumed.

(btw, I have fixed this problem by including the text 'new balance' in the custom match token and then removing that text when adding the value to the file name - nevertheless I am still interested in understanding what I am doing wrong).

Unfortunately, the ordering of things in PDF files can be hard to predict. With OCR, it's up to the OCR program to write out the text in whatever order it chooses, and sometimes that order doesn't match what you see visually. This is especially true when there are multiple columns or sections instead of a single stream of text. If you are really curious, reply and I can give more detailed instructions (requiring use of Terminal) for dumping the raw text of the file.
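
To illustrate why the occurrence number can be off, here is a rough Python sketch. It is not how Hazel does its matching internally; the sample text, the regular expression and the function names (list_amounts, amount_after_label) are all made up for illustration. It assumes you already have the statement's text as a plain string in the order the OCR program wrote it, and shows that the "Nth occurrence" follows that stored order, while anchoring on a label such as 'new balance' does not depend on it.

```python
import re

# Hypothetical text in the order an OCR program might have stored it.
# Visually "New balance" is the 4th amount on the page, but here it is 2nd.
sample = (
    "Previous balance £500.00\n"
    "New balance £620.10\n"
    "Payments received £100.00\n"
    "New activity £220.10\n"
)

# Assumed amount format: '£' followed by digits, optional commas, two decimals.
AMOUNT = re.compile(r"£\s?(\d[\d,]*\.\d{2})")


def list_amounts(text):
    """Return (occurrence number, amount) pairs in stored text order."""
    return [(i, m.group(1)) for i, m in enumerate(AMOUNT.finditer(text), start=1)]


def amount_after_label(text, label="new balance"):
    """Return the first amount that follows the label, ignoring case."""
    pattern = re.compile(
        re.escape(label) + r"\D*?£\s?(\d[\d,]*\.\d{2})", re.IGNORECASE
    )
    m = pattern.search(text)
    return m.group(1) if m else None


if __name__ == "__main__":
    for number, amount in list_amounts(sample):
        print(number, amount)
    print("new balance:", amount_after_label(sample))
```

Running something like list_amounts over the real dumped text would show which occurrence the balance actually is in the stored order. Matching on the label, as in the workaround described above, is the more robust choice because it keeps working even if the order changes from one statement to the next.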