Date Matching Receipts

Posted: Sun Apr 02, 2017 9:06 pm
by jmottle
Hi,

I'm trying to use date matching on OCR text from scanned business receipts to rename the PDF file. There seem to be at least several dozen possible date formats, so I have set up a number of rules to find some of them. The problem is that there is no way to know which rule matched, and therefore no way to rename the file with the proper token. A lesser problem is that there is no way to tell DD/MM/YY from MM/DD/YY, or similar pairs.

I had even considered manually rewriting the date on the receipt so the format was always consistent, but it seems Abbyy FineReader cannot reliably read even very neat all-caps printing (I used to be a draftsman in a former life).

In an ideal world I could write on the receipt: the date, the expense type and store name and have all of that rewritten into the filename. Any suggestions here?

My last resort, and far from ideal, is to manually save each receipt into named folders based on date and expense type, and use those more easily captured variables to rename.

Cheers,
Jeff

Re: Date Matching Receipts

Posted: Mon Apr 03, 2017 10:18 am
by Mr_Noodle
Instead of using separate date attributes for each case, re-use the same one, specifying a different format each time. After creating the first one, it will be available in the list so you can drag that in in subsequent conditions.

As for DD/MM/YY vs MM/DD/YY, you need to decide on context which one is more likely. Something like 01/09/17 is always going to be ambiguous so you'll need to provide hints as to which one it is. Note that the first one to match is the one that gets used so order your conditions appropriately.

You can also try experimenting with other OCR programs to see if they fare better. PDFpen is common among users here.
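
For readers who want to see the "first one to match is the one that gets used" behavior outside of Hazel, here is a minimal Python sketch. It is not how Hazel works internally. The format list and the names `FORMATS`, `parse_first_match` and `filename_for` are hypothetical, chosen only for illustration, and the month-name format assumes an English locale.

```python
from datetime import datetime

# Hypothetical, ordered list of formats. Earlier entries win, so put the
# format you consider more likely (here DD/MM/YY) ahead of its rival.
FORMATS = ["%Y-%m-%d", "%d/%m/%y", "%m/%d/%y", "%d %b %Y"]


def parse_first_match(text, formats=FORMATS):
    """Return (date, format) for the first format that parses text."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date(), fmt
        except ValueError:
            continue  # this format did not fit, try the next one
    return None, None


def filename_for(text, suffix="receipt.pdf"):
    """Build one consistent filename no matter which format matched."""
    parsed, _ = parse_first_match(text)
    if parsed is None:
        return None
    return f"{parsed.isoformat()}_{suffix}"


print(filename_for("01/09/17"))  # read as DD/MM/YY because it is listed first
print(filename_for("09/13/17"))  # DD/MM/YY fails (month 13), MM/DD/YY is used
```

Every format feeds the same result, which is the equivalent of re-using one date attribute across several conditions.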

Re: Date Matching Receipts

Posted: Mon Apr 03, 2017 12:37 pm
by jmottle
Thanks, that works now. However, after running tests it appears this is not going to work in the end. There is no way to know which date format is going to hit, or which one is correct, without a manual review. I travel all over the world, and every country and vendor uses completely different formats. Some countries use multiple formats. No one is consistent or obvious.
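
One way to narrow down the manual review, sketched here in Python under the assumption that only the two slash formats are in play: try every candidate format and flag a receipt only when the readings disagree. The names `CANDIDATES`, `interpretations` and `needs_review` are hypothetical, and this is not a Hazel feature.

```python
from datetime import datetime

# Assumed pair of rival formats; extend the list for other layouts.
CANDIDATES = ["%d/%m/%y", "%m/%d/%y"]


def interpretations(text, formats=CANDIDATES):
    """Map each format that parses text to the date it produces."""
    found = {}
    for fmt in formats:
        try:
            found[fmt] = datetime.strptime(text, fmt).date()
        except ValueError:
            pass  # this format cannot read the text
    return found


def needs_review(text):
    """True only when the formats yield more than one distinct date."""
    return len(set(interpretations(text).values())) > 1


print(needs_review("01/09/17"))  # 1 Sep or 9 Jan: a human has to decide
print(needs_review("25/12/17"))  # only DD/MM/YY fits, nothing to decide
```

A date such as 05/05/17 parses under both formats but to the same day, so it is not flagged either. Text that no format can read also returns False here and would need separate handling.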

Re: Date Matching Receipts

Posted: Thu Apr 06, 2017 3:50 pm
by printerless
Mr_Noodle wrote: Instead of using separate date attributes for each case, re-use the same one, specifying a different format each time. After creating the first one, it will be available in the list so you can drag that in in subsequent conditions.

Sounds good.
But how do I re-use it?
Can you show a short example of how to set it up?
The problem is using ONE variable in the DO part when any one of the date conditions is detected in the IF part.

Re: Date Matching Receipts

Posted: Fri Apr 07, 2017 10:56 am
by Mr_Noodle
In the second instance, when you edit the pattern, the attribute you created earlier will appear at the bottom. Drag that in to use it again.

Re: Date Matching Receipts

Posted: Mon Apr 10, 2017 2:53 pm
by printerless
Great!
With your hint, and knowing where to look, it's easy.
Works like a charm.

Regards from Germany.
Printerless.

Re: Date Matching Receipts

Posted: Thu Apr 27, 2017 12:20 pm
by ulmer42435
Mr_Noodle wrote: Instead of using separate date attributes for each case, re-use the same one, specifying a different format each time. After creating the first one, it will be available in the list so you can drag that in in subsequent conditions.

As for DD/MM/YY vs MM/DD/YY, you need to decide on context which one is more likely. Something like 01/09/17 is always going to be ambiguous so you'll need to provide hints as to which one it is. Note that the first one to match is the one that gets used so order your conditions appropriately.

You can also try experimenting with other OCR programs to see if they fare better. PDFpen is common among users here.


It is not at all obvious that changing the format for one instance doesn't change it everywhere the token is used... Knowing this tidbit just made what I was trying to do possible! Thank you!

Note that changing the *name* of the token seems to have unintended effects. I tried to change the name (since it now represents something more generic) and it looked like it worked (the name changed everywhere), but when I revisited the rule I then had two tokens, one with each name. When I tried to delete the "new" (renamed) token, Hazel warned me and then deleted the other one.

Re: Date Matching Receipts

Posted: Sun Apr 30, 2017 4:53 pm
by speedy_99
Mr_Noodle wrote: In the second instance, when you edit the pattern, the attribute you created earlier will appear at the bottom. Drag that in to use it again.


Hi,
if I use a token created earlier from the bottom and check it with "Edit Attribute", the line below "Name:" is empty.
Is this correct, or should it contain the same content as the first instance?

https://www.dropbox.com/s/g8xqybaxfbbymhc/1_Instance.PNG?dl=0
https://www.dropbox.com/s/qjp2kn5833l1esq/2_Instance.PNG?dl=0

How can I post an image? [img] is ON

Re: Date Matching Receipts

Posted: Mon May 01, 2017 10:22 am
by Mr_Noodle
The whole point here is to match multiple formats so in the second case, enter the alternate format you want to match. There's no reason to use the same pattern from the earlier one since the first one is already trying to match that.

Re: Date Matching Receipts

Posted: Mon May 01, 2017 12:26 pm
by speedy_99
Mr_Noodle wrote: The whole point here is to match multiple formats so in the second case, enter the alternate format you want to match. There's no reason to use the same pattern from the earlier one since the first one is already trying to match that.


Ok, I understand, the same token is not unique in the same rule.

So I have to write the same pattern again, even though I want to use the same format? That would be a lot of work.

Is it correct that I can only see 3 tokens at the bottom although there are more?

Re: Date Matching Receipts

Posted: Tue May 02, 2017 10:54 am
by Mr_Noodle
There's a bug about only seeing three attributes at the bottom.

As for the same pattern, can you describe why you want to use the same pattern again?

Re: Date Matching Receipts

Posted: Tue May 02, 2017 12:17 pm
by speedy_99
Mr_Noodle wrote: As for the same pattern, can you describe why you want to use the same pattern again?


There are a lot of docs in my office with medical content.
Certain complex keyword combinations are token1, other combinations are token2 ...

In the next step there are combinations of these tokens.

Condition "any"
token1 with token2
token1 with token3
token2 with token3 etc.


It's similar to the date combinations of jmottle (at the top of this thread).
You could create a token YYYY, a token MM and a token DD, and combine them in many variations:
YYYY-MM-DD, YYYY-DD-MM ...

A complex token would have to be formatted only once.
Reusing a token would save a lot of time.
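
To make the idea concrete, here is a rough Python sketch of "define each token once, combine it in many layouts". It uses regular expressions rather than Hazel attributes, and `TOKENS`, `LAYOUTS`, `compile_layout` and `match_any` are hypothetical names for illustration only.

```python
import re

# Each component token is defined exactly once.
TOKENS = {
    "YYYY": r"(?P<year>\d{4})",
    "MM": r"(?P<month>0[1-9]|1[0-2])",
    "DD": r"(?P<day>0[1-9]|[12]\d|3[01])",
}

# Layouts only name the tokens; each token may appear once per layout.
LAYOUTS = ["YYYY-MM-DD", "YYYY-DD-MM"]


def compile_layout(layout):
    """Turn a layout such as 'YYYY-MM-DD' into a compiled regex."""
    parts = re.split(r"(YYYY|MM|DD)", layout)
    pattern = "".join(
        TOKENS[part] if part in TOKENS else re.escape(part) for part in parts
    )
    return re.compile(pattern)


COMPILED = [(layout, compile_layout(layout)) for layout in LAYOUTS]


def match_any(text):
    """Return (layout, fields) for the first layout that fits the text."""
    for layout, regex in COMPILED:
        match = regex.fullmatch(text)
        if match:
            return layout, match.groupdict()
    return None, None


print(match_any("2017-04-27"))
print(match_any("2017-27-04"))
```

Changing the definition of `MM` in one place changes it in every layout, which is the time saving described above. As with the conditions in a rule, the order of `LAYOUTS` decides which reading wins for a value like 2017-04-05.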

Re: Date Matching Receipts

Posted: Tue May 02, 2017 1:58 pm
by Mr_Noodle
Note that a new feature in 4.1 (currently in beta) may obviate the need for this, at least for the cases outlined here. No release date yet but soon-ish, hopefully.

Re: Date Matching Receipts

Posted: Wed May 03, 2017 7:40 am
by speedy_99
Mr_Noodle wrote: Note that a new feature in 4.1 (currently in beta) may obviate the need for this, at least for the cases outlined here. No release date yet but soon-ish, hopefully.



thx, great.

Your software and your support in this forum were the main reason I bought a Mac for my office.
The rest of my software is Windows, because there is no Mac software for my profession.