Rename with 'Contents Match' often gets year wrong

Posted: Sun Aug 14, 2016 5:51 pm
by jbrwilkinson
We have a number of rules to match just-scanned documents which have been OCR'd by Abbyy. For our receipts, there are many different variations of date format, e.g. DD/MM/YY, D/MM/YYYY, etc.

We use a Contents Match rule with a Custom Date that matches one of the patterns, and then put a "Rename ..with the pattern" action on it to prefix the filename with the date found inside the file. Often, the year in the prefixed date ends up as '2020' even though it should be 2016. It's inconsistent, but I have some PDF files that reliably reproduce this.

Seems like a bug - anyone else seen this?

Re: Rename with 'Contents Match' often gets year wrong

Posted: Mon Aug 15, 2016 11:14 am
by Mr_Noodle
I suspect you are matching a two-digit year when the year in the document is four digits, thus only grabbing the first two digits ("20"). Use the preview function; you should see exactly what it is capturing there.
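For anyone who wants to see the effect outside Hazel, here is a minimal Python sketch of it. The regular expression is only a hypothetical stand-in for a D/MM/YY custom date pattern, and the function name and sample receipt text are made up for illustration; Hazel's own matching may work differently.

```python
import re
from datetime import datetime

# Hypothetical stand-in for a D/MM/YY custom date pattern (not Hazel's implementation).
TWO_DIGIT_YEAR = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2})")


def parse_two_digit_year(text):
    """Return the first date found by the two-digit-year pattern, or None."""
    m = TWO_DIGIT_YEAR.search(text)
    if m is None:
        return None
    day, month, year = m.groups()
    # %y reads a two-digit year, so "20" is interpreted as 2020.
    return datetime.strptime(f"{day}/{month}/{year}", "%d/%m/%y").date()


receipt = "Receipt date: 14/08/2016 Total: 12.50"
print(parse_two_digit_year(receipt))  # 2020-08-14
```

The pattern stops after the "20" of "2016", so the captured year is read as 2020, which matches the symptom described in the first post.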

Re: Rename with 'Contents Match' often gets year wrong

Posted: Mon Aug 29, 2016 2:42 am
by Rafterkai
Hello,

I'm having the same issue.
I'm trying to scan for multiple date formats and rename to the correct date.
Is there a workaround for when the year in the custom date is set to search for two digits?
I would like it to ignore dates with four-digit years instead of returning the first two of the four digits.

Thanks

Re: Rename with 'Contents Match' often gets year wrong

Posted: Mon Aug 29, 2016 1:59 pm
by Mr_Noodle
In the pattern, click on the year token. You should see an option for a four digit year.

Re: Rename with 'Contents Match' often gets year wrong

Posted: Mon Aug 29, 2016 3:58 pm
by Rafterkai
Mr_Noodle wrote: In the pattern, click on the year token. You should see an option for a four digit year.


The issue is that I am searching content that may contain either a four-digit or a two-digit year and renaming accordingly, so I'm using two separate content search fields, one for each year token in the attribute.

Anyway, I just found a workaround.
In the attribute field with the two-digit year token, add a space after the token. This way it will ignore a four-digit year found in the content!
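A rough Python sketch of why the trailing space helps, again using hypothetical regular expressions in place of the Hazel date patterns (the names and sample text are assumptions for illustration only):

```python
import re

# Hypothetical regex equivalents of the two-digit-year pattern, without and with
# a trailing space; Hazel's internals may differ.
TWO_DIGIT = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2})")
TWO_DIGIT_THEN_SPACE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2}) ")


def year_captured(pattern, text):
    """Return the year digits the pattern captures, or None when it does not match."""
    m = pattern.search(text)
    return m.group(3) if m else None


four_digit_text = "Date: 14/08/2016 Total: 12.50"
two_digit_text = "Date: 14/08/16 Total: 12.50"

print(year_captured(TWO_DIGIT, four_digit_text))             # 20
print(year_captured(TWO_DIGIT_THEN_SPACE, four_digit_text))  # None
print(year_captured(TWO_DIGIT_THEN_SPACE, two_digit_text))   # 16
```

One caveat with this approach: a date that sits at the very end of a line or of the text has no space after it, so the two-digit pattern with the space would miss it. In plain regex terms a "not followed by a digit" check such as `(?!\d)` avoids that, but nothing in this thread says whether Hazel offers an equivalent.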

Cheers

Re: Rename with 'Contents Match' often gets year wrong

Posted: Tue Aug 30, 2016 12:13 pm
by Mr_Noodle
You can have two different patterns for the same date attribute. Have the first one search for the 4 digit one and if that fails, match the 2 digit one.
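The "four-digit first, then two-digit" ordering can be sketched in Python like this. It is only an illustration of the fallback idea under assumed D/M/Y receipts; the pattern list and `find_date` are hypothetical names, not anything Hazel exposes.

```python
import re
from datetime import datetime

# Hypothetical patterns, listed most specific first (not Hazel's implementation).
PATTERNS = [
    (re.compile(r"\d{1,2}/\d{1,2}/\d{4}"), "%d/%m/%Y"),  # four-digit year
    (re.compile(r"\d{1,2}/\d{1,2}/\d{2}"), "%d/%m/%y"),  # two-digit year fallback
]


def find_date(text):
    """Try each pattern in order and return the first date that matches, or None."""
    for regex, fmt in PATTERNS:
        m = regex.search(text)
        if m:
            return datetime.strptime(m.group(0), fmt).date()
    return None


print(find_date("Date: 14/08/2016 Total"))  # 2016-08-14
print(find_date("Date: 14/08/16 Total"))    # 2016-08-14
```

Because the four-digit pattern is tried first, the two-digit pattern never gets the chance to grab "20" from "2016".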

Re: Rename with 'Contents Match' often gets year wrong

Posted: Tue Aug 30, 2016 1:16 pm
by Rafterkai
Mr_Noodle wrote: You can have two different patterns for the same date attribute.


I'm assuming you mean using a second rule with different attribute names. Or is there a way to search multiple patterns using the same attribute name? I tried using the same name but it was automatically renamed.

Using different patterns with different attribute names works great when the content has a two-digit year, but when the content has a four-digit year, both rules match.

For renaming I have used one rule with all the date attribute patterns, so it renames using whichever attribute matched. When more than one attribute matches, it adds every match to the renamed file.
My solution was to add a space after the two-digit date attribute; this way that rule doesn't match four-digit years.

Here is a screenshot of the way I've structured my rules.
Any comment is appreciated.

Note: The second rule has a space following the attribute. Unless I use the space, the first two rules both match. (The preview file has a four-digit year in its date.)

[Image: screenshot of the rules]

Re: Rename with 'Contents Match' often gets year wrong

Posted: Wed Aug 31, 2016 2:17 am
by Rafterkai
Mr_Noodle wrote: You can have two different patterns for the same date attribute.


OK, just got it...

I didn't realise that I could edit the custom date attribute separately in each rule, under the same name, without affecting the original attribute.

Now I've used the same custom date in all rules and changed its pattern in each rule; then, in renaming, I used only that one custom date. (The only problem is that you have to click on each rule separately to see which pattern you have set in the attribute, which is not too convenient when you have many different patterns.)

I still have to add a space after the two-digit attribute so that it does not match four-digit content.

Re: Rename with 'Contents Match' often gets year wrong

Posted: Sun Dec 11, 2016 3:15 pm
by jbrwilkinson
Mr_Noodle wrote: I suspect you are matching a two digit year when the year is four digits, thus only grabbing the first two digits ("20"). Use the preview function as you should see exactly what it is capturing there.


This was exactly the problem - thanks for your help!

Re: Rename with 'Contents Match' often gets year wrong

Posted: Mon Jun 18, 2018 12:25 pm
by mmandell
I was having the same issue and used the auto date option, which made creating new rules much faster.