Extracting an Insurance Claim Number

Extracting an Insurance Claim Number Fri Feb 10, 2023 4:40 am • by mascotca
I would like to name a PDF of an Insurance Explanation of Benefits not only by Date but also by Claim Number.

Can anyone tell me if it's possible to create an Action that extracts the Date and then follows it with a Claim # OCRed from the document, in that order? (At the end I add text identifying the Insurer, followed by the filename extension.)

I'm looking for a result that looks like this:

2023-02-10 480596316 CIGNA EOB.pdf

I don't know whether an existing action can do this or whether it would take scripting, and I have no idea how to write scripts myself.
mascotca
 
Posts: 5
Joined: Tue Feb 07, 2023 5:35 am

Re: Extracting an Insurance Claim Number Fri Feb 10, 2023 10:36 am • by Mr_Noodle
Look up match patterns in the help. You'll need to create custom attributes to match the parts of the file contents you need.
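
To illustrate what such a pattern has to do, here is a minimal Python sketch. It is only an analogy and not Hazel's own syntax: Hazel's match patterns and custom attributes are set up in its rule editor, while this uses a plain regular expression. The names CLAIM_PATTERN and build_filename, and the sample text, are made up for the example. It assumes the claim number is a standalone 9-digit number.

```python
import re
from datetime import date

# Hypothetical stand-in for a custom attribute: exactly nine digits
# that are not part of a longer run of digits.
CLAIM_PATTERN = re.compile(r"(?<!\d)\d{9}(?!\d)")


def build_filename(doc_date, text, insurer="CIGNA"):
    """Build 'YYYY-MM-DD <claim> <insurer> EOB.pdf' from the first match in text."""
    match = CLAIM_PATTERN.search(text)
    if match is None:
        raise ValueError("no 9-digit number found in the text")
    return f"{doc_date.isoformat()} {match.group()} {insurer} EOB.pdf"


# Made-up sample text standing in for the PDF's contents.
sample_text = "Claim # 480596316 processed for member"
print(build_filename(date(2023, 2, 10), sample_text))
# prints: 2023-02-10 480596316 CIGNA EOB.pdf
```

The lookarounds keep the pattern from grabbing nine digits out of the middle of a longer number, which matters when other long numbers sit near the one you want.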
Mr_Noodle
Site Admin
 
Posts: 11250
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Re: Extracting an Insurance Claim Number Mon Feb 13, 2023 9:42 pm • by mascotca
Thanks!

I was able to get further in the User Guide and I thought I had figured out how to create this simple custom attribute.

Unfortunately the PDFs in question are bizarrely formatted in a way that Hazel's OCR can't see. (Cigna prints their Claim number in a black font ON A DARK GREY BACKGROUND.) Even though Nitro, Preview and PDF Expert can all find the number with a search, Hazel skips right past it to other number combinations later in the document. (The Claim number is just a plain 9-digit number.)

Can't for the life of me figure out why Cigna or anyone else would format their stuff to be illegible not only to OCR but to the naked eye! But that's hardly Hazel's fault.

I have set up a deep dive into learning Hazel using the Help guide, ScreenCastsOnline's tutorials and the MacSparky Guide for Hazel.

Again, thanks for the pointer. Hopefully after proper training I can spare you any more basic questions.

Re: Extracting an Insurance Claim Number Tue Feb 14, 2023 10:18 am • by Mr_Noodle
Hazel uses the same PDF engine as Preview, so opening in Preview is a good way to double-check things. One thing to try is printing the file but choosing Save as PDF instead, to see if that generates a PDF that Preview/Hazel can read.

Re: Extracting an Insurance Claim Number Tue Feb 14, 2023 10:50 am • by mascotca
I tried your suggestion but the result is the same.

As I mentioned before (I think), all my PDF reader apps (including Preview) can search for and find the actual Claim #. As I type in the digits, they zero in on the right text in the PDF, highlight it in yellow and add it to the list of found search items, expanding the highlight digit by digit as I keep typing.

So there must be some kind of functional OCR in all of those apps when they use this document downloaded from Cigna. I've even used Nitro to re-OCR the document in question and examined the "OCR-layer" before and after (if you're familiar with how that app works). That newly created layer in Nitro never sees the Black on Gray Claim # either, and it also omits or garbles the Black on Blue info.

For the record, the next string of digits that fits the condition is immediately to the right (i.e. the next occurrence), and it happens to be part of a larger "Patient Acct#" that uses Bolded Black on BLUE formatting. That is the number Hazel inserts into my file name.

So, again, I'm not sure what is going on exactly.

If you want or need it perhaps I could send you the document in question?

Re: Extracting an Insurance Claim Number Wed Feb 15, 2023 10:45 am • by Mr_Noodle
If you can send one without personal data in it, that would be great.

Re: Extracting an Insurance Claim Number Thu Feb 16, 2023 8:40 am • by mascotca
Thanks to Mr. Noodle I found a solution to my issue. He pointed out that the cause might not be the formatting of the PDF in question.

When setting which occurrence a Custom Attribute should match, be aware that some PDFs un-intuitively store their text in a different order from how it actually appears on the page!

I finally got the "Claim #" I wanted into my filename. It appears in the document as the FIRST of three strings ("from the beginning") that fit my custom attribute's pattern (a 9-digit number), but I had to set it in my Custom Attribute as the THIRD occurrence!

So, if you run into this problem, try experimenting with a different occurrence number until it works!
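
For anyone who prefers to see the candidates rather than guess, here is a rough Python sketch of the same idea. It is not something Hazel runs, and the sample text and function names are invented for illustration. It assumes you already have the PDF's text as a string in the order the file stores it (for example by selecting all in Preview and pasting), and it lists every standalone 9-digit number with its occurrence number.

```python
import re

# Nine digits that are not part of a longer run of digits.
NINE_DIGITS = re.compile(r"(?<!\d)\d{9}(?!\d)")


def list_occurrences(text):
    """Return (occurrence_number, value) pairs, numbered from 1, in text order."""
    return [(i, m.group()) for i, m in enumerate(NINE_DIGITS.finditer(text), start=1)]


def nth_occurrence(text, n):
    """Return the n-th match (1-based), the way an occurrence setting would pick it."""
    matches = NINE_DIGITS.findall(text)
    if not 1 <= n <= len(matches):
        raise IndexError(f"only {len(matches)} matches, asked for occurrence {n}")
    return matches[n - 1]


# Made-up text: the claim number is first on the page but stored last in the file.
extracted = "Patient Acct# 111222333 Ref 444555666 Claim # 480596316"

for number, value in list_occurrences(extracted):
    print(number, value)

print(nth_occurrence(extracted, 3))
# prints: 480596316
```

Seeing the whole list at once shows which occurrence number lines up with the claim number, instead of trying 1, 2, 3 in turn.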