PDF contents


Moderator: Mr_Noodle

PDF contents Sun Jan 10, 2021 9:08 pm • by themeister
I have used a rule to extract the name, date, and ID# from PDFs for over 6 months (about 75 instances) without issue. Now a particular PDF is 'read' differently by Hazel, so the extracted material is wrong. The PDF looks like all the others when displayed in Apple Preview. Is Hazel 'reading' this file differently, or is the PDF constructed differently?
themeister
 
Posts: 3
Joined: Wed Feb 19, 2020 4:09 am

Re: PDF contents Mon Jan 11, 2021 11:33 am • by Mr_Noodle
When you view the file in Preview, can you search its text the same way you can in the other files?
Mr_Noodle
Site Admin
 
Posts: 11868
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Re: PDF contents Sat Jan 16, 2021 4:23 am • by themeister
While I had never searched for elements manually before, searching for strings did work on the problematic files in Preview.

One extraction element grabs the 2nd occurrence of a date (the date of the exam); however, it mistakenly grabs the first, which is a birthdate.

Another extraction is a name that appears on the line before the string 'MRN'; however, it incorrectly grabs the line above the name.

A third extraction captures the number following that same MRN occurrence; that one does work correctly.

Re: PDF contents Mon Jan 18, 2021 11:24 am • by Mr_Noodle
Use Hazel's preview. There you can view the text as Hazel sees it. The problem with PDFs is that the text may not actually be stored in the same order in which you see it on the page.
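
To illustrate the ordering point, here is a minimal Python sketch. It is not how Hazel works internally; the sample text, the date pattern, and the helper names `nth_date`, `line_before`, and `number_after` are all made up for illustration. It runs the same three extractions against a page's text in visual order and in a hypothetical stored order:

```python
import re

# Assumed date format (MM/DD/YYYY); real documents may use another.
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def nth_date(text, n):
    """Return the n-th (1-based) date-like string in text, or None."""
    matches = DATE.findall(text)
    return matches[n - 1] if len(matches) >= n else None


def line_before(text, marker):
    """Return the non-empty line just before the first line containing marker."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    for i, ln in enumerate(lines):
        if marker in ln:
            return lines[i - 1] if i > 0 else None
    return None


def number_after(text, marker):
    """Return the first run of digits following marker, or None."""
    m = re.search(re.escape(marker) + r"\D*(\d+)", text)
    return m.group(1) if m else None


# Hypothetical page text in the order a person reads it on screen.
visual_order = "\n".join([
    "Radiology Report",
    "Jane Doe",
    "MRN 123456",
    "DOB 02/03/1960",
    "Exam date 01/05/2021",
])

# The same lines in a hypothetical storage order inside the PDF.
stored_order = "\n".join([
    "Exam date 01/05/2021",
    "Jane Doe",
    "Radiology Report",
    "MRN 123456",
    "DOB 02/03/1960",
])

for label, text in (("visual", visual_order), ("stored", stored_order)):
    print(label, nth_date(text, 2), line_before(text, "MRN"),
          number_after(text, "MRN"))
```

With the stored order, the second date becomes the birthdate and the line before 'MRN' becomes the line that sits above the name on the page, while the number after 'MRN' is unaffected. That matches the pattern described above, and it is why checking the text in Hazel's preview is the first step.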

