
Differences between kMDItemTextContent and Hazel Importer

Posted: Sun Oct 19, 2014 5:05 pm
by drmcdona
I know this has come up before, but I am trying to get a date out of a mortgage statement and the rule is failing.

I ran the hazelImporter utility and it found 3089 characters of OCR text, which seems to be missing the top half of the document. I then ran the same document through mdimport, and the kMDItemTextContent attribute contained 7745 characters and appeared to include the entire document.

Before I head off and start writing scripts, does this sound like something I should expect? Is there anything I could have done in the scanning process to cause this? Since I am only using the OCR'ed file in this test, I'm not sure this is something I caused, but the results seem surprising.
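For anyone wanting to check which part of the text is actually missing, here is a rough Python sketch. It assumes both extractions have already been saved as plain text; the function names `normalize` and `missing_spans` are made up for illustration and are not part of Hazel or Spotlight. It only compares two strings and does not call hazelImporter or mdimport itself.

```python
import difflib


def normalize(text):
    """Collapse whitespace runs so OCR spacing differences are not reported."""
    return " ".join(text.split())


def missing_spans(full_text, partial_text, min_len=20):
    """List (offset, text) chunks of full_text that partial_text lacks.

    full_text    -- the longer extraction (e.g. the kMDItemTextContent dump)
    partial_text -- the shorter extraction (e.g. the hazelImporter output)
    min_len      -- ignore differences shorter than this many characters
    Offsets refer to the whitespace-normalized version of full_text.
    """
    full = normalize(full_text)
    partial = normalize(partial_text)
    # autojunk=False keeps difflib from discarding frequent characters
    # in longer inputs, which would distort the comparison.
    matcher = difflib.SequenceMatcher(None, full, partial, autojunk=False)
    spans = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" and "replace" both mark text in full that has no
        # equal counterpart in partial.
        if tag in ("delete", "replace") and i2 - i1 >= min_len:
            spans.append((i1, full[i1:i2]))
    return spans
```

Reading the two dumps from disk and calling `missing_spans(mdimport_text, hazel_text)` should show whether the gap really is one contiguous block from the top of the page or is scattered through the document.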

Did I miss something?

Thanks

Re: Differences between kMDItemTextContent and Hazel Importer

Posted: Mon Oct 20, 2014 10:22 am
by Mr_Noodle
Hard to say. hazelimporter uses a different mechanism than kMDItemTextContent. Do you have a non-sensitive document that exhibits this problem that you can email in to support?