I've fiddled around with this some and confirmed that PDFKit returns no page text for the document in question, while it does for other documents. I tested with basically the following simple Swift code (note: I'm a Java developer, not familiar with Swift):
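import Foundation
import PDFKit  // provides PDFDocument and PDFPage
let fileName = NSString(string: "~/Downloads/<some file>.pdf").expandingTildeInPath
let fileUrl = URL(fileURLWithPath: fileName)
let pdfDocument = PDFDocument(url: fileUrl)
// page(at:) is zero-based, so index 0 is the first page
let page1 = pdfDocument?.page(at: 0)
// Unwrap the optional so the output is the text itself, not Optional("...")
print(page1?.string ?? "<no text returned>")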
The interesting thing I discovered is that parts of the document in question are missing in Preview, but can be seen in Adobe Reader. This particular problem has been encountered by others:
https://discussions.apple.com/thread/3304322
I'm not sure of a way around this, nor can I explain why NONE of the text is available through PDFKit when some of it can be seen in Preview. I have confirmed that the text that appears "hidden" in Preview cannot be found with a "contains" rule, but text that is not "hidden" can be found.
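For anyone who wants to repeat the check across every page rather than a single one, here is a rough sketch. It only reports what PDFKit's `string` property returns for each page and whether a search term appears in it; it is not a reproduction of how Hazel's "contains" rule works internally, which I don't know. The path `~/Downloads/example.pdf`, the term `invoice`, and the helper name `summarize` are placeholders made up for illustration.

```swift
import Foundation
#if canImport(PDFKit)
import PDFKit
#endif

/// Reports, per page, whether any text was extracted and whether it contains `term`.
/// `pageTexts` holds one entry per page; nil means nothing could be extracted.
func summarize(pageTexts: [String?], term: String) -> [String] {
    return pageTexts.enumerated().map { (index, text) -> String in
        let trimmed = (text ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
        if trimmed.isEmpty {
            return "page \(index + 1): no extractable text"
        }
        // Case-insensitive substring search over the extracted text
        let found = trimmed.range(of: term, options: .caseInsensitive) != nil
        return "page \(index + 1): \(trimmed.count) characters, contains \"\(term)\": \(found)"
    }
}

#if canImport(PDFKit)
// Placeholder path and search term; substitute your own.
let path = NSString(string: "~/Downloads/example.pdf").expandingTildeInPath
if let document = PDFDocument(url: URL(fileURLWithPath: path)) {
    // page(at:) is zero-based; collect the text of every page
    let texts = (0..<document.pageCount).map { document.page(at: $0)?.string }
    summarize(pageTexts: texts, term: "invoice").forEach { print($0) }
} else {
    print("Could not open \(path)")
}
#endif
```

If every page comes back as "no extractable text" for the problem document but not for others, that would match what I saw with the single-page test.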
I'm not sure that helps. I wish I could share the document causing the problem, but it has too much PII. I'll continue to look for a solution on my end, but until then these documents cannot take advantage of Hazel goodness.
