Well here I am replying to my own post. Thank you
luomat for your suggestions.
I had some time over the holidays and I've developed a solution that works for me.
The goal is to automatically rename PDF files downloaded to the Downloads folder with a prepended reverse date, "2020-01-01--" extracted from the metadata "Creation Date" seen in the PDF inspector. (This is NOT the same as the Spotlight metadata).
Background: My Hazel rules all process files that contain a prepended date. The Hazel rules use this prepended date to change the
file creation date via a shell script, as files are imported by Hazel as part of my paperless workflow, into EagleFiler paperless library.
My manual process has been to scan and OCR, and then manually open each file, review the date, and prepend the filename with that date in that specific format. Most of my documents don't lend themselves to letting Hazel figure out the creation date using content matching.
But lately, I've also begun to download a lot of billing and bank statements online in PDF format. I wanted to save time by programmatically inspecting the newly downloaded PDF files, and conjure up a date created from the metadata within each PDF file.
I found open source, external applications that can do this inspection and extraction of the PDF "creation date". But I was reluctant to install binaries out of an abundance of caution over security. I almost did install one, but my Mac asked me if I was sure I wanted to do that, and warned me against it. So I backed off.
As I've been researching this issue I also came to understand that the Mac includes a native ability to inspect PDF metadata as part of the
Automator app. So this week I've spent some considerable quality time learning about Automator, JavaScript for Applications (JXA), AppleScripting and a bit of Shell scripting.
In the end, I now have a reasonable "Proof of Concept" for a Folder Automation that's assigned to my Downloads folder.
Here's what it does:
Watch for new files in Downloads.
Pause for 4 seconds (to allow downloads to complete)
Filter so only PDF files are acted upon
Run a Javascript to parse the metadata "Creation Date", or optionally use today's date if that's empty
Reformat the Creation date from "Jan 1, 2020" to "2020-01-01"
Move the file out of Downloads, to a destination folder
Prepend that date by renaming the file as "2020-01-01--filename.pdf"
All this using Apple's JXA JavaScript for Applications (something completely new to me, and remarkably poorly documented online).
Here's a screenshot from Automator.

And here is the JavaScript code:
- Code: Select all
function run(input, parameters) {
var app = Application.currentApplication();
app.includeStandardAdditions = true;
var finder = Application("Finder");
var systemEvents = Application("System Events");
var filename = input[0];
var tmpFilename = filename.toLowerCase();
//app.displayDialog("filename is " + filename);
/* ================================================================
Input (from Get PDF Metadata) :
2019-12-25-My Visa Statement.pdf
Dec 22, 2019 at 9:39:25 PM
Converts to:
2019-12-22--2019-12-25-My Visa Statement.pdf
=================================================================== */
try {
var ndex = tmpFilename.lastIndexOf(".pdf");
var filenameAlone = filename.substring(0, ndex+4) //The fifth character is a tab
var nat;
cDate = filename.substring((ndex+5), filename.length);
nat = cDate.lastIndexOf(" at ");
cDate = cDate.substring(0,(nat)) + " 00:00:00"; // we don't care about the time but the date object requires it.
//app.displayDialog("cDate + input = " + cDate + " | " + input)
//we strippped off the filename in front and the tab character after .pdf and the time after the at.
//var dT = new Date(cDate).toString(); //used while debugging to display JavaScript date object as a string
var dT
if (nat == -1) { // Some PDFs do'nt have a creation date in their metadata.
dT = new Date();
} else {
dT = new Date(cDate);
}
var aaaa = dT.getFullYear();
var gg = dT.getDate();
var mm = (dT.getMonth() + 1);
if (gg < 10) { gg = "0" + gg;}
if (mm < 10) { mm = "0" + mm;}
var cDate = aaaa + "-" + mm + "-" + gg;
// app.displayDialog("cDate now equals " + cDate);
// our date now looks like this: 2020-01-01
var newFileName = cDate + "--" + filenameAlone;
// debugging ...
// app.displayDialog("cDate is " + cDate + " filenameAlone is " + filenameAlone);
var sFile= "~/Downloads/" + filenameAlone;
// BTW, JXA doesn't seem to like paths with ~/ tildes in them
/* ===================================================================
First we have to move the file out of the watched Downloads folder
Then rename it, or else looping occurs when the Automator
Folder Action gets triggered by the continuing name changes.
====================================================================== */
/* it would be nice if we didn't have to hard code the paths...
var homeDirectory = finder.startupDisk.folders["Users"].folders["user"]
var sourceFile = homeDirectory.folders["Downloads"].files[filenameAlone]
var destinationFolder = homeDirectory.folders["Destination"]
app.displayDialog("homeDirectory = " + homeDirectory);
*/
var sourceFile = "/Users/dbartholomew/Downloads/" + filenameAlone;
var destinationFolder = "Users/dbartholomew/Desktop/Scan Target/";
// This is the secret sauce... Poorly documented JXA code. It was hard to discover
finder.move(Path(sourceFile), {
to: Path(destinationFolder),
replacing: true
})
theFile = systemEvents.aliases.byName(destinationFolder + filenameAlone);
theFile.name = newFileName; // Rename the file
//app.displayDialog("sFile = " + sFile + "\n : newFileName = " + newFileName);
} catch (error) {
//console.log("Error - File Not Found" + error);
app.displayDialog("Oh Bummer : " + error);
}
//app.displayDialog("cDate + input = " + cDate + " | " + input)
return input;
}
I don't mind that the JavaScript renames the file to append a reverse date to a filename that already contains a reversed date because hazel Rules will rename the file on import based on content matching.
I.E.
2019-12-22--2019-12-25-My Visa Statement.pdf will become
2019-12-22--Daves Signature Visa Statement.pdfThe rule implements a shell script that only looks for the prepended date "2019-12-22--" and ignores the rest of the filename. Bottom line is the filename will be cleaned up in the next step in the workflow.
More work could be done to inspect the filename, and if a date is present in the filename AND the PDF metadata are empty of a date, or the PDF metadata "Creation Date" is newer than the date contained in the filename... then use the date in the filename instead of today's date, as I'm doing here. I've found that some banks don't create the PDF statement until I ask to download it, providing a PDF creation date of today. While other banks generate the PDF on the statement date, and thus contain a Creation Date equal to the statement date. And still other banks (ok my lame Credit Union), has statements with no date at all in the metadata...but the filename contains the statement date.
This work represents several days of tinkering. I'm sure I will refine it over time. I'm new to the tools that I used. But it works!
Hope others find it useful too.
Happy New Year 2020!
-Dave
[Update Jan 5, 2020]. With some use of this new technique for renaming PDFs I've discovered that many of my online statement providers don't create the PDF statement until the moment I request the download. So the date in the PDF metadata is not always the original statement date. I also discovered that Hazel has a fantastic ability to find dates within my PDFs (Thanks EdBurke). (I can't believe I never knew Hazel can do this "Custom Date" date extraction... so cool.) So I've updated my workflow to use the custom date matching to extract the date, then reformat that date as 2020-01-05 for prepending many of my statements and invoices filenames. Here's an article I found that describes how to use Custom Date match
http://www.myproductivemac.com/blog/date-matching-in-hazel2992015The JavaScript-Automator technique I've illustrated above is still useful for anyone who needs an automator folder action that can rename PDFs to the metadata TITLE, rather than the Creation Date. Many online documents are given machine generated filenames that when downloaded, will be better off renamed to the PDF metadata TITLE. Just adjust Automator to extract the Title from the metadata and my script to rename the PDF file accordingly.