How Do I combine PDFs based on distinctive number?

Posted:
Wed Aug 21, 2024 7:02 pm
by davidelloyd
I have two PDF documents related to the same claim. The first one, already processed by Hazel, contains the claim number. The second one, arriving later, has additional claim details. I'd like to automatically merge these two PDFs into a single file whenever the second document becomes available. I don't have experience with automation tools or coding.
Hazel App has already placed the claim number in the comments of the first PDF like this: "Claim Number: 01-23-862-5607-1"
Any assistance would be greatly appreciated!
Re: How Do I combine PDFs based on distinctive number?

Posted:
Thu Aug 22, 2024 8:35 am
by Mr_Noodle
Hazel does not have an action to combine two PDF documents. You will need to use a script, Shortcuts or Automator to handle that part. Where is the claim number stored in the two documents?
Re: How Do I combine PDFs based on distinctive number?

Posted:
Thu Aug 22, 2024 4:09 pm
by davidelloyd
Mr_Noodle wrote: Hazel does not have an action to combine two PDF documents. You will need to use a script, Shortcuts or Automator to handle that part. Where is the claim number stored in the two documents?
Hazel App has already placed the claim number in the comments of the first PDF like this: "Claim Number: 01-23-862-5607-1"
The current code I'm utilizing is this, but it doesn't seem to work:
Code:
#!/bin/bash

# Directory where Hazel places the processed PDFs
hazel_output_dir="/Users/dj/Library/CloudStorage/Dropbox/documents/My Scans/Insurance/Ameritas"
# Directory where the second PDFs with additional details arrive
incoming_details_dir="/Users/dj/Library/CloudStorage/Dropbox/Mac (2)/Downloads"
# Directory to store the merged PDFs
merged_output_dir="/Users/dj/Library/CloudStorage/Dropbox/documents/My Scans/Insurance/Ameritas"

# Function to extract claim number from PDF file comments metadata.
extract_claim_number() {
    pdf_file="$1"
    claim_number=$(pdftk "$pdf_file" dump_data | grep "Claim Number:" | awk '{print $3}')
    echo "$claim_number"
}

# Main loop to monitor for new PDFs and merge
while true; do
    # Find all PDFs in the incoming details directory
    for details_pdf in "$incoming_details_dir"/*.pdf; do
        # Extract claim number from the details PDF (you might need to adjust this based on how the claim number is presented in the second PDF)
        details_claim_number=$(extract_claim_number "$details_pdf")
        # Find the corresponding PDF in Hazel output based on claim number
        for hazel_pdf in "$hazel_output_dir"/*.pdf; do
            hazel_claim_number=$(extract_claim_number "$hazel_pdf")
            if [ "$hazel_claim_number" == "$details_claim_number" ]; then
                # Merge the PDFs using GhostScript (install if needed: `brew install ghostscript`)
                gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="$merged_output_dir/${hazel_claim_number}_merged.pdf" "$hazel_pdf" "$details_pdf"
                # We need to create code to delete the processed PDFs after combining them.
                # mv "$hazel_pdf" /path/to/archive
                # mv "$details_pdf" /path/to/archive
                echo "Merged claim: $hazel_claim_number"
            fi
        done
    done
    # Sleep for some time before checking again (adjust as needed)
    sleep 60
done
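One possible reason the extraction in the script above comes back empty: `pdftk ... dump_data` reports the PDF's own internal metadata, whereas a comment added by Hazel is, as far as I can tell, stored as the file's Finder (Spotlight) comment. Below is a minimal sketch under that assumption. The function names `parse_claim_number` and `claim_number_for_file` are made up for illustration, and `mdls` is macOS-only and relies on Spotlight having indexed the file.

```shell
#!/bin/bash

# Pull the claim number out of a comment string such as
# "Claim Number: 01-23-862-5607-1". Prints nothing when there is no match.
parse_claim_number() {
    printf '%s\n' "$1" \
        | sed -n 's/.*Claim Number:[[:space:]]*\([0-9][0-9-]*\).*/\1/p' \
        | head -n 1
}

# Assumption: the claim number lives in the Finder comment of the file.
# mdls prints "(null)" when there is no comment, which simply fails to parse.
claim_number_for_file() {
    local comment
    comment=$(mdls -raw -name kMDItemFinderComment "$1" 2>/dev/null) || return 1
    parse_claim_number "$comment"
}
```

Running `claim_number_for_file "/path/to/first.pdf"` in Terminal is a quick way to check whether the comment is readable at all before wiring anything into a larger script.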
Re: How Do I combine PDFs based on distinctive number?

Posted:
Fri Aug 23, 2024 9:39 am
by Mr_Noodle
You can have Hazel do the claim number extraction and general matching but in the end, you will need a script to combine the PDFs. Note that you can only pass in attributes as arguments for Apple/JavaScript, though.
Re: How Do I combine PDFs based on distinctive number?

Posted:
Fri Aug 23, 2024 11:06 am
by davidelloyd
Mr_Noodle wrote: You can have Hazel do the claim number extraction and general matching but in the end, you will need a script to combine the PDFs. Note that you can only pass in attributes as arguments for Apple/JavaScript, though.
Did you not see the script I added?
Re: How Do I combine PDFs based on distinctive number?

Posted:
Mon Aug 26, 2024 8:39 am
by Mr_Noodle
Yes, and you are doing Hazel's work of looking for new files. Not sure why you are using Hazel here since the script is doing all that work.
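Following that point, one possible shape is a script with no polling loop that Hazel runs once for each new details PDF. This is only a sketch under several assumptions: that Hazel's embedded shell script action hands the matched file over as `$1`; that `claim_of` (a hypothetical placeholder that just looks for the number in the file name) gets swapped for however the claim number is really stored, such as the Finder comment; and that `HAZEL_OUTPUT_DIR` is an invented variable pointing at the folder with the first PDFs. It also refuses to treat two empty claim numbers as a match, which the posted loop would do whenever extraction fails.

```shell
#!/bin/bash

# Hypothetical default; point this at the folder holding the first PDFs.
hazel_output_dir="${HAZEL_OUTPUT_DIR:-$HOME/Ameritas}"

# Placeholder lookup: finds a number shaped like 01-23-862-5607-1 in the file name.
claim_of() {
    basename "$1" | grep -oE '[0-9]{2}-[0-9]{2}-[0-9]{3}-[0-9]{4}-[0-9]' | head -n 1
}

# True only when both claim numbers are non-empty and identical.
claims_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Merge the details PDF ($1) with the first matching PDF in hazel_output_dir.
merge_one() {
    local details_pdf="$1" claim hazel_pdf
    claim=$(claim_of "$details_pdf")
    for hazel_pdf in "$hazel_output_dir"/*.pdf; do
        [ -e "$hazel_pdf" ] || continue                    # glob matched nothing
        [ "$hazel_pdf" = "$details_pdf" ] && continue      # never merge a file with itself
        case "$hazel_pdf" in *_merged.pdf) continue ;; esac  # skip earlier results
        if claims_match "$(claim_of "$hazel_pdf")" "$claim"; then
            # Same Ghostscript flags as the posted script.
            gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
                -sOutputFile="$hazel_output_dir/${claim}_merged.pdf" \
                "$hazel_pdf" "$details_pdf"
            return
        fi
    done
    return 1   # no matching first PDF found
}

# Run only when called directly with a file argument.
if [ "${BASH_SOURCE[0]}" = "$0" ] && [ "$#" -gt 0 ]; then
    merge_one "$1"
fi
```

With this split, Hazel decides when a new file has arrived and the script only does the merge, so `while true` and `sleep 60` are no longer needed.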