Contents contain vs. Contents contain match

Hi All,

My apologies if this question has already been answered. I've spent quite some time googling and also searching this forum but am still unable to get a definitive answer.

Can someone please tell me what the difference is between the "Contents contain" condition vs. the "Contents contain match" condition?

When creating a condition for scanned and OCR'd PDF files, it seems that I am able to use the "Contents contain match" condition to detect account numbers, credit card numbers, etc., whereas the "Contents contain" condition did not work. So I'm a little confused.

Thanks in advance and very much appreciated.

Cheers!!!

If I understand it correctly myself (which I can't guarantee), it works like this:

"Contents contain" is a simple boolean yes/no: does Hazel find that string in the file or not?
"Contents contain match" is similar, in that it returns yes/no based on whether it finds a match, but:

Instead of a simple string, "contain match" wants a Hazel pattern, like a regular expression, to match against.
If the pattern does match the contents of the file, the matched text is captured into a token that can be used later for renaming the file, sorting into subfolders, etc.
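To make that distinction concrete, here is a rough sketch in Python. It is not Hazel's implementation: Python's `re` module stands in for Hazel's own pattern syntax, and the function names and sample text are made up for illustration.

```python
import re

# Hypothetical OCR output; the account number format is invented for illustration.
text = "Statement for account 1234 5678 9012, issued 2014-03-01"

def contents_contain(text, needle):
    """Plain yes/no test: is the literal string present?"""
    return needle in text

def contents_contain_match(text, pattern):
    """Pattern test that also hands back the captured text, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

# Literal test: only tells us whether the exact string is there.
found = contents_contain(text, "1234 5678 9012")

# Pattern test: matches any number of this shape and keeps what it matched,
# which is the part that could later feed a rename or a subfolder name.
account = contents_contain_match(text, r"account (\d{4} \d{4} \d{4})")
```

The point of the second function is the return value: a plain "contain" test can only answer yes or no, while a match can also carry the matched text forward.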

Thanks for the response MacPrince.

It's interesting because with the "Contents contain" condition, Hazel was unable to find the string I was looking for, but with the "Contents contain match" condition, Hazel found that same string.

The account number I was searching for contained spaces between the numbers, so I was wondering whether the way the PDF file was OCR'd made a difference. I use a Fujitsu ScanSnap S1300i, OCR the scan with the native application on a Windows machine, and save it to a Dropbox folder.

Anyone else with some insights?

Thanks in advance.

"contains" uses Spotlight, which indexes the words in your file. It is much faster, but there may be issues if Spotlight is not working or misses that file for whatever reason. "contains match" is more reliable since Hazel will go through the contents directly, but at the expense of speed.

For "contains" the test is to search for the terms in Spotlight. If your file doesn't show up there, Hazel won't see it (as far as using "contains" goes).
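As a toy model of that trade-off (not how Spotlight or Hazel actually work internally), the sketch below compares a lookup in a prebuilt word index with a direct scan of the text. All names and file contents are invented, and the `skip` parameter simply simulates a file the indexer missed.

```python
import re

# Toy stand-ins: file name -> extracted text. Names and contents are invented.
files = {
    "statement.pdf": "Account 1234 5678 9012 balance due",
    "receipt.pdf": "Paid in full, thank you",
}

def build_index(files, skip=()):
    """Map each lowercase word to the set of files containing it.
    Files listed in `skip` simulate documents the indexer missed."""
    index = {}
    for name, text in files.items():
        if name in skip:
            continue
        for word in re.findall(r"\w+", text.lower()):
            index.setdefault(word, set()).add(name)
    return index

def indexed_contains(index, name, terms):
    """Fast path: consults only the index, never the file itself."""
    words = re.findall(r"\w+", terms.lower())
    return all(name in index.get(w, set()) for w in words)

def direct_contains(files, name, terms):
    """Slow path: reads the file's text and searches it directly."""
    return terms.lower() in files[name].lower()

# The indexer missed statement.pdf, so the indexed test cannot see it,
# while the direct scan still finds the account number.
index = build_index(files, skip={"statement.pdf"})
via_index = indexed_contains(index, "statement.pdf", "1234 5678")
via_scan = direct_contains(files, "statement.pdf", "1234 5678")
```

In this model the indexed test is only as good as the index. If the file never made it in, the answer is "no" regardless of what the file contains, which would be consistent with the behaviour described earlier in the thread.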

This is quite confusing and also interesting. What other service, tool, or helper is Hazel using to scan the file with "contains match"? I'm also curious whether quotation marks make a difference in the "contains" field. Can I use multiple terms and, if so, do I separate them with commas, or is everything treated as a single string?

Hazel has an internal tool to extract the text (it's a separate program to work around an Apple bug), but otherwise processing is done within Hazel. All text is treated as literal, meaning punctuation and other characters have no special meaning. They are taken as is.