Special characters, locales, ENV variable and API calls!


Moderator: Mr_Noodle

I use macOS Big Sur with the latest update.

If I execute the following two commands in macOS Terminal, I get the desired results.

Note: the commands look up the first air date of an episode of the TV show "Tatort":
- first command: Episode name: "Kehraus" --> Result: 2022-02-27
- second command: Episode name: "Saras Geständnis" --> Result: 2022-02-13

Also note that the name of the second episode contains a German special character.

Code: Select all
curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch "Kehraus" '.[] | select(.name | test([ $strSearch ])) | [.airdate]'

curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch "Saras Geständnis" '.[] | select(.name | test([ $strSearch ])) | [.airdate]'

The commands use curl and jq, a JSON processor. As mentioned, both commands work fine in the macOS Terminal.

If I put the same command into an "Embedded Shell Script" in Hazel and trigger the script with two different files ("Kehraus.mp4" and "Saras Geständnis.mp4"), I only get a result for the first file, the one whose name contains no special character.

Code: Select all
sourceIn="$1"                        # path and filename/extension of source file
fileIn="$(basename -- $sourceIn)"    # filename/extension of source file
fileNameIn="${fileIn%.*}"            # filename of source file
tvMazeResult=$(curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch $fileNameIn '.[] | select(.name | test([ $strSearch ])) | [.airdate]')
printf "$tvMazeResult ___ \n" > ~/Desktop/_result.txt

The text file _result.txt will contain "2022-02-27" for the file "Kehraus.mp4", but it will be empty for the file "Saras Geständnis.mp4".
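One thing that may be worth ruling out in the script above, independent of the encoding question: `$sourceIn` and `$fileNameIn` are expanded without quotes, and in sh or bash an unquoted expansion is split at spaces. The following is only a minimal sketch, assuming the script runs under sh or bash (zsh does not split by default); `count_args` and the sample values are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

fileNameIn="Saras Geständnis"

unquoted=$(count_args $fileNameIn)     # expansion is split at the space
quoted=$(count_args "$fileNameIn")     # expansion stays a single argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"

# printf treats its first argument as a format string, so data is safer
# passed as a separate argument ("%" in the data is then left alone).
result="100% done"
printf '%s ___ \n' "$result"
```

If this applies, quoting the expansions ("$sourceIn", "$fileNameIn") and handing the result to printf as an argument rather than as the format string would avoid it.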

In the Hazel log I can't find anything specific about the API calls, but I can see this entry with a strangely encoded file name.

Code: Select all
2022-09-14 21:06:36.148 86Z3GCJ4MF.com.noodlesoft.HazelHelper[617] DEBUG: Thread 0x7fc532b04370: Received events (
        {
        date = "2022-09-14 19:06:36 +0000";
        path = "/Users/...../Desktop/MediaBox/Renaming/Saras Gesta\U0308ndnis.mp4";
        shouldDoFullScan = 0;
    }
) for stream at path: /Users/...../Desktop/MediaBox/Renaming

But I also see this entry with a correctly encoded file name.

Code: Select all
2022-09-14 21:06:36.149 86Z3GCJ4MF.com.noodlesoft.HazelHelper[617] DEBUG: Error resolving symlinks for path /Users/...../Desktop/MediaBox/Renaming/Saras Geständnis.mp4:, No such file or directory

So the log sometimes shows the file name correctly and sometimes with encoded special characters.
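The "\U0308" in the first log entry is the combining diaeresis, so that path spells "ä" as "a" followed by U+0308 rather than as the single character U+00E4. Both look identical on screen but are different byte sequences. A minimal sketch to make the difference visible; `hex_bytes` is a hypothetical helper and the two strings are built by hand for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of a string as one hex string.
hex_bytes() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

composed=$(printf 'Gest\303\244ndnis')      # "ä" as one code point, U+00E4
decomposed=$(printf 'Gesta\314\210ndnis')   # "a" followed by U+0308

echo "composed:   $(hex_bytes "$composed")"
echo "decomposed: $(hex_bytes "$decomposed")"

# A plain string comparison sees two different strings.
if [ "$composed" = "$decomposed" ]; then
  echo "same bytes"
else
  echo "different bytes"
fi
```

Running the same helper on the file name inside the Hazel script and on the term typed in Terminal would show whether the two contexts really pass different bytes.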

Executing the command "locale" in the macOS Terminal shows this:

Code: Select all
LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

Executing the command "locale" in the Hazel shell environment:

Code: Select all
foo=$(locale)
printf "$foo" > ~/Desktop/_locale.txt

_locale.txt then shows the following:

Code: Select all
LANG=""
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"

I tried to define the same environment variables as in the macOS Terminal before the script commands:

Code: Select all
export LANG="C"
export LC_COLLATE="C"
export LC_CTYPE="UTF-8"
export LC_MESSAGES="C"
export LC_MONETARY="C"
export LC_NUMERIC="C"
export LC_TIME="C"
export LC_ALL=""

sourceIn="$1"                        # path and filename/extension of source file
fileIn="$(basename -- $sourceIn)"    # filename/extension of source file
fileNameIn="${fileIn%.*}"            # filename of source file
tvMazeResult=$(curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch $fileNameIn '.[] | select(.name | test([ $strSearch ])) | [.airdate]')
printf "$tvMazeResult ___ \n" > ~/Desktop/_result.txt

I even tried setting the environment variables to German:

Code: Select all
export LANG="de_DE.UTF-8"
export LC_COLLATE="de_DE.UTF-8"
export LC_CTYPE="de_DE.UTF-8"
export LC_MESSAGES="de_DE.UTF-8"
export LC_MONETARY="de_DE.UTF-8"
export LC_NUMERIC="de_DE.UTF-8"
export LC_TIME="de_DE.UTF-8"

sourceIn="$1"                        # path and filename/extension of source file
fileIn="$(basename -- $sourceIn)"    # filename/extension of source file
fileNameIn="${fileIn%.*}"            # filename of source file
tvMazeResult=$(curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch $fileNameIn '.[] | select(.name | test([ $strSearch ])) | [.airdate]')
printf "$tvMazeResult ___ \n" > ~/Desktop/_result.txt

But it didn't solve the problem. I still don't get a response from the API when the file name contains a special character.

What am I supposed to do so that Hazel sends the correctly encoded values to the API so I get the same results as in macOS Terminal?

Your feedback would be appreciated.

Thanks

AJ
Last edited by test2000 on Fri Sep 16, 2022 2:20 am, edited 1 time in total.
test2000
 
Posts: 7
Joined: Mon Jan 31, 2022 7:37 am

You don't set those variables in Terminal. Hazel executes scripts in a totally separate context. Set those variables at the beginning of your script.
Mr_Noodle
Site Admin
 
Posts: 11195
Joined: Sun Sep 03, 2006 1:30 am
Location: New York City

Mr_Noodle wrote: You don't set those variables in Terminal. Hazel executes scripts in a totally separate context. Set those variables at the beginning of your script.


The last two code boxes in my original post are two examples that I tested as an "Embedded Shell Script" in Hazel. The environment variables are set at the beginning of the script. It didn't solve the problem.
test2000
 

Have you tried setting LC_ALL to "de_DE.UTF-8"?
Mr_Noodle

Yes, I also tried it with export LC_ALL="de_DE.UTF-8". It leads to the same error:

Code: Select all
export LANG="de_DE.UTF-8"
export LC_COLLATE="de_DE.UTF-8"
export LC_CTYPE="de_DE.UTF-8"
export LC_MESSAGES="de_DE.UTF-8"
export LC_MONETARY="de_DE.UTF-8"
export LC_NUMERIC="de_DE.UTF-8"
export LC_TIME="de_DE.UTF-8"
export LC_ALL="de_DE.UTF-8"

sourceIn="$1"                        # path and filename/extension of source file
fileIn="$(basename -- $sourceIn)"    # filename/extension of source file
fileNameIn="${fileIn%.*}"            # filename of source file
tvMazeResult=$(curl -L -G https://api.tvmaze.com/shows/6446/episodes | jq --raw-output --arg strSearch $fileNameIn '.[] | select(.name | test([ $strSearch ])) | [.airdate]')
printf "$tvMazeResult ___ \n" > ~/Desktop/_result.txt


In a previous post I wrongly stated that the API doesn't return a result. The API works as expected. The problem is actually that jq cannot process the JSON returned by the API if the file name contains a German special character.

The Hazel log files show that file names are sometimes logged as "Saras Geständnis" and sometimes as "Saras Gesta\U0308ndnis". I assume the Hazel shell environment hands "Saras Gesta\U0308ndnis" over to jq, and of course jq won't find anything like "Saras Gesta\U0308ndnis" in the JSON result.
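If that assumption holds, one possible workaround is to turn the decomposed umlauts into their precomposed forms before handing the search term to jq. The following is a minimal sketch, not a general Unicode normalizer: `compose_umlauts` is a hypothetical helper that only covers the six German umlauts, and it assumes the API returns the precomposed form. On macOS, `iconv -f UTF-8-MAC -t UTF-8` is often used for the general case, but that is untested in Hazel's environment.

```shell
#!/bin/sh
# UTF-8 bytes of the combining diaeresis U+0308 (octal escapes are POSIX).
diaeresis=$(printf '\314\210')

# Hypothetical helper: replace "base letter + U+0308" with the precomposed
# umlaut. Deliberately limited to a/o/u/A/O/U; everything else passes through.
compose_umlauts() {
  printf '%s\n' "$1" | LC_ALL=C sed \
    -e "s/a$diaeresis/$(printf '\303\244')/g" \
    -e "s/o$diaeresis/$(printf '\303\266')/g" \
    -e "s/u$diaeresis/$(printf '\303\274')/g" \
    -e "s/A$diaeresis/$(printf '\303\204')/g" \
    -e "s/O$diaeresis/$(printf '\303\226')/g" \
    -e "s/U$diaeresis/$(printf '\303\234')/g"
}

# Sample input built by hand: the decomposed form seen in the Hazel log.
fromFileSystem=$(printf 'Saras Gesta\314\210ndnis')
searchTerm=$(compose_umlauts "$fromFileSystem")
printf '%s\n' "$searchTerm"
```

In the script, the converted value (here `searchTerm`) would then be passed, quoted, as the `--arg strSearch` value instead of the raw file name.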

Once again, the above command works great in the macOS Terminal. The term "Saras Geständnis" returns the desired result. Thus, jq can handle German special characters, and it is not jq that is causing the error.

The above command also works in the Hazel shell environment, but only if the search term does not contain any special characters. It seems clear that Hazel's handling of special characters is causing the problem.

Can you please explain how to make this work, or correct the Hazel software?

Thank you.
Last edited by test2000 on Tue Oct 04, 2022 2:53 pm, edited 1 time in total.
test2000
 

I suggest checking other environment variables as there may be one you are missing that is critical here.
Mr_Noodle

Would you mind sharing which other variables you mean?

Type "locale" in the Terminal and you will get this list:

Code: Select all
LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=


These are the variables that I have set manually in the script.
test2000
 

I mean all the environment variables that are set when you run it in Terminal. There is likely something there that might affect the script.
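One way to make that comparison concrete is to dump the full environment in each context and diff the two files. A minimal sketch, assuming nothing about Hazel itself: `dump_env` and `compare_env` are hypothetical helpers, and the second context is only simulated here by exporting one extra, made-up variable.

```shell
#!/bin/sh
# Hypothetical helper: save the current environment, sorted, to a file.
dump_env() {
  env | LC_ALL=C sort > "$1"
}

# Hypothetical helper: print lines found in only one of two dumps.
# "<" marks lines only in the first file, ">" lines only in the second.
compare_env() {
  diff "$1" "$2" | grep '^[<>]'
}

workdir=$(mktemp -d)
dump_env "$workdir/context_a.txt"

# Simulate a second context that has one extra (made-up) variable.
( export HYPOTHETICAL_VAR=1; dump_env "$workdir/context_b.txt" )

compare_env "$workdir/context_a.txt" "$workdir/context_b.txt"
```

In practice, `dump_env` would be run once in Terminal and once at the top of the embedded script, each writing to its own file, and the two files compared afterwards.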
Mr_Noodle

