On env Shebangs (scriptingosx.com)
47 points by zdw on April 15, 2022 | 32 comments


"Neither env shebangs nor absolute path shebangs are generally ‘better.’ Either one may get you in trouble when used in the wrong context."

I generally agree on the wisdom of "it depends and consider context", but..

An absolute path is going to assume others have the same setup.

The downside claimed for using #!/usr/bin/env seems to be:

- that the PATH needs to be set up correctly for interactive/non-interactive shells. (Which will be true of just executing that command in a script, anyway).

- and that the program might be a different version than expected (which an absolute path doesn't solve either).

Using #!/usr/bin/env is a much better default to go with, and will be the better option almost all the time.


One use case where #!/usr/bin/env can go wrong is with cron jobs. They don't typically pick up the same path as an interactive login or shell. So it's easy, for example, to get a python 2 interpreter when you were expecting python 3 with "#!/usr/bin/env python".


You should always explicitly use `python3` rather than using `python` and hoping it is version 3.


Well, even then, you may not get the python3 you expect with "#!/usr/bin/env python3" in a cron job. I imagine a fair amount of this is happening now, with Xcode installing its own python3 on a bunch of Macs.
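
A rough sketch of the cron problem above. The helper name `resolve_in_path` and the two throwaway directories (standing in for a system install and an Xcode-style install) are made up for illustration and do not come from the thread. It only shows that the same name resolves to different files once PATH changes, which is the lookup `#!/usr/bin/env python3` performs at launch.

```shell
# Hypothetical helper: report which file a PATH lookup of NAME would pick.
resolve_in_path() {
    # $1 = PATH value to simulate, $2 = command name
    ( PATH=$1; command -v "$2" )
}

# Two throwaway directories, each holding a fake "python3" (illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/system/bin" "$demo/xcode/bin"
for d in system xcode; do
    printf '#!/bin/sh\necho "fake python3 from %s"\n' "$d" > "$demo/$d/bin/python3"
    chmod +x "$demo/$d/bin/python3"
done

# An interactive shell often has extra directories in front; cron usually does not.
interactive_path="$demo/xcode/bin:$demo/system/bin"
cron_path="$demo/system/bin"

echo "interactive: $(resolve_in_path "$interactive_path" python3)"
echo "cron:        $(resolve_in_path "$cron_path" python3)"
```

Where the cron implementation accepts environment lines, setting PATH explicitly at the top of the crontab is one way to make both lookups agree.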


> An absolute path is going to assume others have the same setup.

Certain absolute path locations are standardized by POSIX. Including the absolute path of /usr/bin/env.

> and that the program might be a different version than expected (which an absolute path doesn't solve either).

...and those programs in those standard locations are required by those same standards to behave ABI-compatibly with their behavior at time of standardization.

Which is why e.g. /bin/bash bothers to strip out features when called as /bin/sh — the interpreter at /bin/sh must be no less and no more than the particular feature-set of the Bourne shell version that POSIX was standardized upon.


A cool thing rpm(build) does is that it automatically detects shebangs of packaged files and makes the built package depend on whatever package provides the file the shebang points to. This obviously cannot work for env shebangs, so you always have to do that find -exec sed dance to replace them with proper ones for software that decides to have env shebangs.
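
A minimal sketch of that shebang rewrite, under a few assumptions: the helper `fix_env_shebang` is hypothetical, `/usr/bin/python3` is only an assumed target path, and the throwaway `buildroot` directory stands in for a real build tree. It goes through a temporary file rather than `sed -i`, because the `-i` syntax differs between GNU and BSD sed.

```shell
# Hypothetical helper: turn an env shebang on line 1 into an absolute one.
fix_env_shebang() {
    # $1 = file to edit in place
    tmp=$(mktemp) || return 1
    # Only line 1 is touched; /usr/bin/python3 is an assumed target path.
    sed '1s|^#!/usr/bin/env python3$|#!/usr/bin/python3|' "$1" > "$tmp" &&
        cat "$tmp" > "$1"   # overwrite contents, keeping the file's mode bits
    rm -f "$tmp"
}

# Throwaway tree with one env-shebang script and one script that must not change.
buildroot=$(mktemp -d)
printf '#!/usr/bin/env python3\nprint("hi")\n' > "$buildroot/tool"
printf '#!/bin/sh\necho "/usr/bin/env python3 in the body is left alone"\n' > "$buildroot/other"

find "$buildroot" -type f | while IFS= read -r f; do
    fix_env_shebang "$f"
done

head -n 1 "$buildroot/tool" "$buildroot/other"
```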

But on Linux we also don't have to fight with a 15 year old bash or "compromising" a system when installing something into /bin. OS X genuinely doesn't sound like a good UNIX to administer.


And what of other dependencies? Imagemagick, say? There's another way to achieve portability and exactness, although it requires some setup.

  #!/usr/bin/env nix-shell
  #!nix-shell -i bash -p imagemagick
Voila. A (mostly) self-contained shell script, dependencies auto-installed when needed.


> another way to achieve portability

    $ nix-shell
    nix-shell: command not found


I came here to say this, too :)

But, of course, it isn't a silver bullet...

1. You still have to have a sane PATH. A fair amount of the Nix install-related issues that get opened are PATH problems, and you can also run into problems with PATH in cron/launchd (as @tyingq noted).

2. You still have to know what the script depends on. This can get tricky beyond small scripts you wrote yourself. (I write a tool for discovering and ~linking/resolving external dependencies in Shell scripts, https://github.com/abathur/resholve. As I've been working on converting some of nixpkgs' existing Shell packages to use it, I almost always find dependencies the initial packager missed.)


Re. 2-

If you invoke nix-shell with --pure, then it'll erase most user state — PATH included — prior to invoking the script. I don't find much use for this, but it would be useful to ensure nothing that isn't an explicit dependency gets used.


Yes, `--pure` helps (and `env -i` can also help) but that's not the hard part of finding them all. It only really helps when there's no conditional logic or no options that can lead to the execution of different programs.
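
A small sketch of why a clean environment alone does not surface every dependency. The script, its `--thumbnail` option, and the helper `run_clean` are invented for illustration. It uses `env -i` rather than nix-shell, so it only demonstrates the conditional-logic point, not Nix itself.

```shell
# Hypothetical script: ImageMagick is only needed on one branch.
script=$(mktemp)
cat > "$script" <<'EOF'
if [ "$1" = "--thumbnail" ]; then
    convert in.png -resize 64x64 out.png   # ImageMagick, only needed on this branch
else
    echo "nothing to do"
fi
EOF

# Run the script with an empty environment and a PATH that contains nothing.
run_clean() { env -i PATH=/nonexistent /bin/sh "$script" "$@"; }

# The default branch passes, so the missing dependency goes unnoticed...
run_clean && echo "default branch: ok"
# ...and only shows up once the option that needs it is exercised.
run_clean --thumbnail || echo "--thumbnail branch: failed with status $?"
```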


Interesting.

I've resisted the pull of the Nix side so far...

There's a similar package that auto-pulls dependencies for Swift scripts, but it requires you to explicitly state the external dependencies as comment annotations. The example given is:

    #!/usr/bin/swift sh
    
    import Foundation
    import PromiseKit  // @mxcl ~> 6.5
    
    firstly {
        after(.seconds(2))
    }.then {
        after(.milliseconds(500))
    }.done {
        print("notice: two and a half seconds elapsed")
        exit(0)
    }
    
    RunLoop.main.run()


https://github.com/mxcl/swift-sh


I'd never use /usr/bin/env to find a shell. Just #!/bin/sh and write portable code, end of story.

The env trick is useful when you have some programming language that is not commonly packaged. Users might have it in /usr/local/bin or some completely personal place that's in their PATH: /home/joe/bin or what have you.


The git project uses shell extensively and the resulting tools are portable to just about every unix under the sun and the Git for Windows bash environment.

Their shell script guidelines are pretty helpful for someone wanting to familiarize themselves with portable posix shell scripting. There's some style nits in there too, but the other stuff can be helpful if you're looking for a well-traveled path.

https://github.com/git/git/blob/v2.35.0/Documentation/Coding...


The overwhelming majority of scripts with such a shebang only work on bash with GNU tools. Trying to run them on dash or with BSD tools is often impossible. When it is compatible, it often has critical bugs due to the limitations of error handling (pipefail and $PIPESTATUS are both bash extensions).
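
To make the error-handling point concrete, here is a minimal sketch. It assumes `sh` is a shell without pipefail enabled by default, and it only runs the bash half when bash is installed.

```shell
# A pipeline's status is that of its last command, so the failure
# of `false` is lost when the final stage succeeds.
sh -c 'false | true' && echo "sh: pipeline reported success, the failure was lost"

# With bash's pipefail option, a failing stage makes the whole pipeline fail.
if command -v bash >/dev/null 2>&1; then
    bash -c 'set -o pipefail; false | true' ||
        echo "bash + pipefail: pipeline failed with status $?"
fi
```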

Please do write Python/Perl instead.


I can easily write a shell script by following the 1997 Single Unix Specification Issue 2:

https://pubs.opengroup.org/onlinepubs/007908799/

put it into a #!/bin/sh file and have it work everywhere.

Good luck with #!/usr/bin/env python followed by code written to 1997 specs.


When people say "portable" they don't usually have 1997 computers in mind. Following the old spec is good to have broad support on today's machines but is not in itself a target.

A script written for Python 3.5 (which was released 7 years ago) is very portable and will even work on non-Unix platforms like Windows. If it's not there, it is easier to compile than the entire suite of Unix tools. In addition, it has proper error handling...


The nice thing about Python is that, for the most part (see below for exceptions), old 3.x code is forward-compatible to new code. So a basic Python 3 script written in 2008 should still run fine in 2022.

Exceptions:

Asyncio has been in active development and was considered "provisional" for a long time, but is now stabilizing. Asyncio code that predates async/await syntax will not run on new versions of Python (this can maybe be converted with automated tools).

Type annotations have also been in active development and were considered "provisional" for a long time, and are also now stabilizing. In a future version (maybe 3.11?) a lot of `from typing import ___` imports will break and need to be replaced (this can mostly be converted with automated tools).

Several modules in the standard library are considered "dead batteries" and will be removed in an upcoming version. See <https://peps.python.org/pep-0594/>, and for discussion <https://discuss.python.org/t/pep-594-take-2-removing-dead-ba...>. This is probably the worst aspect for forward-compatibility, but it seems necessary to ensure the health of the CPython project going forward (unless of course enterprises decide to start contributing back some fraction of the billions upon billions that they make on the backs of the Python Software Foundation every year). Expect community members to spin off these libraries.


It's also worth mentioning that there are "higher-level" languages than C, which produce portable-ish native binaries (modulo some dynamic linking dependencies, which isn't that different from depending on a Python or Perl runtime).


In some cases the env shebang is basically mandatory. If you use virtualenv in a Jenkins job's workspace with a long job name, you can end up with shebangs that are longer than the kernel can execute (about 127 characters on most systems[1]). The solution is to rewrite the virtualenv-created scripts to use the env shebang, and set the PATH to the virtualenv bin directory before executing the scripts.

[1] https://stackoverflow.com/questions/10813538/shebang-line-li...
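
A quick sketch for spotting shebangs that risk hitting that limit. The 127 figure is simply taken from the comment above as an assumption (the real value depends on the kernel), and the helper name and the Jenkins-style path are invented for the demo.

```shell
# Assumed limit, taken from the comment above; the real value depends on the kernel.
SHEBANG_MAX=127

# Hypothetical helper: succeeds when line 1 of a file is longer than SHEBANG_MAX.
shebang_too_long() {
    first=$(head -n 1 "$1")
    [ "${#first}" -gt "$SHEBANG_MAX" ]
}

# Build a deliberately long workspace path, as a long job name might produce.
long_dir=workspace
for i in 1 2 3 4 5 6; do
    long_dir="$long_dir/very-long-jenkins-job-name"
done

venv_script=$(mktemp)
printf '#!/var/lib/jenkins/%s/venv/bin/python\nprint("hi")\n' "$long_dir" > "$venv_script"

env_script=$(mktemp)
printf '#!/usr/bin/env python\nprint("hi")\n' > "$env_script"

for f in "$venv_script" "$env_script"; do
    if shebang_too_long "$f"; then
        echo "too long: $(head -n 1 "$f" | cut -c1-40)..."
    else
        echo "ok:       $(head -n 1 "$f")"
    fi
done
```

The env form stays short regardless of where the virtualenv lives, because the long directory moves out of the shebang and into PATH.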


Sadly you can only portably have one argument in the shebang line to the interpreter. That means that if you wanted to pass an argument to the actual interpreter, you'd not be able to if you use the env trick.


Is this not what you are looking for? From the man page of env(1)

> The -S option allows specifing [sic] multiple parameters in a script.

> [...]

> Without the '-S' parameter the script will likely fail with:

> . . . /usr/bin/env: 'perl -w -T': No such file or directory

I actually ran into this problem last week and just used the absolute path with an argument (/usr/bin/python3 -u) instead of passing an argument to env, and I added a TODO comment to figure out how to do this better. Now you made me find the right way :)


The usual next difficulty is that the maximum shebang length is fairly low (truncated at 128 chars, IIRC)


It was funny to notice my mental knee-jerk reaction of "ought to be enough for anyone" (referring to 640K RAM). I've never used args with env until last week in the first place and even then it was a (mental) debate of solving it in code (wrapping print() in a function that also flushes stdout) or adding the argument instead, so I guess that's why that was my initial thought. But of course that doesn't mean there can't be legitimate uses of long argument lists. Java code comes to mind (although I do question the sanity of many things Java).


Ahh, thanks!


This is the fault of interpreters. Interpreters receive the entire file as input, with the shebang line and all. That can easily support the encoding of additional parameters.

In TXR Lisp, I invented a novel way of doing this, which I call the "hash bang null hack".

If you put a NUL character into the hash bang line, then TXR will process arguments after that character. The NUL character is a string terminator, and so the operating system kernel doesn't see it; it thinks that's the end of the hash bang line.

So for instance (^@ represents NUL):

#!/usr/bin/env txr^@arg arg arg

This hack is not necessary, of course; a given language interpreter can do this in any number of ways. E.g. arguments in a comment in the following line. The convention could be, say, that if the next line also starts with #!, it holds parameters:

  #!/usr/bin/env my-language
  #!--args to my language
I didn't do it this way because the language doesn't use # comments (so hacking argument recognition into comment processing is out) and I didn't want to complicate the hash bang logic to deal with two or more lines: everything is handled in the one line and that's it.

Another solution is to rely on env having the BSD -S option, which specifies a string that becomes multiple arguments.

  #!/usr/bin/env -S program --option arg
The hash bang mechanism will pass "program --option arg" as a single argument to env, whose -S option will split that into multiple arguments. You might run into a limit on the hash bang header. E.g. if the given operating system reads only, say, 32 bytes of the file (really ancient Unix system), in which it looks for the hash bang line, then you have only that much space to cram into that -S argument.

The -S option implemented in BSD unixes has a lot of bells and whistles: you can use escape characters in it to encode control characters and $ to expand environment variables. E.g. silly example:

   #!/usr/bin/env -S $USER/bin/interp
would use every user's own "interp" in their own bin directory to run the program.

-S is not portable; it's not specified in POSIX. Newer versions of GNU Coreutils have it (but not with all the BSD features, I think). If you maintain a programming language, it behooves you to solve this problem for the users, so the only nonstandard thing they rely on is your language, without any collateral nonstandard stuff.
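
The single-argument behaviour and the -S split described above can be tried without writing a script, by handing env the same unsplit string the kernel would pass. This is only a sketch: it uses `echo one two` as a stand-in for a real interpreter line, and it probes for -S first since, as noted, the option is not available everywhere.

```shell
# Without -S, env treats the whole string as one program name and cannot find it.
env 'echo one two' 2>/dev/null || echo "without -S: env exited with status $?"

# With -S (where supported), the same string is split into a command and arguments.
if env -S 'true' 2>/dev/null; then
    env -S 'echo one two'
else
    echo "this env has no -S option"
fi
```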


Assuming bash, you can specify arguments using "set". For example:

  #!/usr/bin/env bash
  set -eu
  
  # Your script here


/usr/bin/env is also the only shebang you can really use on nixOS AFAIK


#!/bin/sh is also available


> The /usr/bin/env shebang provides a means for system portability.

No it doesn’t. It’s just there to specify which interpreter to use instead of the current shell, so you can run it like any other executable. Has nothing to do with portability.


You're talking about shebangs in general. This is about a specific shebang idiom, i.e.

  #!/usr/bin/env python3
which allows the file to be run with whichever python interpreter the user has in their PATH, instead of requiring the interpreter to be in a particular location.


Oh yeah you're right.. Didn't actually notice env xxx



