Writing Good Shell Scripts
Table of Contents
1. Writing good shell scripts
1.1. Why shell scripts?
I’m going to be honest: I can’t fully explain why writing shell scripts is useful, or why using good practices in shell scripts is even better, but I suspect that if you are reading this you are already aware of the benefits.
1.2. Bashisms and the POSIX shell
There are some bashisms to be aware of. /bin/sh is not bash. Bash is a
superset. I know we have talked about this but I have to mention it.
- There is no function keyword in /bin/sh
- There are no arrays…
- Just because two shells are standards-compliant doesn’t mean they will behave the same
- Bash in POSIX mode only guarantees POSIX code will work. It may not error when seeing Bash only syntax.
- The only way to know if your script is portable is to test it on another system
See more here: https://mywiki.wooledge.org/Bashism
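As a small sketch of staying portable, here is the POSIX function syntax (the bash-only `function name { ... }` form would fail in a strict /bin/sh like dash), with the positional parameters standing in for the arrays POSIX doesn’t have:

```shell
#!/bin/sh
# POSIX-compatible function definition: name() { ... }
# (bash's `function name { ... }` is a bashism)
greet() {
    printf 'hello %s\n' "$1"
}

# There are no arrays in POSIX sh; the positional parameters
# ("$@") are the closest portable substitute for a simple list.
set -- apple banana cherry
for fruit in "$@"; do
    greet "$fruit"
done
```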
1.3. Avoiding footguns
The shell is dated, and like much software from the 70s and 80s it assumes you, the author, are doing your due diligence. This manifests in some strange ways, mostly as missing errors you wouldn’t expect. We can enable flags in our shell to make it stricter.
1.3.1. set -e or error on failure.
set -e is the first flag, which simply terminates the shell script if any
command returns non-zero (remember zero means success in the shell). If you
expect a command might fail and want to handle it manually you still can:
do-something || {
    do-some-prep
    do-something
}
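The same pattern, expanded into a runnable sketch. The fetch_data function and the cache path here are made up for illustration; the point is that a failure handled with || does not trip set -e:

```shell
#!/bin/sh
set -e

cache_file="${TMPDIR:-/tmp}/demo-cache.$$"

# Hypothetical command that fails until its cache file exists.
fetch_data() {
    [ -f "$cache_file" ]
}

# The first call fails, but because it sits on the left of ||,
# set -e does not kill the script; we do the prep and retry.
fetch_data || {
    touch "$cache_file"
    fetch_data
}
echo "got data"
rm -f "$cache_file"
```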
1.3.2. set -u or undefined variables
Traditionally, if you try to access a variable which does not exist, the shell
will provide an empty string. This might be what you want to happen, but
most likely you are expecting a variable to be set. set -u instructs the
shell to treat the access of undefined variables as an error.
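A quick sketch of what set -u buys you. The variable names here are invented; the typo check runs in a subshell so we can observe the failure without killing the script:

```shell
#!/bin/sh
set -u

greeting="hello"

# A typo such as $greting would normally expand to "" silently;
# under set -u the shell aborts with an "unbound variable" error.
# Run the access in a subshell so only the subshell dies:
if ( : "$greting" ) 2>/dev/null; then
    typo_caught="no"
else
    typo_caught="yes"
fi
echo "typo caught: $typo_caught"

# When a variable is legitimately optional, the :- default
# expansion opts it out of set -u:
optional="${SOME_UNSET_VAR:-default}"
echo "optional=$optional"
```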
1.3.3. set -o pipefail or not hiding errors
Say you have a string of commands foo | bar | baz, this will only “error”
if baz returns a non-zero value. This means that foo or bar could fail
and this would be silent, even with set -e. set -o pipefail ensures that
if any command in a pipeline fails, the whole pipeline is marked as an error
which can be bubbled up and handled.
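A minimal demonstration of the difference. Note that pipefail is historically a bashism (it only entered the POSIX spec recently), so this sketch assumes bash:

```shell
#!/bin/bash
# By default a pipeline's exit status is that of its last command:
false | true
pre_status=$?      # 0 -- the failure of `false` is invisible

set -o pipefail
false | true
post_status=$?     # 1 -- the failure now propagates
echo "before: $pre_status, after: $post_status"
```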
1.3.4. Put it all together
We can put a single line at the top of our script, set -euo pipefail, to
ensure all of these get set (since pipefail is a bashism, this assumes bash rather than plain /bin/sh).
1.4. Using trap
Intercepting signals is helpful: perhaps you want to log some telemetry, or at
the very least give your user some consistent error messages. This can be
done very easily with trap.
# The following function will get run if we error
handler() {
    echo "We ran into an error, call 207-876-5309 to report the issue"
}
trap handler ERR
If you want to use some bashisms, you can use $BASH_LINENO and $BASH_COMMAND.
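A hedged sketch of an ERR trap using those bash-only variables (ERR traps themselves are a bashism, hence the bash shebang):

```shell
#!/bin/bash
# ERR traps and BASH_COMMAND / BASH_LINENO are bash features, not POSIX.
handler() {
    # Inside an ERR trap, BASH_COMMAND holds the command that failed
    # and BASH_LINENO[0] the line it was called from.
    last_error="'${BASH_COMMAND}' failed near line ${BASH_LINENO[0]}"
    echo "$last_error" >&2
}
trap handler ERR

false   # fires the handler; without set -e the script then continues
true
```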
1.5. Treat it like a real programming language (with love ❤)
1.5.1. Use functions
It’s very easy to write a shell script simply as a list of commands we want to run. That may be accurate, but it can also obfuscate the structure of what we want to do. Suppose you are writing a script to scrape images from a website, and you have three broad steps: collection, processing, and naming. Each of these steps may consist of a large (and varying) number of commands, along with some setup for what we’re scraping, where to store things, etc. It may feel easy to just dump all of the commands in a file in a row; maybe you add some comments saying which part is doing what, maybe not.
But I would suggest that it takes < 5 more minutes to put each section into some well-named functions:
collect_images() {
    ...
    curl "$img" > "${TMPDIR}/${img}"
    ...
}

process_images() {
    ...
    convert "${TMPDIR}/${img}" -resize 1000x1000 "${PROCESS_DIR}/${img}"
    ...
}

organize_images() {
    ...
    find "${PROCESS_DIR}/" -exec ....
    ...
}
collect_images
process_images
organize_images
This allows the reader (probably you in a few months) to know at a glance what the chunk of code they are looking at is intended to do. If you know the program is failing to organize images for some reason, you can jump straight to that part and avoid sifting through the rest of the code to find the organization section.
1.5.2. Parse arguments well and have a help function
There is nothing more frustrating than trying to use a script and having it
provide a cryptic shell error because the script is trying to use a parameter
I didn’t pass, or passed wrong. To parse arguments well, I prefer using
a while loop, case, and shift. It can be verbose, but it’s the most robust
way to handle it in the shell.
usage() {
    echo "foo.sh"
    echo "Options"
    echo "-o,--output [file]  set some output"
    echo "-d,--dryrun         show what would happen"
    echo "-h,--help           show this help message"
}
while [ "$#" -gt 0 ]; do
    case "$1" in
        -d|--dryrun)
            dryrun=1
            shift
            ;;
        -o|--output)
            if [ -z "${2:-}" ]; then
                echo "Error: $1 expects a file!"
                exit 1
            fi
            output="$2"
            shift 2 # Note we shift by 2 here!
            ;;
        -h|--help)
            usage
            exit 0
            ;;
        *)
            echo "Unexpected argument $1"
            usage
            exit 1
            ;;
    esac
done
The case syntax is awkward, but I’ve found this to be the most robust way to handle parsing arguments for complicated scripts.
1.5.3. Know and understand the various substitutions available.
There are many parameter expansions in the POSIX shell; a good one to know is the use-default-value substitution. To show this in action, let’s look at a function which takes an optional parameter. A naive approach may be:
say_hi_to() {
    if [ -n "${1:-}" ]; then
        name="$1"
    else
        name="John Doe"
    fi
    echo "Hi $name"
}
say_hi_to chris # prints 'Hi chris'
say_hi_to # prints 'Hi John Doe'
But we can reduce this to a single line using the parameter expansion system:
say_hi_to() {
    echo "Hi ${1:-John Doe}"
}
See the spec for more info: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
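Beyond defaults, the prefix- and suffix-stripping expansions are also worth knowing. A small sketch on a made-up file path:

```shell
#!/bin/sh
path="/tmp/photos/cat.jpg"

fname="${path##*/}"   # strip longest prefix matching */   -> cat.jpg
dir="${path%/*}"      # strip shortest suffix matching /*  -> /tmp/photos
ext="${fname##*.}"    # strip longest prefix matching *.   -> jpg
base="${fname%.*}"    # strip shortest suffix matching .*  -> cat

echo "$dir $fname $base $ext"
```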
1.6. My preferences in shell scripting
Below are some of my preferences for writing shell scripts. These are mostly nitpicks; feel free to disagree, as they are mostly cosmetic.
1.6.1. Use short-circuiting logic instead of if/else
Instead of doing:
if ! some_command; then
    handle_it
fi
I tend to prefer
some_command || handle_it
You may also use a block of commands in the second half as well:
some_command || {
    first_handler
    second_handler
}
I find this less visually noisy than the if ...; then ... fi syntax.
1.6.2. Pipes at the start of the line
For long pipelines it can be nice to break it up onto multiple lines. Most examples you see will have the pipes on the end of the line like so:
some_command | \
    foo | \
    bar -x -y | \
    baz --whack
I prefer the following form better:
some_command    \
    | foo       \
    | bar -x -y \
    | baz --whack
Note, I’ve also aligned the backslashes; I like things being aligned. I also like the pipes at the beginning of the line because it makes me feel like a cool Haskell programmer (which I am not)! Rust match statements with many arms look like this too, so maybe that’s why I like it.
1.7. Use Shellcheck
Shellcheck (https://www.shellcheck.net/) is a lovely tool written in Haskell that parses and lints shell scripts for common issues.
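For instance, one of its most common findings (SC2086) is an unquoted variable expansion, which the shell word-splits on whitespace. A small demonstration of why that matters, using positional parameters to count the resulting arguments:

```shell
#!/bin/sh
# Shellcheck warns (SC2086) when an expansion like $file is
# unquoted. With a space in the value the difference is visible:
file="my photos"

set -- $file              # unquoted: splits into two arguments
unquoted_count=$#
set -- "$file"            # quoted: stays a single argument
quoted_count=$#

echo "unquoted: $unquoted_count args, quoted: $quoted_count arg"
```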
1.8. Further reading
- The Advanced Bash Scripting Guide: https://tldp.org/LDP/abs/html/
- IEEE Std 1003.1-2017 (POSIX spec for the Shell Command Language) https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html
- POSIX shell scripting guide: https://github.com/cloudstreet-dev/POSIX-Shell-Scripting/tree/main
- man dash, man bash