Bash Argument Parsing

Bash is a wonderful and terrible language. It can provide extremely elegant solutions to common text processing and system management tasks, but it can also drag you into the depths of convoluted workarounds to accomplish menial jobs.

Recently I ran into one of these situations when trying to parse varying inputs in a control script. The standard tools for argument parsing were breaking at every turn as I changed argument order and added flag options for flexibility. The remainder of this article details the problems and the solution I came up with for laying argument parsing woes to rest.

Goals

The ideal argument parser will recognize both short and long option flags, preserve the order of positional arguments, and allow both options and arguments to be specified in any order relative to each other. In other words both of these commands should result in the same parsed arguments:

$ ./foo bar -a baz --long thing
$ ./foo -a baz bar --long thing
getopt(s)

The most widely recognized tools for parsing arguments in bash are getopt and getopts. Though both tools are similar in name, they’re very different. getopt is a GNU library that parses argument strings and supports both short and long form flags. getopts is a bash builtin that also parses argument strings but only supports short form flags. Typically, if you need to write a simple script that only accepts one or two flag options you can do something like:

while getopts “:a:bc” opt; do
  case $opt in
    a) AOPT=$OPTARG ;;
  esac
done

This works great for simple option parsing, but things start to fall apart if you want to support long options or mixing options and positional arguments together.

A Better Way

As it turns out, it’s pretty easy to write your own arg parser once you understand the mechanics of the language. Doing so affords you the ability to cast all manner of spells to bend arguments to your will. Here’s a basic example:

#!/bin/bash
PARAMS=""
while (( "$#" )); do
  case "$1" in
    -a|--my-boolean-flag)
      MY_FLAG=0
      shift
      ;;
    -b|--my-flag-with-argument)
      if [ -n "$2" ] && [ ${2:0:1} != "-" ]; then
        MY_FLAG_ARG=$2
        shift 2
      else
        echo "Error: Argument for $1 is missing" >&2
        exit 1
      fi
      ;;
    -*|--*=) # unsupported flags
      echo "Error: Unsupported flag $1" >&2
      exit 1
      ;;
    *) # preserve positional arguments
      PARAMS="$PARAMS $1"
      shift
      ;;
  esac
done
# set positional arguments in their proper place
eval set -- "$PARAMS"

There’s a lot going on here so let’s break it down. First we set a variable PARAMS to save any positional arguments into for later. Next, we create a while loop that evaluates the length of the arguments array and exits when it reaches zero. Inside of the while loop, pass the first element in the arguments array through a case statement looking for either a custom flag or some default flag patterns. If the statement matches a flag, we do something (like the save the value to a variable) and we use the shift statement to pop elements off the front of the arguments array before the next iteration of the loop. If the statement matches a regular argument, we save it into a string to be evaluated later. Finally, after all the arguments have been processed, we set the arguments array to the list of positional arguments we saved using the set command.

With this new knowledge it becomes really easy to write robust and flexible bash utilities that can be deployed to multiple systems without worrying about the order or format of arguments.

Thanks to the following folks for helping make this snippet more robust:
– blogofstuff for suggesting a solution for a looping error

This post originally appeared on Medium.

#engineering