Z-Shell completion system introduction

Introduction

Zsh's completion system (compsys) is one of the most praised and one of the most complex parts of the shell. That holds for users as well as for developers.

People asking for gentle introductions hit the `#zsh' IRC channel on freenode every once in a while; often enough that we added a question about it to the wikifaq (question 15 at the time of writing). The usual complaint is that `zshcompsys(1)' and `zshcompwid(1)' are dense and dry to read and that they lack a general overview of how the system works.

This is an attempt at such a general overview.

It's a set of functions is what it is

First of all, compsys is made up of functions. Shell functions, like this:

hello() {
    printf 'Hello world.\n'
}

That would create a function called `hello' that can be called just like any other command, and it would print "Hello world." to your terminal. The completion system's functions are shell functions, too. They get called automatically when you hit the tab key, and they use a set of special commands to interact with the shell's line editor, which will in turn present possible completions back to you.

But I've seen parts of the completion system, and that was just code in files, with a funny looking first line...

...I hear you scream.

You're missing the concept of zsh's function path (the `$fpath' array). It is a list of directories containing files, and each file holds the code of a function that is named exactly like the file. If you type "print -l $fpath" at the shell prompt, you should see a list of those directories. For ordinary functions, those directories are not scanned by default. You have to tell zsh which functions to load from files, and that is done with the `autoload' builtin. We could drop a file called `hello' into one of those directories with just one line:

printf 'Hello world.\n'

That gives us a function called `hello' once we tell zsh to load its code from a file in `$fpath' the first time it is referenced:

autoload -Uz hello

"-Uz"? Yes, that is "the right thing"[tm] almost always, so I won't discuss it here. See the manual for details.

Minor loadable functions wizardry

Sometimes, when completions get more complex, you need to define additional functions within such a function file. And you can just do that.

However, if you do, it makes sense to also explicitly define a function named like the file, and to call it at the very end of the file.

What?

Here is an example (say this file is named "_foo"):

_bar() {
    echo "This is bar()"
}

_foo() {
    _bar
    echo "This is foo()"
}

_foo "$@"

So there's a function `_foo' defined inside the very file that autoloading reads to create `_foo'. The first call runs the whole file, which defines the helper functions and replaces `_foo' with the definition given in the file. This is useful for creating helper functions when the function is called for the first time. You can also do initialisations that need to happen only once.

Among others, the _tmux and _git completions do that. The _tmux completion is an example that also does initialisations upon the first call.

The last line is important: it calls the newly defined function with the same arguments as the original call. That makes sure the actual functionality also runs when the function is called for the first time.

But I don't have to autoload any of the completion functions manually, what gives?

Almost true again. You are calling one function from compsys, and that is `compinit'. It initialises the completion system. In particular, it looks at files in `$fpath' whose names start with an underscore ("_" - by convention, those are functions that provide completion code for something) and finds out in which situation to use their code. This is where the weird-looking first line of a command's completion comes into play.

Let's take a look at the completion for the curses-based mail user agent `elm':

#compdef elm

_arguments -s \
  '::recipient:_email_addresses' \
  '-a[use the arrow pointer regardless]' \
  '-A+[attach file]:file attachment:_files' \
  '-c[check the given aliases only]:*:alias' \
  '-d+[set debug level]:debug level' \
  '-f+[specify mailbox to load]:mailbox: _mailboxes' \
  '-h[display help]' \
  '-i+[specify file to include in message]:include file:_files' \
  '-m[turn off menu, using more of the screen]' \
  '-s+[specify a subject]:subject:' \
  "-t[don't use termcap/terminfo ti/te entries]" \
  '-V[enable sendmail voyeur mode]' \
  '-v[display elm version]' \
  '-w[write .elm/elmrc]' \
  '-z[start only if new messages]'

The "#compdef elm" instructs compsys to use the code from that file when the word in command position for the current cursor position is "elm".

The rest of the code is just one call to `_arguments' (a powerful and complex helper function that can deal with many MANY situations involving standard getopt(3)-style as well as GNU-style long option handling).

So what happens from the shell's startup to the point where you're doing this:

% elm -A <TAB>

Here is what:

The shell starts and somewhere in your `.zshrc' file you are calling "autoload -Uz compinit; compinit". Now, compsys is online.
While compinit is running it will find the `_elm' file in one directory of `$fpath'.
Compinit will read the first line of that file, find out that it's a completion for a command called "elm" and make a note of that in a mapping for later.
Then compinit will call `autoload' for the "_elm" file, so its code is loaded from the file when it is referenced for the first time.
At the prompt, you typed "elm -A " and pressed the tab key, which will set the completion system in motion.
Compsys recognises that, for the current cursor position, the word in command position is "elm".
It'll look up which completion function is in charge for that command in the mapping it made during startup. It'll find that that's "_elm".
That function gets called and (when it is run for the first time) zsh automatically loads its code from the "_elm" file in `$fpath'.
The `_arguments' function analyses the situation and figures out that it needs to handle an argument to the "-A" option of the command, which it delegates to the `_files' function - as specified in the option's optspec.

The result is that you are being presented with a list of files, which is useful for elm's "-A" option.

And that is also all that's happening. Compsys is just a set of functions whose code lives in files that, by convention, have names starting with an underscore. Those files sit in directories listed in the shell's function search path, and the functions are marked for automatic loading during the `compinit' run in your zshrc file.

What about zstyle then? I thought that was the completion system.

Wrong. Common misconception, though. Actually, `zstyle' is not even part of compsys.

It is a system for expressing and storing context-sensitive configuration information. That may sound scary, but it's really not. `zstyle' is just a configuration system.

Compsys happens to use it (but other sub-systems of the shell do, too - like `vcs_info'). That makes sense, because compsys is very context-sensitive. (Legend has it that the zstyle configuration system was introduced to express the level of context sensitivity that compsys demands from a configuration system. But that still does not make it part of compsys.)

Systems that use zstyle keep track of the currently active context and describe that context in a string (for example: ":completion::complete:ls::").

`zstyle' gets called like this:

% zstyle "context-pattern" style value

The "context-pattern" is matched against the active context string when the system looks up the value of a style. For example:

% zstyle ':completion:*' verbose yes

The completion system always uses ":completion:" as a prefix when looking up styles (the context for compsys can become pretty complex, but that's beyond the scope of this article, see the manual for details), so ":completion:*" will always match when a style is looked up while compsys is running. And so, with the above command, any time the `verbose' style is looked up in compsys, zstyle will yield the value `yes', unless there is another setting for the `verbose' style that is more specific than ":completion:*", like:

% zstyle ':completion:*:complete:ls:*:*' verbose no

The rule of thumb is: The longer the pattern, the more specific it is.

And that's all again.

A style setting with a very general context pattern can be used to set default values. And style settings with more specific patterns can be used to override that default wherever it is needed.

Here are a few example styles that will give you a reasonable experience from a z-shell instance without any other completion setup:

# Load compsys and one of its fancy modules
zmodload zsh/complist
autoload -Uz compinit
compinit

# And set some styles...
zstyle ':completion:*' completer _complete _approximate
zstyle ':completion:*:descriptions' format "- %d -"
zstyle ':completion:*:corrections' format "- %d - (errors %e)"
zstyle ':completion:*:default' list-prompt '%S%M matches%s'
zstyle ':completion:*' group-name ''
zstyle ':completion:*:manuals' separate-sections true
zstyle ':completion:*:manuals.(^1*)' insert-sections true
zstyle ':completion:*' menu select
zstyle ':completion:*' verbose yes
zstyle ':completion:*' rehash yes
zstyle -e ':completion:*:approximate:*' max-errors \
          'reply=( $(( ($#PREFIX + $#SUFFIX) / 3 )) )'

For practice, find out what the last part of the context pattern in the examples is about (the "descriptions" and "corrections" part). And find the documentation for the styles used in the example (that would be "format", "list-prompt", "menu" etc). And find out what the `-e' option of `zstyle' does.

Putting it all together

I didn't give much hands-on code in this article. And that wasn't the point. The point was to give a very high level idea of how the completion system works. To give you an idea where to look and where to put things.

Say you want to write a new completion function for your own awesome program "foobar". You'd pick the completion function name "_foobar" (unless it's already taken) and put the following into the first line:

#compdef foobar

And then you can do whatever gets the job done. There is always more than one way to skin a cat when it comes to completion code.
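
As a starting point, a complete `_foobar' file could look like the following sketch. The options "-h", "-v" and "-o" are invented for the imaginary program; the optspec syntax is the same one the `_elm' example uses.

```shell
#compdef foobar

# Sketch for the imaginary "foobar" program; all three options are made up.
_arguments -s \
  '-h[display help]' \
  '-v[be verbose]' \
  '-o+[write output to file]:output file:_files' \
  '*:input file:_files'
```

Drop the file into a directory listed in `$fpath' and start a new shell, so that `compinit' picks it up.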

To get on the right track, you should think of a command that already has a completion and resembles the way your command takes its options (if you didn't invent an entirely new way to handle command line arguments, there should be plenty of example code).

There are very short completion functions (like _elm above - that's actually all its code); and there are massively large completion functions, like _git, _perforce or _tmux that jump through every conceivable hoop in order to provide the user with the most accurate and helpful completion candidates possible.

This marks the end of this short introduction. Compsys has much MUCH more to offer. So much more that you could easily fill an entire book with discussions about its features and best practices. Way too much for a cute little blog article.