------------------------------
GUIDELINES FOR ZSH DEVELOPMENT
------------------------------

Zsh is currently developed and maintained by the Zsh Development Group.
This development takes place by mailing list.  Check the META-FAQ for the
various zsh mailing lists and how to subscribe to them.  The development
is very open and anyone is welcomed and encouraged to join and contribute.
Because zsh is a very large package whose development can sometimes
be very rapid, I kindly ask that people observe a few guidelines when
contributing patches and feedback to the mailing list.  These guidelines
are very simple and hopefully should make for a more orderly development
of zsh.

Patches
-------

* Send all patches to the mailing list rather than directly to me.

* Send only context diffs "diff -c oldfile newfile".  They are much
  easier to read and understand while also allowing the patch program
  to patch more intelligently.  Please make sure the filenames in
  the diff header are relative to the top-level directory of the zsh
  distribution; for example, it should say "Src/init.c" rather than
  "init.c" or "zsh/Src/init.c".

* Please put only one bug fix or feature enhancement in a single patch and
  only one patch per mail message.  This helps me to multiplex the many
  (possibly conflicting) patches that I receive for zsh.  You shouldn't
  needlessly split patches, but send them in the smallest LOGICAL unit.

* If a patch depends on other patches, then please say so.  Also please
  mention what version of zsh this patch is for.

* Please test your patch and make sure it applies cleanly. It takes
  considerably more time to manually merge a patch into the baseline code.

* There is now a zsh patch archive.  To have your patches appear in the
  archive, send them to the mailing list with a Subject: line starting
  with "PATCH:".

C coding style
--------------

* The primary language is ANSI C as defined by the 1989 standard, but the
  code should always be compatible with late K&R era compilers ("The C
  Programming Language" 1st edition, plus "void" and "enum").  There are
  many hacks to avoid the need to actually restrict the code to K&R C --
  check out the configure tests -- but always bear the compatibility
  requirements in mind.  In particular, preprocessing directives must
  have the "#" unindented, and string pasting is not available.

* Conversely, there are preprocessor macros to provide safe access to some
  language features not present in pure ANSI C, such as variable-length
  arrays.  Always use the macros if you want to use these facilities.

* Avoid writing code that generates warnings under gcc with the default
  options set by the configure script.  For example, write
  "if ((foo = bar))" rather than "if (foo = bar)".

* Please try not to use lines longer than 79 characters.

* The indent/brace style is Kernighan and Ritchie with 4-character
  indentation (with leading tab characters replacing sequences of
  8 spaces).  This means that the opening brace is the last character
  on the line of the if/while/for/do statement and the closing brace
  has its own line:

      if (foo) {
	  do that
      }

* Put only one simple statement on a line.  The body of an if/while/for/do
  statement has its own line with 4-character indentation even if there
  are no braces.

* Do not use space between the function name and the opening parenthesis.
  Use space after if/for/while.  Use space after type casts.

* Do not use (unsigned char) casts since some compilers do not handle
  them properly.  Use the provided STOUC(X) macro instead.

* If you use emacs 19.30 or newer you can put the following line in your
  ~/.emacs file to make these formatting rules the default:

    (add-hook 'c-mode-common-hook (function (lambda () (c-set-style "BSD"))))

* Function declarations must look like this:

  /**/
  int
  foo(char *s, char **p)
  {
      function body
  }

  There must be an empty line, a line with "/**/", a line with the
  type of the function, and finally the name of the function with typed
  arguments.  These lines must not be indented.  The script generating
  function prototypes and the ansi2knr program depend on this format.
  If the function is not used outside the file it is defined in, it
  should be declared "static"; this keyword goes on the type line,
  before the return type.

* Global variable declarations must similarly be preceded by a
  line containing only "/**/", for the prototype generation script.
  The declaration itself should be all on one line (except for multi-line
  initialisers).

* Leave a blank line between the declarations and statements in a compound
  statement, if both are present.  Use blank lines elsewhere to separate
  groups of statements in the interests of clarity.  There should never
  be two consecutive blank lines.
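
Several of the layout rules above can be seen together in a short
sketch.  The function `count_char' is a hypothetical name invented for
this illustration and is not part of zsh; the block only shows the
"/**/" marker, the type line, the brace and indentation style, the
blank line after declarations, and the doubled parentheses around an
assignment used as a condition.

```c
#include <assert.h>
#include <string.h>

/**/
static int
count_char(const char *s, int c)
{
    const char *p;
    int n = 0;

    assert(s != NULL);
    /* searching for NUL would walk past the end of the string */
    if (!c)
	return 0;
    /* doubled parentheses: the assignment is intentional */
    while ((p = strchr(s, c))) {
	n++;
	s = p + 1;
    }
    return n;
}
```

Note that the second indentation level is a single tab, as required
by the rule about leading tab characters.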

Modules
-------

Modules are described by a file named `foo.mdd' for a module
`foo'. This file is actually a shell script that will be sourced when
zsh is built. To describe the module it can/should set the following
shell variables:

  - moddeps         modules on which this module depends (default none)
  - nozshdep        non-empty indicates no dependence on the `zsh' pseudo-module
  - alwayslink      if non-empty, always link the module into the executable
  - autobins        builtins defined by the module, for autoloading
  - autoinfixconds  infix condition codes defined by the module, for
                    autoloading (without the leading `-')
  - autoprefixconds like autoinfixconds, but for prefix condition codes
  - autoparams      parameters defined by the module, for autoloading
  - objects         .o files making up this module (*must* be defined)
  - proto           .pro files for this module (default generated from $objects)
  - headers         extra headers for this module (default none)
  - hdrdeps         extra headers on which the .mdh depends (default none)
  - otherincs       extra headers that are included indirectly (default none)

Be sure to put the values in quotes. For further enlightenment have a
look at the `mkmakemod.sh' script in the Src directory of the
distribution.

Modules have to define four functions which will be called automatically
by the zsh core. The first one, named `setup_foo' for a module named
`foo', should set up any data needed in the module, at least any data
other modules may be interested in. The second one, named `boot_foo',
should register all builtins, conditional codes, and function wrappers
(i.e. anything that will be visible to the user) and will be called
after the `setup'-function. 
The third one, named `cleanup_foo' for module `foo', is called when the
user tries to unload a module and should de-register the builtins
etc. The last function, `finish_foo', is called when the module is
actually unloaded and should finalize all the data initialized in the
`setup'-function. Since the last two functions are only executed when
the module is used as a dynamically loaded module, you can surround
them with `#ifdef MODULE' and `#endif'.
In short, the `cleanup'-function should undo what the `boot'-function
did, and the `finish'-function should undo what the `setup'-function
did.
All of these functions should return zero if they succeeded and
non-zero otherwise.
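
The pairing of the four functions can be sketched as follows.  This is
a standalone illustration, not code for the zsh tree: `struct module'
is a stand-in that models only the `nam' field, `foo_buffer' and
`foo_registered' are hypothetical module-global state, malloc() stands
in for whatever allocator the module would use, and the `setup' and
`finish' functions are assumed to take the same `Module' argument as
the `boot' and `cleanup' functions shown later in this section.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the real module structure; only the name is modelled. */
struct module {
    char *nam;
};
typedef struct module *Module;

static char *foo_buffer;	/* hypothetical module-global data */
static int foo_registered;	/* hypothetical "interface visible" flag */

/**/
int
setup_foo(Module m)
{
    assert(m != NULL);
    foo_buffer = (char *) malloc(64);
    if (!foo_buffer)
	return 1;
    foo_buffer[0] = '\0';
    return 0;
}

/**/
int
boot_foo(Module m)
{
    /* real code would register builtins, condition codes, etc. here */
    (void) m;
    foo_registered = 1;
    return 0;
}

/**/
int
cleanup_foo(Module m)
{
    /* undo boot_foo(): de-register the user-visible interface */
    (void) m;
    foo_registered = 0;
    return 0;
}

/**/
int
finish_foo(Module m)
{
    /* undo setup_foo(): release the module-global data */
    (void) m;
    free(foo_buffer);
    foo_buffer = NULL;
    return 0;
}
```

The data allocated in `setup_foo' survives `cleanup_foo' and is only
released in `finish_foo'.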

Builtins are described in a table, for example:

  static struct builtin bintab[] = {
    BUILTIN("example", 0, bin_example, 0, -1, 0, "flags", NULL),
  };

Here `BUILTIN(...)' is a macro that simplifies the description. Its
arguments are:
  - the name of the builtin as a string
  - optional flags (see BINF_* in zsh.h)
  - the C-function implementing the builtin
  - the minimum number of arguments the builtin needs
  - the maximum number of arguments the builtin can handle or -1 if
    the builtin can get any number of arguments
  - an integer that is passed to the handler function and can be used
    to distinguish builtins if the same C-function is used to
    implement multiple builtins
  - the options the builtin accepts, given as a string containing the
    option characters (the above example makes the builtin accept the
    options `f', `l', `a', `g', and `s')
  - and finally an optional string containing option characters that
    will always be reported as set when calling the C-function (this,
    too, can be used when using one C-function to implement multiple
    builtins)

The definition of the handler function looks like:

  /**/
  static int
  bin_example(char *nam, char **args, char *ops, int func)
  {
    ...
  }

The special comment /**/ is used by the zsh Makefile to generate the
`*.pro' files. The first argument of the function is the name under
which this function was invoked (the name of the builtin; functions
that implement more than one builtin need this information). The
second argument is the array of arguments *excluding* the options
that were defined in the struct and which are handled by the calling
code. These options are given as the third argument. It is an array
of 256 characters in which the n'th element is non-zero if the option
with ASCII-value n was set (i.e. you can easily test if an option was
used by `if (ops['f'])' etc.). The last argument is the integer value
from the table (the sixth argument to `BUILTIN(...)'). The integer
value returned by the function is the value returned by the builtin
at shell level.
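
How a handler might consult the option array can be sketched in a
standalone form.  The handler has the signature given above, but its
behaviour (failing when `-f' is given without arguments) is invented
for the example, and `call_example' is a hypothetical stand-in for the
calling code in the zsh core, which is what really fills in `ops'.

```c
#include <assert.h>
#include <string.h>

/**/
static int
bin_example(char *nam, char **args, char *ops, int func)
{
    int n = 0;

    assert(args != NULL && ops != NULL);
    (void) nam;		/* name the builtin was invoked under */
    (void) func;	/* integer from the table, unused here */
    while (args[n])
	n++;
    /* hypothetical rule: `-f' requires at least one argument */
    if (ops['f'] && !n)
	return 1;
    return 0;
}

/* Stand-in for the caller: mark each given option character as set. */
static int
call_example(char *optchars, char **args)
{
    char ops[256];

    memset(ops, 0, sizeof(ops));
    while (*optchars)
	ops[(int) *optchars++] = 1;
    return bin_example("example", args, ops, 0);
}
```

The stand-in caller assumes plain ASCII option letters, so the index
into `ops' is never negative.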

To register builtins in zsh and thereby make them visible to the
user, the function `addbuiltins()' is used:

  /**/
  int
  boot_example(Module m)
  {
    int ret;

    ret = addbuiltins(m->nam, bintab, sizeof(bintab)/sizeof(*bintab));
    ...
  }

The arguments are the name of the module (taken from the argument in
the example), the table of definitions and the number of entries in
this table.
The return value is 1 if everything went fine, 2 if at least one
builtin couldn't be defined, and 0 if none of the builtins could be
defined.
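
Since a boot function returns zero on success while `addbuiltins()'
uses 1 for success, the result has to be translated.  The following
standalone sketch shows one way to do that; `boot_status', `TABSIZE'
and `demo_table' are hypothetical names, and treating a partial
registration (2) as a failure is a choice made for this sketch only,
not something this guide prescribes.

```c
#include <assert.h>

/* Number of entries in a table, as in sizeof(bintab)/sizeof(*bintab). */
#define TABSIZE(T) (sizeof(T)/sizeof(*(T)))

/* Dummy table used only to show the counting idiom. */
static int demo_table[] = { 10, 20, 30 };

/**/
static int
boot_status(int added)
{
    /* 1 = all defined, 2 = at least one failed, 0 = none defined */
    assert(added >= 0 && added <= 2);
    return added != 1;
}
```

A module that prefers to stay loaded with a partial interface would
test for `added == 0' instead.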

To de-register builtins use the function `deletebuiltins()':

  /**/
  int
  cleanup_example(Module m)
  {
    deletebuiltins(m->nam, bintab, sizeof(bintab)/sizeof(*bintab));
    ...
  }

The arguments and the return value are the same as for `addbuiltins()'.

The definition of condition codes in modules is equally simple. First
we need a table with the descriptions:

  static struct conddef cotab[] = {
    CONDDEF("len", 0, cond_p_len, 1, 2, 0),
    CONDDEF("ex", CONDF_INFIX, cond_i_ex, 0, 0, 0),
  };

Again a macro is used, with the following arguments:

  - the name of the condition code without the leading hyphen
    (i.e. the example makes the condition codes `-len' and `-ex'
    usable in `[[...]]' constructs)
  - an optional flag which for now can only be CONDF_INFIX; if this is 
    given, an infix operator is created (i.e. the above makes
    `[[ -len str ]]' and `[[ s1 -ex s2 ]]' available)
  - the C-function implementing the conditional
  - for non-infix condition codes the next two arguments give the
    minimum and maximum number of strings the conditional can handle
    (i.e. `-len' can get one or two strings); as with builtins giving
    -1 as the maximum number means that the conditional accepts any
    number of strings
  - finally as the last argument an integer that is passed to the
    handler function that can be used to distinguish different
    condition codes if the same C-function implements more than one of 
    them

The definition for the function looks like:

  /**/
  static int
  cond_p_len(char **a, int id)
  {
    ...
  }

The first argument is an array containing the strings (NULL-terminated
like the array of arguments for builtins), the second argument is the
integer value stored in the table (the last argument to `CONDDEF(...)').
The value returned by the function should be non-zero if the condition 
is true and zero otherwise.

Note that no preprocessing is done on the strings. This means that
no substitutions are performed on them and that they will be
tokenized. There are three helper functions available:

  - char *cond_str(args, num)
    The first argument is the array of strings the handler function
    got as an argument and the second one is an index into this array.
    The return value is the num'th string from the array with
    substitutions performed and untokenized.
  - long cond_val(args, num)
    The arguments are the same as for cond_str(). The return value is
    the result of the mathematical evaluation of the num'th string
    from the array.
  - int cond_match(args, num, str)
    Again, the first two arguments are the same as for the other
    functions. The third argument is any string. The result of the
    function is non-zero if the num'th string from the array, taken
    as a glob pattern, matches the given string.
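
A handler for the `-len' condition from the table above might use the
helpers roughly as follows.  The sketch is standalone: the versions of
`cond_str()' and `cond_val()' defined here are simplified stand-ins
(no substitution, no untokenizing, decimal numbers only) so that the
block compiles without the zsh headers, and the meaning given to
`-len' (one string: true if empty; two strings: true if the first has
the length given by the second) is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in: the real helper also substitutes and untokenizes. */
static char *
cond_str(char **args, int num)
{
    return args[num];
}

/* Stand-in: the real helper does full mathematical evaluation. */
static long
cond_val(char **args, int num)
{
    return strtol(args[num], NULL, 10);
}

/**/
static int
cond_p_len(char **a, int id)
{
    char *s = cond_str(a, 0);

    assert(s != NULL);
    (void) id;		/* only one condition uses this handler */
    if (a[1])
	return (long) strlen(s) == cond_val(a, 1);
    return !s[0];
}
```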

Registering and de-registering condition codes with the shell is
almost exactly the same as for builtins, using the functions
`addconddefs()' and `deleteconddefs()' instead:

  /**/
  int
  boot_example(Module m)
  {
    int ret;

    ret = addconddefs(m->nam, cotab, sizeof(cotab)/sizeof(*cotab));
    ...
  }

  /**/
  int
  cleanup_example(Module m)
  {
    deleteconddefs(m->nam, cotab, sizeof(cotab)/sizeof(*cotab));
    ...
  }

Arguments and return values are the same as for the functions for
builtins.

For defining parameters, a module can call `createparam()' directly or 
use a table to describe them, e.g.:

  static struct paramdef patab[] = {
    PARAMDEF("foo", PM_INTEGER, NULL, get_foo, set_foo, unset_foo),
    INTPARAMDEF("exint", &intparam),
    STRPARAMDEF("exstr", &strparam),
    ARRPARAMDEF("exarr", &arrparam),
  };

There are four macros used:

  - PARAMDEF() gets as arguments:
    - the name of the parameter
    - the parameter flags to set for it (from the PM_* flags defined
      in zsh.h)
    - optionally a pointer to a variable holding the value of the
      parameter
    - three functions that will be used to get the value of the
      parameter, store a value in the parameter, and unset the
      parameter
  - the other macros provide simple ways to define the most common
    types of parameters; they get the name of the parameter and a
    pointer to a variable holding the value as arguments; they are
    used to define integer-, scalar-, and array-parameters, so the
    variables whose addresses are given should be of type `long',
    `char *', and `char **', respectively

For a description of how to write functions for getting or setting the 
value of parameters, or how to write a function to unset a parameter,
see the description of the following functions in the `params.c' file:

  - `intvargetfn()' and `intvarsetfn()' for integer parameters
  - `strvargetfn()' and `strvarsetfn()' for scalar parameters
  - `arrvargetfn()' and `arrvarsetfn()' for array parameters
  - `stdunsetfn()' for unsetting parameters

Note that if one defines parameters using the last two macros (for
scalars and arrays), the variable holding the value should be
initialized to either `NULL' or to a piece of memory created with
`zalloc()'. But this memory should *not* be freed in the
finish-function of the module because that will be taken care of by
the `deleteparamdefs()' function described below.

To register the parameters in the zsh core, the function
`addparamdefs()' is called as in:

  /**/
  int
  boot_example(Module m)
  {
    int ret;

    ret = addparamdefs(m->nam, patab, sizeof(patab)/sizeof(*patab));
    ...
  }

The arguments and the return value are as for the functions used to
add builtins and condition codes and like these, it should be called
in the boot-function of the module. To remove the parameters defined,
the function `deleteparamdefs()' should be called, again with the same 
arguments and the same return value as for the functions to remove
builtins and condition codes:

  /**/
  int
  cleanup_example(Module m)
  {
    deleteparamdefs(m->nam, patab, sizeof(patab)/sizeof(*patab));
    ...
  }

Finally, modules can define wrapper functions. These functions are
called whenever a shell function is to be executed.

The definition is simple:

  static struct funcwrap wrapper[] = {
    WRAPDEF(ex_wrapper),
  };

The macro `WRAPDEF(...)' gets the C-function as its only argument.
This function should be defined like:

  /**/
  static int
  ex_wrapper(List list, FuncWrap w, char *name)
  {
    ...
    runshfunc(list, w, name);
    ...
    return 0;
  }

The first two arguments should only be used to pass them to
`runshfunc()' which will execute the shell function. The last argument 
is the name of the function to be executed. The arguments passed to
the function can be accessed via the global variable `pparams' (a
NULL-terminated array of strings).
The return value of the wrapper function should be zero if it calls
`runshfunc()' itself and non-zero otherwise. This can be used for
wrapper functions that only need to run under certain conditions or
that don't need to clean anything up after the shell function has
finished:

  /**/
  static int
  ex_wrapper(List list, FuncWrap w, char *name)
  {
    if (wrapper_need_to_run) {
      ...
      runshfunc(list, w, name);
      ...
      return 0;
    }
    return 1;
  }
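
The return-value convention can be modelled in a standalone sketch.
Everything here is a simplified stand-in: `run_body' replaces
`runshfunc()', the wrapper drops the zsh-specific arguments, and
`call_function' is a hypothetical caller written on the assumption
that the function is run without the wrapper when the wrapper returns
non-zero.

```c
#include <assert.h>

static int wrapper_need_to_run;
static int body_runs;

/* Stand-in for runshfunc(): only counts executions. */
static void
run_body(void)
{
    body_runs++;
}

/* Same shape as ex_wrapper above, without the zsh arguments. */
static int
ex_wrapper(void)
{
    if (wrapper_need_to_run) {
	run_body();
	return 0;
    }
    return 1;
}

/* Hypothetical caller: either way the body runs exactly once. */
static void
call_function(void)
{
    assert(body_runs >= 0);
    if (ex_wrapper())
	run_body();
}
```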

Inside these wrapper functions the global variable `sfcontext' will be 
set to a value indicating the circumstances under which the shell
function was called. It can have any of the following values:

  - SFC_DIRECT:   the function was invoked directly by the user
  - SFC_SIGNAL:   the function was invoked as a signal handler
  - SFC_HOOK:     the function was automatically invoked as one of the
                  special functions known by the shell (like `chpwd')
  - SFC_WIDGET:   the function was called from the zsh line editor as a
                  user-defined widget
  - SFC_COMPLETE: the function was called from the completion code
                  (e.g. with `compctl -K func')

If a module invokes a shell function (e.g. as a hook function), the
value of this variable should only be changed temporarily and restored
to its previous value after the shell function has finished.

There is a problem when the user tries to unload a module that has
defined wrappers from a shell function. In this case the module can't
be unloaded immediately since the wrapper function is still on the
call stack. The zsh code delays unloading modules until all wrappers
from them have finished. To hide this from the user, the module's
cleanup function is run immediately so that all builtins, condition
codes, and wrapper functions defined by the module are
de-registered. But if there is some module-global state that has to be
finalized (e.g. some memory that has to be freed) and that is used by
the wrapper functions, finalizing this data in the cleanup function
won't work.
This is why there are two functions each for the initialization and
finalization of modules. The `boot'- and `cleanup'-functions are run
whenever the user calls `zmodload' or `zmodload -u' and should only
register or de-register the module's interface that is visible to the
user. Anything else should be done in the `setup'- and
`finish'-functions. Otherwise modules that other modules depend upon
may destroy their state too early and wrapper functions in the latter
modules may stop working since the state they use is already destroyed.

Documentation
-------------

* Edit only the .yo files.  All other formats (man pages, TeXinfo, HTML,
  etc.) are automatically generated from the yodl source.

* Always use the correct markup.  em() is used for emphasis, and bf()
  for citations.  tt() marks text that is literal input to or output
  from the shell.  var() marks metasyntactic variables.

* In addition to appropriate markup, always use quotes (`') where
  appropriate.  Specifically, use quotes to mark text that is not a part
  of the actual text of the documentation (i.e., that is being quoted).
  In principle, all combinations of quotes and markup are possible,
  because the purposes of the two devices are completely orthogonal.
  For example,

      Type `tt(xyzzy)' to let zsh know you have played tt(advent).
      Saying `plugh' aloud doesn't have much effect, however.

  In this case, "zsh" is normal text (a name), "advent" is a command name
  occurring in the main text, "plugh" is a normal word that is being quoted
  (it's the user that says `plugh', not the documentation), and "xyzzy"
  is some text to be typed literally that is being quoted.