summary refs log tree commit diff
path: root/manual/setjmp.texi
blob: b924d582b1a40502daf08afd990ab27062b8c10a (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
@node Non-Local Exits, Signal Handling, Resource Usage And Limitation, Top
@c %MENU% Jumping out of nested function calls
@chapter Non-Local Exits
@cindex non-local exits
@cindex long jumps

Sometimes when your program detects an unusual situation inside a deeply
nested set of function calls, you would like to be able to immediately
return to an outer level of control.  This section describes how you can
do such @dfn{non-local exits} using the @code{setjmp} and @code{longjmp}
functions.

@menu
* Intro: Non-Local Intro.        When and how to use these facilities.
* Details: Non-Local Details.    Functions for non-local exits.
* Non-Local Exits and Signals::  Portability issues.
* System V contexts::            Complete context control a la System V.
@end menu

@node Non-Local Intro, Non-Local Details,  , Non-Local Exits
@section Introduction to Non-Local Exits

As an example of a situation where a non-local exit can be useful,
suppose you have an interactive program that has a ``main loop'' that
prompts for and executes commands.  Suppose the ``read'' command reads
input from a file, doing some lexical analysis and parsing of the input
while processing it.  If a low-level input error is detected, it would
be useful to be able to return immediately to the ``main loop'' instead
of having to make each of the lexical analysis, parsing, and processing
phases all have to explicitly deal with error situations initially
detected by nested calls.

(On the other hand, if each of these phases has to do a substantial
amount of cleanup when it exits---such as closing files, deallocating
buffers or other data structures, and the like---then it can be more
appropriate to do a normal return and have each phase do its own
cleanup, because a non-local exit would bypass the intervening phases and
their associated cleanup code entirely.  Alternatively, you could use a
non-local exit but do the cleanup explicitly either before or after
returning to the ``main loop''.)

In some ways, a non-local exit is similar to using the @samp{return}
statement to return from a function.  But while @samp{return} abandons
only a single function call, transferring control back to the point at
which it was called, a non-local exit can potentially abandon many
levels of nested function calls.

You identify return points for non-local exits by calling the function
@code{setjmp}.  This function saves information about the execution
environment in which the call to @code{setjmp} appears in an object of
type @code{jmp_buf}.  Execution of the program continues normally after
the call to @code{setjmp}, but if an exit is later made to this return
point by calling @code{longjmp} with the corresponding @w{@code{jmp_buf}}
object, control is transferred back to the point where @code{setjmp} was
called.  The return value from @code{setjmp} is used to distinguish
between an ordinary return and a return made by a call to
@code{longjmp}, so calls to @code{setjmp} usually appear in an @samp{if}
statement.

Here is how the example program described above might be set up:

@smallexample
@include setjmp.c.texi
@end smallexample

The function @code{abort_to_main_loop} causes an immediate transfer of
control back to the main loop of the program, no matter where it is
called from.

The flow of control inside the @code{main} function may appear a little
mysterious at first, but it is actually a common idiom with
@code{setjmp}.  A normal call to @code{setjmp} returns zero, so the
``else'' clause of the conditional is executed.  If
@code{abort_to_main_loop} is called somewhere within the execution of
@code{do_command}, then it actually appears as if the @emph{same} call
to @code{setjmp} in @code{main} were returning a second time with a value
of @code{-1}.

@need 250
So, the general pattern for using @code{setjmp} looks something like:

@smallexample
if (setjmp (@var{buffer}))
  /* @r{Code to clean up after premature return.} */
  @dots{}
else
  /* @r{Code to be executed normally after setting up the return point.} */
  @dots{}
@end smallexample

@node Non-Local Details, Non-Local Exits and Signals, Non-Local Intro, Non-Local Exits
@section Details of Non-Local Exits

Here are the details on the functions and data structures used for
performing non-local exits.  These facilities are declared in
@file{setjmp.h}.
@pindex setjmp.h

@comment setjmp.h
@comment ISO
@deftp {Data Type} jmp_buf
Objects of type @code{jmp_buf} hold the state information to
be restored by a non-local exit.  The contents of a @code{jmp_buf}
identify a specific place to return to.
@end deftp

@comment setjmp.h
@comment ISO
@deftypefn Macro int setjmp (jmp_buf @var{state})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c _setjmp ok
@c  __sigsetjmp(!savemask) ok
@c   __sigjmp_save(!savemask) ok, does not call sigprocmask
When called normally, @code{setjmp} stores information about the
execution state of the program in @var{state} and returns zero.  If
@code{longjmp} is later used to perform a non-local exit to this
@var{state}, @code{setjmp} returns a nonzero value.
@end deftypefn

@comment setjmp.h
@comment ISO
@deftypefun void longjmp (jmp_buf @var{state}, int @var{value})
@safety{@prelim{}@mtsafe{}@asunsafe{@ascuplugin{} @asucorrupt{} @asulock{/hurd}}@acunsafe{@acucorrupt{} @aculock{/hurd}}}
@c __libc_siglongjmp @ascuplugin @asucorrupt @asulock/hurd @acucorrupt @aculock/hurd
@c  _longjmp_unwind @ascuplugin @asucorrupt @acucorrupt
@c   __pthread_cleanup_upto @ascuplugin @asucorrupt @acucorrupt
@c     plugins may be unsafe themselves, but even if they weren't, this
@c     function isn't robust WRT async signals and cancellation:
@c     cleanups aren't taken off the stack right away, only after all
@c     cleanups have been run.  This means that async-cancelling
@c     longjmp, or interrupting longjmp with an async signal handler
@c     that calls longjmp may run the same cleanups multiple times.
@c    _JMPBUF_UNWINDS_ADJ ok
@c    *cleanup_buf->__routine @ascuplugin
@c  sigprocmask(SIG_SETMASK) dup @asulock/hurd @aculock/hurd
@c  __longjmp ok
This function restores current execution to the state saved in
@var{state}, and continues execution from the call to @code{setjmp} that
established that return point.  Returning from @code{setjmp} by means of
@code{longjmp} returns the @var{value} argument that was passed to
@code{longjmp}, rather than @code{0}.  (But if @var{value} is given as
@code{0}, @code{setjmp} returns @code{1}).@refill
@end deftypefun

There are a lot of obscure but important restrictions on the use of
@code{setjmp} and @code{longjmp}.  Most of these restrictions are
present because non-local exits require a fair amount of magic on the
part of the C compiler and can interact with other parts of the language
in strange ways.

The @code{setjmp} function is actually a macro without an actual
function definition, so you shouldn't try to @samp{#undef} it or take
its address.  In addition, calls to @code{setjmp} are safe in only the
following contexts:

@itemize @bullet
@item
As the test expression of a selection or iteration
statement (such as @samp{if}, @samp{switch}, or @samp{while}).

@item
As one operand of an equality or comparison operator that appears as the
test expression of a selection or iteration statement.  The other
operand must be an integer constant expression.

@item
As the operand of a unary @samp{!} operator, that appears as the
test expression of a selection or iteration statement.

@item
By itself as an expression statement.
@end itemize

Return points are valid only during the dynamic extent of the function
that called @code{setjmp} to establish them.  If you @code{longjmp} to
a return point that was established in a function that has already
returned, unpredictable and disastrous things are likely to happen.

You should use a nonzero @var{value} argument to @code{longjmp}.  While
@code{longjmp} refuses to pass back a zero argument as the return value
from @code{setjmp}, this is intended as a safety net against accidental
misuse and is not really good programming style.

When you perform a non-local exit, accessible objects generally retain
whatever values they had at the time @code{longjmp} was called.  The
exception is that the values of automatic variables local to the
function containing the @code{setjmp} call that have been changed since
the call to @code{setjmp} are indeterminate, unless you have declared
them @code{volatile}.

@node Non-Local Exits and Signals, System V contexts, Non-Local Details, Non-Local Exits
@section Non-Local Exits and Signals

In BSD Unix systems, @code{setjmp} and @code{longjmp} also save and
restore the set of blocked signals; see @ref{Blocking Signals}.  However,
the POSIX.1 standard requires @code{setjmp} and @code{longjmp} not to
change the set of blocked signals, and provides an additional pair of
functions (@code{sigsetjmp} and @code{siglongjmp}) to get the BSD
behavior.

The behavior of @code{setjmp} and @code{longjmp} in @theglibc{} is
controlled by feature test macros; see @ref{Feature Test Macros}.  The
default in @theglibc{} is the POSIX.1 behavior rather than the BSD
behavior.

The facilities in this section are declared in the header file
@file{setjmp.h}.
@pindex setjmp.h

@comment setjmp.h
@comment POSIX.1
@deftp {Data Type} sigjmp_buf
This is similar to @code{jmp_buf}, except that it can also store state
information about the set of blocked signals.
@end deftp

@comment setjmp.h
@comment POSIX.1
@deftypefun int sigsetjmp (sigjmp_buf @var{state}, int @var{savesigs})
@safety{@prelim{}@mtsafe{}@asunsafe{@asulock{/hurd}}@acunsafe{@aculock{/hurd}}}
@c sigsetjmp @asulock/hurd @aculock/hurd
@c  __sigsetjmp(savemask) @asulock/hurd @aculock/hurd
@c   __sigjmp_save(savemask) @asulock/hurd @aculock/hurd
@c    sigprocmask(SIG_BLOCK probe) dup @asulock/hurd @aculock/hurd
This is similar to @code{setjmp}.  If @var{savesigs} is nonzero, the set
of blocked signals is saved in @var{state} and will be restored if a
@code{siglongjmp} is later performed with this @var{state}.
@end deftypefun

@comment setjmp.h
@comment POSIX.1
@deftypefun void siglongjmp (sigjmp_buf @var{state}, int @var{value})
@safety{@prelim{}@mtsafe{}@asunsafe{@ascuplugin{} @asucorrupt{} @asulock{/hurd}}@acunsafe{@acucorrupt{} @aculock{/hurd}}}
@c Alias to longjmp.
This is similar to @code{longjmp} except for the type of its @var{state}
argument.  If the @code{sigsetjmp} call that set this @var{state} used a
nonzero @var{savesigs} flag, @code{siglongjmp} also restores the set of
blocked signals.
@end deftypefun

@node System V contexts,, Non-Local Exits and Signals, Non-Local Exits
@section Complete Context Control

The Unix standard provides one more set of functions to control the
execution path and these functions are more powerful than those
discussed in this chapter so far.  These function were part of the
original @w{System V} API and by this route were added to the Unix
API.  Beside on branded Unix implementations these interfaces are not
widely available.  Not all platforms and/or architectures @theglibc{}
is available on provide this interface.  Use @file{configure} to
detect the availability.

Similar to the @code{jmp_buf} and @code{sigjmp_buf} types used for the
variables to contain the state of the @code{longjmp} functions the
interfaces of interest here have an appropriate type as well.  Objects
of this type are normally much larger since more information is
contained.  The type is also used in a few more places as we will see.
The types and functions described in this section are all defined and
declared respectively in the @file{ucontext.h} header file.

@comment ucontext.h
@comment SVID
@deftp {Data Type} ucontext_t

The @code{ucontext_t} type is defined as a structure with as least the
following elements:

@table @code
@item ucontext_t *uc_link
This is a pointer to the next context structure which is used if the
context described in the current structure returns.

@item sigset_t uc_sigmask
Set of signals which are blocked when this context is used.

@item stack_t uc_stack
Stack used for this context.  The value need not be (and normally is
not) the stack pointer.  @xref{Signal Stack}.

@item mcontext_t uc_mcontext
This element contains the actual state of the process.  The
@code{mcontext_t} type is also defined in this header but the definition
should be treated as opaque.  Any use of knowledge of the type makes
applications less portable.

@end table
@end deftp

Objects of this type have to be created by the user.  The initialization
and modification happens through one of the following functions:

@comment ucontext.h
@comment SVID
@deftypefun int getcontext (ucontext_t *@var{ucp})
@safety{@prelim{}@mtsafe{@mtsrace{:ucp}}@assafe{}@acsafe{}}
@c Linux-only implementations in assembly, including sigprocmask
@c syscall.  A few cases call the sigprocmask function, but that's safe
@c too.  The ppc case is implemented in terms of a swapcontext syscall.
The @code{getcontext} function initializes the variable pointed to by
@var{ucp} with the context of the calling thread.  The context contains
the content of the registers, the signal mask, and the current stack.
Executing the contents would start at the point where the
@code{getcontext} call just returned.

The function returns @code{0} if successful.  Otherwise it returns
@code{-1} and sets @var{errno} accordingly.
@end deftypefun

The @code{getcontext} function is similar to @code{setjmp} but it does
not provide an indication of whether the function returns for the first
time or whether the initialized context was used and the execution is
resumed at just that point.  If this is necessary the user has to take
determine this herself.  This must be done carefully since the context
contains registers which might contain register variables.  This is a
good situation to define variables with @code{volatile}.

Once the context variable is initialized it can be used as is or it can
be modified.  The latter is normally done to implement co-routines or
similar constructs.  The @code{makecontext} function is what has to be
used to do that.

@comment ucontext.h
@comment SVID
@deftypefun void makecontext (ucontext_t *@var{ucp}, void (*@var{func}) (void), int @var{argc}, @dots{})
@safety{@prelim{}@mtsafe{@mtsrace{:ucp}}@assafe{}@acsafe{}}
@c Linux-only implementations mostly in assembly, nothing unsafe.

The @var{ucp} parameter passed to the @code{makecontext} shall be
initialized by a call to @code{getcontext}.  The context will be
modified to in a way so that if the context is resumed it will start by
calling the function @code{func} which gets @var{argc} integer arguments
passed.  The integer arguments which are to be passed should follow the
@var{argc} parameter in the call to @code{makecontext}.

Before the call to this function the @code{uc_stack} and @code{uc_link}
element of the @var{ucp} structure should be initialized.  The
@code{uc_stack} element describes the stack which is used for this
context.  No two contexts which are used at the same time should use the
same memory region for a stack.

The @code{uc_link} element of the object pointed to by @var{ucp} should
be a pointer to the context to be executed when the function @var{func}
returns or it should be a null pointer.  See @code{setcontext} for more
information about the exact use.
@end deftypefun

While allocating the memory for the stack one has to be careful.  Most
modern processors keep track of whether a certain memory region is
allowed to contain code which is executed or not.  Data segments and
heap memory is normally not tagged to allow this.  The result is that
programs would fail.  Examples for such code include the calling
sequences the GNU C compiler generates for calls to nested functions.
Safe ways to allocate stacks correctly include using memory on the
original threads stack or explicitly allocate memory tagged for
execution using (@pxref{Memory-mapped I/O}).

@strong{Compatibility note}: The current Unix standard is very imprecise
about the way the stack is allocated.  All implementations seem to agree
that the @code{uc_stack} element must be used but the values stored in
the elements of the @code{stack_t} value are unclear.  @Theglibc{}
and most other Unix implementations require the @code{ss_sp} value of
the @code{uc_stack} element to point to the base of the memory region
allocated for the stack and the size of the memory region is stored in
@code{ss_size}.  There are implements out there which require
@code{ss_sp} to be set to the value the stack pointer will have (which
can depending on the direction the stack grows be different).  This
difference makes the @code{makecontext} function hard to use and it
requires detection of the platform at compile time.

@comment ucontext.h
@comment SVID
@deftypefun int setcontext (const ucontext_t *@var{ucp})
@safety{@prelim{}@mtsafe{@mtsrace{:ucp}}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{}}}
@c Linux-only implementations mostly in assembly.  Some ports use
@c sigreturn or swapcontext syscalls; others restore the signal mask
@c first and then proceed restore other registers in userland, which
@c leaves a window for cancellation or async signals with misaligned or
@c otherwise corrupt stack.  ??? Switching to a different stack, or even
@c to an earlier state on the same stack, may conflict with pthread
@c cleanups.  This is not quite MT-Unsafe, it's a different kind of
@c safety issue.

The @code{setcontext} function restores the context described by
@var{ucp}.  The context is not modified and can be reused as often as
wanted.

If the context was created by @code{getcontext} execution resumes with
the registers filled with the same values and the same stack as if the
@code{getcontext} call just returned.

If the context was modified with a call to @code{makecontext} execution
continues with the function passed to @code{makecontext} which gets the
specified parameters passed.  If this function returns execution is
resumed in the context which was referenced by the @code{uc_link}
element of the context structure passed to @code{makecontext} at the
time of the call.  If @code{uc_link} was a null pointer the application
terminates normally with an exit status value of @code{EXIT_SUCCESS}
(@pxref{Program Termination}).

Since the context contains information about the stack no two threads
should use the same context at the same time.  The result in most cases
would be disastrous.

The @code{setcontext} function does not return unless an error occurred
in which case it returns @code{-1}.
@end deftypefun

The @code{setcontext} function simply replaces the current context with
the one described by the @var{ucp} parameter.  This is often useful but
there are situations where the current context has to be preserved.

@comment ucontext.h
@comment SVID
@deftypefun int swapcontext (ucontext_t *restrict @var{oucp}, const ucontext_t *restrict @var{ucp})
@safety{@prelim{}@mtsafe{@mtsrace{:oucp} @mtsrace{:ucp}}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{}}}
@c Linux-only implementations mostly in assembly.  Some ports call or
@c inline getcontext and/or setcontext, adjusting the saved context in
@c between, so we inherit the potential issues of both.

The @code{swapcontext} function is similar to @code{setcontext} but
instead of just replacing the current context the latter is first saved
in the object pointed to by @var{oucp} as if this was a call to
@code{getcontext}.  The saved context would resume after the call to
@code{swapcontext}.

Once the current context is saved the context described in @var{ucp} is
installed and execution continues as described in this context.

If @code{swapcontext} succeeds the function does not return unless the
context @var{oucp} is used without prior modification by
@code{makecontext}.  The return value in this case is @code{0}.  If the
function fails it returns @code{-1} and set @var{errno} accordingly.
@end deftypefun

@heading Example for SVID Context Handling

The easiest way to use the context handling functions is as a
replacement for @code{setjmp} and @code{longjmp}.  The context contains
on most platforms more information which might lead to less surprises
but this also means using these functions is more expensive (beside
being less portable).

@smallexample
int
random_search (int n, int (*fp) (int, ucontext_t *))
@{
  volatile int cnt = 0;
  ucontext_t uc;

  /* @r{Safe current context.}  */
  if (getcontext (&uc) < 0)
    return -1;

  /* @r{If we have not tried @var{n} times try again.}  */
  if (cnt++ < n)
    /* @r{Call the function with a new random number}
       @r{and the context}.  */
    if (fp (rand (), &uc) != 0)
      /* @r{We found what we were looking for.}  */
      return 1;

  /* @r{Not found.}  */
  return 0;
@}
@end smallexample

Using contexts in such a way enables emulating exception handling.  The
search functions passed in the @var{fp} parameter could be very large,
nested, and complex which would make it complicated (or at least would
require a lot of code) to leave the function with an error value which
has to be passed down to the caller.  By using the context it is
possible to leave the search function in one step and allow restarting
the search which also has the nice side effect that it can be
significantly faster.

Something which is harder to implement with @code{setjmp} and
@code{longjmp} is to switch temporarily to a different execution path
and then resume where execution was stopped.

@smallexample
@include swapcontext.c.texi
@end smallexample

This an example how the context functions can be used to implement
co-routines or cooperative multi-threading.  All that has to be done is
to call every once in a while @code{swapcontext} to continue running a
different context.  It is not allowed to do the context switching from
the signal handler directly since neither @code{setcontext} nor
@code{swapcontext} are functions which can be called from a signal
handler.  But setting a variable in the signal handler and checking it
in the body of the functions which are executed.  Since
@code{swapcontext} is saving the current context it is possible to have
multiple different scheduling points in the code.  Execution will always
resume where it was left.