<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>skalibs: the selfpipe library interface</title>
    <meta name="Description" content="skalibs: the selfpipe library interface" />
    <meta name="Keywords" content="skalibs stddjb libstddjb selfpipe self-pipe library interface" />
    <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">libstddjb</a><br />
<a href="../libskarnet.html">libskarnet</a><br />
<a href="../index.html">skalibs</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> The <tt>selfpipe</tt> library interface </h1>

<p>
 The selfpipe functions are declared in the
<tt>skalibs/selfpipe.h</tt> header and implemented in the <tt>libskarnet.a</tt>
or <tt>libskarnet.so</tt> library.
</p>

<h2> What does it do&nbsp;? </h2>

<p>
Signal handlers suck.
</p>

<p>
They do. I don't care how experienced you are with C/Unix programming,
they do. You can be Ken Thompson: if you use signal handlers as a
regular part of your C programming model, you <em>are</em> going to
screw up and write buggy code.
</p>

<p>
 Unix is tricky enough with interruptions. Even when you have a single
thread, signals can make the execution flow very non-intuitive.
They mess up the logic of linear, structured code and
introduce non-determinism: you always have to think "and what
if I get interrupted here and the flow goes into a handler...". This
is annoying.
</p>

<p>
 Moreover, signal handler code is <em>very</em> limited in what it can
do. It can't use any non-reentrant function! If you call a non-reentrant
function, and by chance you were executing that very function
when you got interrupted by a signal... you lose. That means no
malloc(). No buffered IO. No globals. The list goes on and on. <br />
 If you're going to catch signals, you'll want to handle them <em>outside</em>
the signal handler. You actually want to spend <em>the least possible
time</em> inside a signal handler - just enough to notify your main
execution flow that there's a signal to take care of.
</p>

<p>
 And, of course, signal handlers don't mix with event loops, which is
a classic source of headaches for programmers and has led to the birth of
abominations such as
<a href="https://www.opengroup.org/onlinepubs/009695399/functions/pselect.html">
pselect</a>. So much for the "everything is a file" concept that Unix was
built on.
</p>

<p>
 A signal should be an event like any other.
There should be a unified interface - receiving a signal should make some
fd readable or something.
</p>

<p>
 And that's exactly what the
<a href="https://cr.yp.to/docs/selfpipe.html">self-pipe trick</a>, invented
by <a href="../djblegacy.html">DJB</a>, does.
</p>

<p>
 As long as you're in some kind of event loop, the self-pipe trick allows
you to forget about signal handlers... <em>forever</em>. It works this way:
</p>

<ol>
 <li> Create a pipe <tt>p</tt>. Make both ends close-on-exec and nonblocking. </li>
 <li> Write a tiny signal handler ("top half") for all the signals you want to
catch. This
signal handler should just write one byte into <tt>p[1]</tt>, and do nothing
more; ideally, the written byte identifies the signal. </li>
 <li> In your event loop, add <tt>p[0]</tt> to the list of fds you're watching
for readability. </li>
</ol>

<p>
 When you get a signal, a byte will be written to the self-pipe, and your
execution flow will resume. When you next go through the event loop,
<tt>p[0]</tt> will be readable; you'll then be able to read a byte from
it, identify the signal, and handle it - in your unrestricted main
environment (the "bottom half" of the handler).
</p>
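<p>
 The three steps above can be sketched by hand in plain POSIX C. This is
only an illustration of the trick, not the skalibs implementation: the
<tt>demo_selfpipe_*</tt> names and the <tt>top_half</tt> handler are
made up for this example.
</p>

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() and friends */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The self-pipe: sp[0] is watched by the event loop, sp[1] is written by the handler. */
static int sp[2] = { -1, -1 } ;

/* Sets close-on-exec and nonblocking on fd. Returns 0 on success, -1 on failure. */
static int cloexec_nonblock (int fd)
{
  int fl = fcntl(fd, F_GETFD) ;
  if (fl == -1 || fcntl(fd, F_SETFD, fl | FD_CLOEXEC) == -1) return -1 ;
  fl = fcntl(fd, F_GETFL) ;
  if (fl == -1 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1) return -1 ;
  return 0 ;
}

/* Top half: runs in signal context, so it only calls write(),
   which is async-signal-safe, and preserves errno. */
static void top_half (int sig)
{
  int e = errno ;
  unsigned char c = (unsigned char)sig ;
  if (write(sp[1], &c, 1) < 0) { /* pipe full: this notification is dropped */ }
  errno = e ;
}

/* Step 1: create the pipe. Returns the fd to watch, or -1 on failure. */
int demo_selfpipe_init (void)
{
  if (pipe(sp) == -1) return -1 ;
  if (cloexec_nonblock(sp[0]) == -1 || cloexec_nonblock(sp[1]) == -1)
  {
    close(sp[0]) ; close(sp[1]) ;
    sp[0] = sp[1] = -1 ;
    return -1 ;
  }
  return sp[0] ;
}

/* Step 2: install the top half for one signal. Returns 0 on success, -1 on failure. */
int demo_selfpipe_trap (int sig)
{
  struct sigaction sa ;
  memset(&sa, 0, sizeof sa) ;
  sa.sa_handler = top_half ;
  sa.sa_flags = SA_RESTART ;
  sigfillset(&sa.sa_mask) ;  /* no other signal interrupts the handler */
  return sigaction(sig, &sa, 0) ;
}

/* Step 3: wait up to timeout_ms for the pipe to become readable, then read one byte.
   Returns the signal number, 0 if nothing arrived in time, -1 on error. */
int demo_selfpipe_wait (int timeout_ms)
{
  struct pollfd x = { .fd = sp[0], .events = POLLIN } ;
  unsigned char c ;
  ssize_t n ;
  int r ;
  do r = poll(&x, 1, timeout_ms) ;
  while (r == -1 && errno == EINTR) ;
  if (r <= 0) return r ;
  n = read(sp[0], &c, 1) ;
  if (n == 1) return c ;
  return (n == -1 && errno != EAGAIN) ? -1 : 0 ;
}
```

<p>
 A caller would run <tt>demo_selfpipe_init()</tt> once, call
<tt>demo_selfpipe_trap()</tt> for each signal of interest, then treat a
positive return from <tt>demo_selfpipe_wait()</tt> as "this signal
arrived" and handle it in normal context. A real event loop would poll
the returned fd together with its other descriptors instead of polling
it alone.
</p>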

<p>
 The selfpipe library does it all for you - you don't even have to write
the top half yourself. You can forget that signal handlers exist and recover
some peace of mind.
</p>

<p>
 Note that in an asynchronous event loop, you need to protect your
system calls against EINTR by using <a href="safewrappers.html">safe
wrappers</a>.
</p>
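<p>
 To show what protecting a system call against EINTR means, here is a
minimal sketch of a retrying <tt>read()</tt>. The name
<tt>retry_read</tt> is hypothetical and is not part of skalibs; the
safe wrappers linked above are the real thing.
</p>

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: behaves like read(), but restarts the call
   whenever a signal interrupted it before any data was transferred. */
ssize_t retry_read (int fd, void *buf, size_t len)
{
  ssize_t r ;
  do r = read(fd, buf, len) ;
  while (r == -1 && errno == EINTR) ;
  return r ;
}
```

<p>
 The same loop shape applies to any call that can fail with EINTR;
every other error, and every successful result, is passed through
unchanged.
</p>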

<h2> How do I use it&nbsp;? </h2>

<h3> Starting </h3>

<pre>
int fd = selfpipe_init() ;
</pre>

<p>
<tt>selfpipe_init()</tt> sets up a selfpipe. You must use that
function first. <br />
If <tt>fd</tt> is -1, then an error occurred. Else <tt>fd</tt> is a
non-blocking descriptor that can be used in your event loop. It will
be selected for readability when you've caught a signal.
</p>

<h3> Trapping signals </h3>

<pre>
int r = selfpipe_trap(SIGTERM) ;
</pre>

<p>
<tt>selfpipe_trap()</tt> catches a signal and sends it to the selfpipe.
Uncaught signals won't trigger the selfpipe. <tt>r</tt> is 1 if
the operation succeeded, and 0 if it failed. If it succeeded, you
can forget about the trapped signal entirely. <br />
In our example, if <tt>r</tt> is 1, then a SIGTERM will instantly
trigger readability on <tt>fd</tt>.
</p>

<pre>
int r ;
sigset_t set ;
sigemptyset(&amp;set) ;
sigaddset(&amp;set, SIGTERM) ;
sigaddset(&amp;set, SIGHUP) ;
r = selfpipe_trapset(&amp;set) ;
</pre>

<p>
<tt>selfpipe_trap()</tt> handles signals one
by one. Alternatively (and often preferably), you can use
<tt>selfpipe_trapset()</tt> to directly handle signal sets. When you call
<tt>selfpipe_trapset()</tt>, signals that are present in <tt>set</tt> will
be caught by the selfpipe, and signals that are absent from <tt>set</tt>
will be uncaught. <tt>r</tt> is 1 if the operation succeeded and 0 if it
failed.
</p>

<h3> Handling events </h3>

<pre>
int c = selfpipe_read() ;
</pre>

<p>
 Call <tt>selfpipe_read()</tt> when your <tt>fd</tt> is readable.
That's where you write your <em>real</em> signal handler: in the
body of your event loop, in a "normal" context. <br />
<tt>c</tt> is -1 if an error occurred - in which case chances are
it's a serious one and your system has become very unstable.
<tt>c</tt> is 0 if there are no more pending signals. If <tt>c</tt>
is positive, it is the number of the signal that was caught.
</p>

<h3> Accessing the selfpipe </h3>

<pre>
int fd = selfpipe_fd() ;
</pre>

<p>
 Sometimes you need to access the fd of the selfpipe in two
very distinct translation units (typically to poll on it), and you
rightly don't want to add a global variable to store it, especially
since it's already stored in a global internal variable in skalibs.
No need to bloat your binary anymore: <tt>selfpipe_fd()</tt> will
now retrieve the value for you, wherever you are.
</p>

<h3> Finishing </h3>

<pre>
selfpipe_finish() ;
</pre>

<p>
 Call <tt>selfpipe_finish()</tt> when you're done using the selfpipe.
Signal handlers will be restored to SIG_DFL, i.e. signals will not
be trapped anymore.
</p>

<h2> Any limitations&nbsp;? </h2>

<p>
 Some, as always.
</p>

<ul>
 <li> The selfpipe library uses a global pipe,
so it is theoretically not thread-safe. However, as long as you dedicate
one thread to signal handling and block signals in all the other threads
(see <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_sigmask.html">pthread_sigmask()</a>)
then you should be able to use the selfpipe in the thread that handles
signals without trouble. Since reading the selfpipe involves waiting for
a file descriptor to become readable, it is recommended to do this in a
thread that will already have a regular input/output loop (via
<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html">poll()</a>
or <a href="iopause.html">iopause()</a>) so you can just add the selfpipe
to the list of fds you're reading on. </li>
 <li> In rare cases, the self-pipe can theoretically be filled, if some
application sends more signals than the pipe can hold before you have time to
<tt>selfpipe_read()</tt>. In practice a pipe holds at least PIPE_BUF bytes:
4096 on Linux, and never less than the 512 that POSIX requires,
so it's a very acceptable margin. Unless your code is waiting where
it should not be, only malicious applications will fill the self-pipe
- and malicious applications could just send you a SIGKILL and be done
with you, so this is not a concern. Protect yourself from malicious
applications with clever use of uids. </li>
</ul>
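<p>
 If you are curious how much room a pipe gives you on a given system, the
following sketch measures it: it writes single bytes, the way a top half
would, into a nonblocking pipe until the kernel refuses. The function
name <tt>count_pipe_capacity</tt> is made up for this example and has
nothing to do with the skalibs API.
</p>

```c
#define _POSIX_C_SOURCE 200809L  /* makes <limits.h> expose PIPE_BUF */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>

/* Writes one byte at a time into a fresh nonblocking pipe until write()
   fails with EAGAIN. Returns the number of bytes accepted, or -1 on error. */
long count_pipe_capacity (void)
{
  int p[2] ;
  long n = 0 ;
  unsigned char c = 1 ;
  int fl ;
  if (pipe(p) == -1) return -1 ;
  fl = fcntl(p[1], F_GETFL) ;
  if (fl == -1 || fcntl(p[1], F_SETFL, fl | O_NONBLOCK) == -1) n = -1 ;
  else
  {
    while (write(p[1], &c, 1) == 1) n++ ;
    if (errno != EAGAIN) n = -1 ;  /* anything but "pipe full" is a real error */
  }
  close(p[0]) ;
  close(p[1]) ;
  return n ;
}
```

<p>
 Comparing the result with <tt>PIPE_BUF</tt> from <tt>limits.h</tt> tells
you how many unread signal notifications can pile up before a handler's
write starts failing and notifications get lost.
</p>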

<h2> Hey, Linux has <a href="https://man7.org/linux/man-pages/man2/signalfd.2.html">signalfd()</a> for this&nbsp;! </h2>

<p>
 Yes, the Linux team loves to gratuitously add new system calls to do
things that could already be done before without much effort. This
adds API complexity, which is not a sign of good engineering.
</p>

<p>
 However, now that <tt>signalfd()</tt> exists, it is indeed marginally more
efficient than a pipe, and it saves one fd: so the selfpipe library
is implemented via <tt>signalfd()</tt> when this call is available.
</p>
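<p>
 For the curious, here is roughly what using <tt>signalfd()</tt> directly
looks like. This is a Linux-only sketch, not a copy of what skalibs does
internally, and the <tt>demo_signalfd_*</tt> names are made up for this
example. Note that the signal must be blocked first, so that it stays
pending for the descriptor instead of being delivered the normal way.
</p>

```c
#define _POSIX_C_SOURCE 200809L  /* for sigprocmask() and friends */
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Blocks sig for normal delivery and returns a nonblocking, close-on-exec
   descriptor that becomes readable when sig is pending; -1 on failure. */
int demo_signalfd_open (int sig)
{
  sigset_t set ;
  sigemptyset(&set) ;
  sigaddset(&set, sig) ;
  if (sigprocmask(SIG_BLOCK, &set, 0) == -1) return -1 ;
  return signalfd(-1, &set, SFD_NONBLOCK | SFD_CLOEXEC) ;
}

/* Reads one pending signal from fd. Returns its number,
   0 if none is pending, -1 on error. */
int demo_signalfd_read (int fd)
{
  struct signalfd_siginfo si ;
  ssize_t r = read(fd, &si, sizeof si) ;
  if (r == (ssize_t)sizeof si) return (int)si.ssi_signo ;
  return (r == -1 && errno == EAGAIN) ? 0 : -1 ;
}
```

<p>
 No handler is installed at all here: the kernel queues the signal and
the descriptor reports it, so there is no pipe and only one fd to watch.
</p>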

</body>
</html>