about summary refs log tree commit diff
path: root/doc/notifywhenup.html
blob: 6bc8b99e74578a0bb58c9f0a963d9080480306c9 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>s6: service startup notifications</title>
    <meta name="Description" content="s6: service startup notifications" />
    <meta name="Keywords" content="s6 ftrig notification notifier writer libftrigw ftrigw startup U up svwait s6-svwait" />
    <!-- <link rel="stylesheet" type="text/css" href="http://skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="http://skarnet.org/software/">Software</a><br />
<a href="http://skarnet.org/">skarnet.org</a>
</p>

<h1> Service startup notifications </h1>

<p>
 It is easy for a process supervision suite to know when a service that was <em>up</em>
is now <em>down</em>: the long-lived process implementing the service is dead. The
supervisor, running as the daemon's parent, is instantly notified via a SIGCHLD.
When it happens, <a href="s6-supervise.html">s6-supervise</a> sends a 'd' event
to its <tt>./event</tt> <a href="fifodir.html">fifodir</a>, so every subscriber
knows that the service is down. All is well.
</p>

<p>
 It is much trickier for a process supervision suite to know when a service
that was <em>down</em> is now <em>up</em>. The supervisor forks and execs the
daemon, and knows when the exec has succeeded; but after that point, it's all
up to the daemon itself. Some daemons do a lot of initialization work before
they're actually ready to serve, and it is impossible for the supervisor to
know exactly <em>when</em> the service is really ready.
<a href="s6-supervise.html">s6-supervise</a> sends a 'u' event to its
<tt>./event</tt> <a href="fifodir.html">fifodir</a> when it successfully
spawns the daemon, but any subscriber
reacting to 'u' is subject to a race condition - the service provided by the
daemon may not be ready yet.
</p>

<p>
 Reliable startup notifications need support from the daemons themselves.
Daemons should do two things to signal the outside world that they are
ready:
</p>

<ol>
 <li> Update a state file, so other processes can get a snapshot
of the daemon's state </li>
 <li> Send an event to processes waiting for a state change. </li>
</ol>

<p>
 This is complex to implement in every single daemon, so s6 provides
tools to make it easier for daemon authors, without any need to link
against the s6 library or use any s6-specific construct:
 daemons can simply write a line to a file descriptor of their choice,
then close that file descriptor, when they're ready to serve. This is
a generic mechanism that some daemons already implement.
</p>

<p>
 s6 supports that mechanism natively: when the
<a href="servicedir.html">service directory</a> for the daemon contains
a valid <tt>notification-fd</tt> file, the daemon's supervisor, i.e. the
<a href="s6-supervise.html">s6-supervise</a> program, will properly catch
the daemon's message, update the status file (<tt>supervise/status</tt>),
then notify all the subscribers
with a <tt>'U'</tt> event, meaning that the service is now up and ready.
</p>

<p>
 This method should really be implemented in every long-running
program providing a service. When it is not the case, it's impossible
to provide reliable startup notifications, and subscribers should then
be content with the unreliable <tt>'u'</tt> events provided by s6-supervise.
</p>

</body>
</html>