Open jobs for finishing GNU libc:
---------------------------------
Status: December 1998
If you have time and talent to take over any of the jobs below please
contact <bug-glibc@gnu.org>.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 1] Port to new platforms or test current version on formerly supported
platforms.
**** See http://www.gnu.org/software/libc/porting.html for more details.
[ 2] Test compliance with standards. If you have access to recent
standards (IEEE, ISO, ANSI, X/Open, ...) and/or test suites you
could do some checks as the goal is to be compliant with all
standards if they do not contradict each other.
[ 3] The IMHO opinion most important task is to write a more complete
test suite. We cannot get too many people working on this. It is
not difficult to write a test, find a definition of the function
which I normally can provide, if necessary, and start writing tests
to test for compliance. Beside this, take a look at the sources
and write tests which in total test as many paths of execution as
possible.
[ 4] Write translations for the GNU libc message for the so far
unsupported languages. GNU libc is fully internationalized and
users can immediately benefit from this.
Take a look at the matrix in
ftp://ftp.gnu.org/pub/gnu/ABOUT-NLS
for the current status (of course better use a mirror of ftp.gnu.org).
[ 6] Write `long double' versions of the math functions. This should be
done in collaboration with the NetBSD and FreeBSD people.
The libm is in fact fdlibm (not the same as in Linux libc 5).
**** Partly done. But we need someone with numerical experiences for
the rest.
[ 7] Several math functions have to be written:
- exp2
with long double arguments.
Beside this most of the complex math functions which are new in
ISO C 9X should be improved. Writing some of them in assembler is
useful to exploit the parallelism which often is available.
[ 8] If you enjoy assembler programming (as I do --drepper :-) you might
be interested in writing optimized versions for some functions.
Especially the string handling functions can be optimized a lot.
Take a look at
Faster String Functions
Henry Spencer, University of Toronto
Usenix Winter '92, pp. 419--428
or just ask. Currently mostly i?86 and Alpha optimized versions
exist. Please ask before working on this to avoid duplicate
work.
[10] Extend regex and/or rx to work with wide characters and complete
implementation of character class and collation class handling.
It is planned to do a complete rewrite.
[11] Write access function for netmasks, bootparams, and automount
databases for nss_files and nss_db module.
The functions should be embedded in the nss scheme. This is not
hard and not all services must be supported at once.
[14] We need to write a library for on-the-fly transformation of streams
of text. In fact, this would be a recode-library (you know, GNU recode).
This is needed in several places in the GNU libc and I already have
rather concrete plans but so far no possibility to start this.
*** The library is available, now it remains to be used in the streams.
[15] Cleaning up the header files. Ideally, each header style should
follow the "good examples". Each variable and function should have
a short description of the function and its parameters. The prototypes
should always contain variable names which can help to identify their
meaning; better than
int foo __P ((int, int, int, int));
Blargh!
[16] The libio stream file functions should be extended in a way to use
mmap to map the file and use it as the buffer to user sees. For
read-only streams this should be rather easy and it avoids all read()
calls.
A more sophisticated solution would use mmap also for writing. The
standards do not demand that the file on the disk is always in the
correct form so it would be possible to enlarge it always according
to the page size and install the correct length only for fclose() and
fflush() calls.
[18] Based on the sprof program we need tools to analyze the output. The
result should be a link map which specifies in which order the .o
files are placed in the shared object. This should help to improve
code locality and result in a smaller foorprint (in code and data
memory) since less pages are only used in small parts.
[19] A user-level STREAMS implementation should be available if the
kernel does not provide the support.
[20] More conversion modules for iconv(3). Existing modules should be
extended to do things like transliteration if this is wanted.
For often used conversion a direct conversion function should be
available.
[21] The nscd program and the stubs in the libc should be changed so
that each program uses only one socket connect. Take a look at
http://www.cygnus.com/~drepper/nscd.html
An alternative approach is to use an mmap()ed file. The idea is
the following:
- the nscd creates the hash tables and the information it stores
in it in a mmap()ed region. This means no pointers must be
used, only offsets.
- each program using NSS functionality tries to open the file
with the data.
- by checking some timestamp (which the nscd renew frequently)
the programs can test whether the file is still valid
- if the file is valid look through the nscd and locate the
appropriate hash table for the database and lookup the data.
If it is included we are set.
- if the data is not yet in the database we contact the nscd using
the currently implemented methods.
[22] It should be possible to have the information gconv-modules in
a simple database which is faster to access. Using libdb is probably
overkill and loading it would probably be slower than reading the
plain text file. But a file format with a simple hash table and
some data it points to should be fine. Probably it should be
two tables, one for the aliases, one for the mappings. The code
should start similar to this:
if (stat ("gconv-modules", &stp) == 0
&& stat ("gconv-modules.db", &std) == 0
&& stp.st_mtime < std.st_mtime)
{
... use the database ...
{
else
{
... use the plain file if it exists, otherwise the db ...
}
[23] The `strptime' function needs to be completed. This includes among
other things that it must get teached about timezones. The solution
envisioned is to extract the timezones from the ADO timezone
specifications. Special care must be given names which are used
multiple times. Here the precedence should (probably) be according
to the geograhical distance. E.g., the timezone EST should be
treated as the `Eastern Australia Time' instead of the US `Eastern
Standard Time' if the current TZ variable is set to, say,
Australia/Canberra or if the current locale is en_AU.
[25] Sun's nscd version implements a feature where the nscd keeps N entries
for each database current. I.e., if an entries lifespan is over and
it is one of the N entries to be kept the nscd updates the information
instead of removing the entry.
How to decide about which N entries to keep has to be examined.
Factors should be number of uses (of course), influenced by aging.
Just imagine a computer used by several people. The IDs of the current
user should be preferred even if the last user spent more time.
[26] Improve the AIO implementation so that threads do not immediately
terminate if no more requests are available. Let them sleep for a
while and wake them up on demand. If after a while no request arrived
they really can die.