So You Want To Write A Linux Userland

The target audience of this document is programmers with some experience
writing userland Linux programs in C or C++ who want to understand how userland
works from the bottom up.

** The init Process

The Linux kernel imposes very few requirements on its userland, which gives you,
the userland author, tremendous design flexibility. There is only one thing we
need to do: provide an initial binary at one of these paths, which the kernel
tries in order: /sbin/init, /etc/init, /bin/init, and /bin/sh. This process
(init) runs as pid 1.

The init process needs to do two things:
* Never exit (if init exits, the kernel panics)
* Reap zombie processes.

When a process exits, it becomes a "zombie" process (in state Z). A zombie
process has all of its resources deallocated, but still occupies a slot in the
process table until its parent can collect its exit status via wait(2). What if
a process dies while it has children, though? The child processes would never
have their exit statuses collected, and the process table would be overrun by
zombie processes. The Linux kernel (like other Unix-like kernels) therefore
reparents any process whose parent dies to init, and init is responsible for
reaping those processes via wait(2).
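
To make the zombie lifecycle concrete, here is a small sketch of a parent
creating a child and then reaping it. The function name spawn_and_reap is made
up for this illustration; only fork(2), _exit(2), and waitpid(2) are real
interfaces.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with `code`,
 * then reap it. Between the child's _exit() and the parent's waitpid(),
 * the child is a zombie: its resources are gone, but its process-table
 * entry still holds the exit status.
 * Returns the child's exit code, or -1 on error. */
int spawn_and_reap(int code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;		/* fork failed */
	if (pid == 0)
		_exit(code);		/* child: exit right away */

	/* Parent: collecting the status frees the zombie's table slot. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent never called waitpid(), the child would stay in state Z until the
parent itself exited and init inherited the job of reaping it.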

The simplest possible init is therefore this C program:

#include <sys/wait.h>

int main(void) {
	int status;
	/* Reap children forever; never return, or the kernel panics. */
	while (1)
		wait(&status);
}

We can build and run a Linux system with that init, and everything will "work",
but the system won't be very useful - init will sit in an endless loop of
failing wait() calls, since it has no children.

We can make our init useful like this:

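#include <sys/wait.h>
#include <unistd.h>

int main(void) {
	int status;
	char *const shell[] = { "/bin/sh", NULL };
	if (!fork()) {
		/* Child: become the shell. */
		execve(shell[0], shell, NULL);
		_exit(127);	/* only reached if execve() failed */
	}
	/* Parent (pid 1): reap children forever. */
	while (1)
		wait(&status);
}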

Now our init forks a subprocess to run a shell before entering its infinite
loop, so it has a child process which can in turn create more child processes.
In fact, for a single-user command-line system, this init is almost everything
we need! There's one small problem: this init has a very "single session per
boot" model, where if the first shell exits, the system becomes useless. We can
pretty easily fix that by having init check if the exiting process is the shell
it launched, and if so, relaunching the shell.
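
One way that fix might look is sketched below. The helper names spawn and
reap_and_respawn are invented for this example, and this is only one possible
arrangement, not the only sensible one.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec argv[0].
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn(char *const argv[])
{
	pid_t pid = fork();

	if (pid == 0) {
		execve(argv[0], argv, NULL);
		_exit(127);	/* only reached if execve() failed */
	}
	return pid;
}

/* Hypothetical helper: reap one child. If the reaped child was the
 * tracked process (or nothing is being tracked because an earlier
 * spawn failed), launch a fresh copy and return its pid. Otherwise
 * the reaped child was an orphan, and `tracked` is returned unchanged. */
pid_t reap_and_respawn(pid_t tracked, char *const argv[])
{
	int status;
	pid_t dead = wait(&status);

	if (tracked <= 0 || dead == tracked)
		return spawn(argv);
	return tracked;
}
```

An init built on this would call spawn() once with { "/bin/sh", NULL } and then
loop forever, feeding the returned pid back into reap_and_respawn().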

** Daemons and the Supervisor

Often, however, we'll want to do some work in the background that isn't just
running a shell. For example, in a graphical system, we might need to start X;
we might also need to start network daemons and other local services. We can
have our init fork off and launch all of those processes itself, too, and
restart them if they exit, but this is starting to be a lot of logic to put into
a process that previously had so few responsibilities.

A common solution here is to have a "supervisor" process, which init is
responsible for running and which is responsible for starting and watching the
state of system services. Examples of supervisor processes in common use right
now are upstart and systemd. The supervisor process usually exposes an interface
over some form of IPC as well, so that a user of the system can control the
processes it manages. In this design, the supervisor process has three
responsibilities:

* Launch system services
* Relaunch any system service that exits
* Allow an administrator to manually stop and start system services

There are a lot of possible designs for the supervisor process, but the things
it has to do are pretty clear.
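
As a rough sketch of the first two responsibilities, a supervisor could keep a
table of services and relaunch whichever one exits. The struct service layout
and the function names here are assumptions made for illustration, and the IPC
control interface is left out entirely.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical service table entry. */
struct service {
	char *const *argv;	/* command line, NULL-terminated */
	pid_t pid;		/* pid of the running instance, or -1 */
};

/* Fork and exec one service, recording its pid (-1 if fork failed). */
static void start_service(struct service *svc)
{
	pid_t pid = fork();

	if (pid == 0) {
		execve(svc->argv[0], svc->argv, NULL);
		_exit(127);	/* only reached if execve() failed */
	}
	svc->pid = pid;
}

/* Launch every service in the table. */
void start_all(struct service *svcs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		start_service(&svcs[i]);
}

/* Wait for one child to exit and relaunch the matching service.
 * Returns the index of the restarted service, or -1 if the child
 * was not in the table or wait() failed. */
int supervise_once(struct service *svcs, size_t n)
{
	int status;
	pid_t dead = wait(&status);

	if (dead <= 0)
		return -1;
	for (size_t i = 0; i < n; i++) {
		if (svcs[i].pid == dead) {
			start_service(&svcs[i]);
			return (int)i;
		}
	}
	return -1;
}
```

A real supervisor would probably also rate-limit restarts, so that a service
which crashes on startup does not turn into a fork loop.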

So right now, we have our init process doing three things as well:

* Launch the supervisor process, and restart it if it exits
* Launch a shell, and restart it if it exits
* Reap zombies

** Login

Sometimes we don't actually want to run a shell as root on the system at every
boot, and in any case we can move that capability out of init and into the
supervisor process. We can treat the ability to log in as a service (or have one
service per method of logging in, rather) - so just like sshd is a service,
"local login" is a service. On most Unix systems, this service is actually named
"login", and one copy of it is run for each virtual TTY the system has. The
login service is quite simple; all it needs to do is accept a username and
password from the console, and if that username/password pair is valid, change
user IDs (with setresuid(2)) to that user's user ID and launch their shell of
choice. The oldest design of the login program involved a file called
/etc/passwd, containing all of these values (username, hash of password, user
ID, and desired shell); newer designs have many more fancy features, like the
ability to authenticate against a network service. Still, the guts of login are
quite simple:

* Get username/password
* Look up the username in the user database
* Hash the password, and check if it matches the hash in the database
* If no, print an error message
* If yes, setresuid() to the user's uid, then execve their shell.

Note that because it uses setresuid() and needs access to the password file,
this design of login inherently requires root access.
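
The database lookup step can be sketched as a parser for one line of the
old-style file. The seven colon-separated fields below follow the conventional
passwd(5) layout; struct user_entry and parse_passwd_line are names invented
for this example, and the password check and the setresuid()/execve() steps
are deliberately left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record holding the fields login cares about. */
struct user_entry {
	char *name;
	char *hash;
	long uid;
	char *shell;
};

/* Split one "name:hash:uid:gid:gecos:home:shell" line in place.
 * The pointers in `out` point into `line`, which must stay alive.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_passwd_line(char *line, struct user_entry *out)
{
	char *fields[7];
	char *p = line;
	char *end;
	int n = 0;

	line[strcspn(line, "\n")] = '\0';	/* drop trailing newline */
	while (n < 7) {
		fields[n++] = p;
		p = strchr(p, ':');
		if (!p)
			break;
		*p++ = '\0';			/* terminate this field */
	}
	if (n != 7)
		return -1;

	out->uid = strtol(fields[2], &end, 10);
	if (end == fields[2] || *end != '\0')
		return -1;			/* uid is not a number */
	out->name = fields[0];
	out->hash = fields[1];
	out->shell = fields[6];
	return 0;
}
```

Current systems usually keep the hash in a separate root-only file,
/etc/shadow, with only a placeholder in /etc/passwd, but the lookup has the
same shape.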

** The Shell

At a minimum, a workable shell should provide the ability to execute commands as
subprocesses with arguments supplied to them. Other common shell features
include:

* Pipes
* Input and output redirection
* Background processes
* Aliases and functions
* Builtin logic and arithmetic operators

The shell is the user's main interface with a command-line system, so the usual
admonition to "do one thing and do it well" is often bent or broken in designing
shells; small additional features can be big improvements in the lives of users.
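
The minimum described above, running a command with its arguments as a
subprocess, might be sketched like this. split_args, run_command, and MAX_ARGS
are names made up for the example; there is no quoting, no pipes, and no
redirection, and execvp(3) is used so that commands are found via PATH.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64

/* Split `line` in place on whitespace and fill argv (NULL-terminated).
 * Returns the number of arguments found. */
int split_args(char *line, char *argv[MAX_ARGS])
{
	int argc = 0;
	char *tok = strtok(line, " \t\n");

	while (tok && argc < MAX_ARGS - 1) {
		argv[argc++] = tok;
		tok = strtok(NULL, " \t\n");
	}
	argv[argc] = NULL;
	return argc;
}

/* Run one command line as a subprocess and wait for it.
 * Returns its exit code (127 if it could not be executed),
 * 0 for an empty line, or -1 on error. */
int run_command(char *line)
{
	char *argv[MAX_ARGS];
	int status;
	pid_t pid;

	if (split_args(line, argv) == 0)
		return 0;		/* nothing to run */
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);		/* only reached if execvp() failed */
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell's main loop would then print a prompt, read a line, and hand it to
run_command() until it reaches end of input.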

** The C Library

The last component we're likely to really need for a working system is a C
library, commonly known as libc. It provides C function declarations
(and implementations of them) to let C programs interface with the system. At a
bare minimum, the C library needs to contain program startup code, which
arranges the stack and execution environment to conform to what C expects; see
the ABI documentation for your system for more details about that. The program
startup code is commonly written per-platform in assembly language.

Another very common feature of C libraries is C wrappers for useful system
calls. Widely-used libcs also often contain:

* Definitions of standard types (size_t, uint32_t, and so on)
* A memory allocator (malloc(3))
* String manipulation functions (strlen(3) and friends)
* Convenience functions (system(3) and similar)
* System service APIs (getpwent(3) and similar)

And so on. We can put almost anything in our C library, but bear in mind that
anything included in it should be truly useful in all or almost all C programs
in use on our system, since the C library imposes a size and memory use penalty
on every single C program we build.
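
To give a feel for how small some of these pieces can be, here is a sketch of
two of them: a strlen and a deliberately naive allocator. The my_ prefixes
avoid clashing with the host libc, and the fixed 4096-byte arena with no
free() is an assumption made to keep the example short, not how a production
malloc works.

```c
#include <stddef.h>

/* Count the bytes before the terminating NUL. */
size_t my_strlen(const char *s)
{
	const char *p = s;

	while (*p)
		p++;
	return (size_t)(p - s);
}

#define ARENA_SIZE 4096

/* Hypothetical fixed arena; a real libc would request memory from
 * the kernel with brk(2) or mmap(2) instead. */
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* Bump allocator: hand out the next 16-byte-aligned chunk of the
 * arena. Memory is never reused. Returns NULL when the arena is full. */
void *my_malloc(size_t n)
{
	size_t rounded = (n + 15) & ~(size_t)15;
	void *p;

	if (rounded < n || rounded > ARENA_SIZE - arena_used)
		return NULL;	/* overflow, or out of space */
	p = arena + arena_used;
	arena_used += rounded;
	return p;
}
```

Even this much is enough to link simple programs against, which is a good
reminder that every feature beyond it is a choice rather than a requirement.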

** Putting It Together

So here's the list of parts we need to build a working, useful Unix system:

* init
* a supervisor
* a login program
* a shell
* a C library

All of these are actually quite tractable to write, and none of them exceeds a
couple hundred lines of code in their most minimal form. On top of this base we
can build just about anything we want. Examples of other programs we might want:

* Tools for working with the filesystem (ls, mv, cp, etc.)
* An editor
* An assembler, so we can start developing on our system
* A remote-login system that listens for network connections and invokes login

Go forth and Unix! :)

$#t Writing a Unix Userland
$#s What the main components of a rudimentary Unix userland are and how
$#s to write them in C.
$#o unix, programming, c
$#u e6cb57aa-5e41-4ce6-8ab5-8a61206ae6ae