So You Want To Write A Linux Userland

The target audience of this document is programmers with some experience writing userland Linux programs in C or C++ who want to understand how userland works from the bottom up.

** The init Process

The Linux kernel imposes very few requirements on its userland, which gives you, the userland author, tremendous design flexibility. There is only one thing we need to do: provide an initial binary to launch at any of these paths, which the kernel tries in order: /sbin/init, /etc/init, /bin/init, or /bin/sh. This process (init) is run as pid 1. The init process needs to do two things:

* Never exit (if init exits, the kernel panics)
* Reap zombie processes

When a process exits, it becomes a "zombie" process (in state Z). A zombie process has all of its resources deallocated, but still occupies a slot in the process table until its parent can collect its exit status via wait(2). What if a process dies while it has children, though? The child processes would never have their exit statuses collected, and the process table would be overrun by zombie processes. The Linux kernel (like other Unix-like kernels) therefore reparents any process whose parent dies to init, and init is responsible for reaping those processes via wait(2). The simplest possible init is therefore this C program:

    #include <sys/wait.h>

    int main(void)
    {
        int status;

        /* Reap children forever; pid 1 must never exit. */
        while (1)
            wait(&status);
    }

We can build and run a Linux system with that init, and everything will "work", but the system won't be very useful - init will sit in an endless loop of failing wait() calls, since it has no children. We can make our init useful like this:

    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int status;
        char *shell[] = { "/bin/sh", NULL };

        /* Child: become the shell; bail out if the exec fails. */
        if (!fork()) {
            execve(shell[0], shell, NULL);
            _exit(127);
        }

        /* Parent: reap children forever. */
        while (1)
            wait(&status);
    }

Now our init forks a subprocess to run a shell before entering its infinite loop, so it has a child process which can in turn create more child processes. In fact, for a single-user command-line system, this init is almost everything we need!
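To make the zombie state concrete, here's a minimal sketch (not part of init itself) that forks a child, lets it exit, and looks at its state in /proc before reaping it. It assumes procfs is mounted at /proc; proc_state and observe_zombie are hypothetical helper names made up for this example. waitid(2) with WNOWAIT blocks until the child has exited but leaves it unreaped, so the check doesn't race with the child.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat
 * ('R', 'S', 'Z', ...), or '?' if it cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], line[512], *p;
    char state = '?';
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return '?';
    if (fgets(line, sizeof line, f)) {
        /* The command name sits in parentheses and may contain spaces,
         * so locate the last ')'; the state letter follows it. */
        p = strrchr(line, ')');
        if (p && p[1] == ' ' && p[2])
            state = p[2];
    }
    fclose(f);
    return state;
}

/* Hypothetical helper: fork a child that exits at once with exit_code,
 * report the state it is in before being reaped, then reap it.
 * The collected exit code is stored in *reaped_code (-1 on failure). */
char observe_zombie(int exit_code, int *reaped_code)
{
    siginfo_t info;
    int status;
    char state;
    pid_t pid = fork();

    *reaped_code = -1;
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(exit_code);

    /* Block until the child has exited, but do not reap it yet. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return '?';
    state = proc_state(pid);

    /* Collect the exit status; this frees the process table slot. */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        *reaped_code = WEXITSTATUS(status);
    return state;
}
```

On a typical Linux system, observe_zombie(7, &code) should report 'Z' and leave 7 in code; once waitpid() returns, the child's /proc entry is gone.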
There's one small problem: this init has a very "single session per boot" model, where if the first shell exits, the system becomes useless. We can pretty easily fix that by having init check if the exiting process is the shell it launched, and if so, relaunching the shell.

** Daemons and the Supervisor

Often, however, we'll want to do some work in the background that isn't just running a shell. For example, in a graphical system, we might need to start X; we might also need to start network daemons and other local services. We can have our init fork off and launch all of those processes itself, too, and restart them if they exit, but this is starting to be a lot of logic to put into a process that previously had so few responsibilities.

A common solution here is to have a "supervisor" process, which init is responsible for running and which is responsible for starting and watching the state of system services. Examples of supervisor processes in common use right now are upstart and systemd. The supervisor process usually exposes an interface over some form of IPC as well, so that a user of the system can control the processes it manages. In this design, the supervisor process has three responsibilities:

* Launch system services
* Relaunch any system service that exits
* Allow an administrator to manually stop and start system services

There are a lot of possible designs for the supervisor process, but the things it has to do are pretty clear. So right now, we have our init service doing three things as well:

* Launch the supervisor process, and restart it if it exits
* Launch a shell, and restart it if it exits
* Reap zombies

** Login

Sometimes we don't actually want to run a shell as root on the system at every boot, and in any case we can move that capability out of init and into the supervisor process.
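Whichever process ends up owning it, the launch-and-relaunch logic looks the same: remember each child's pid, block in wait(2), and start the program again when its pid comes back. Here's a minimal sketch of that loop under some assumptions: struct service, launch, and supervise are names invented for this example, and the max_restarts cutoff exists only so the sketch terminates (a real init or supervisor would loop forever).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one supervised program. */
struct service {
    char *const *argv;  /* program path and arguments, NULL-terminated */
    pid_t pid;          /* pid of the running instance, or -1 */
    int starts;         /* how many times it has been launched */
};

/* Fork and exec one service, recording the child's pid.
 * Returns 0 on success, -1 if fork() failed. */
static int launch(struct service *s)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(s->argv[0], s->argv);
        _exit(127);  /* only reached if the exec failed */
    }
    s->pid = pid;
    s->starts++;
    return 0;
}

/* Launch every service, then relaunch whichever one exits.
 * Stops after max_restarts relaunches and returns how many were done;
 * the last instance launched is left running. */
int supervise(struct service *svcs, size_t n, int max_restarts)
{
    int restarts = 0, status;
    size_t i;

    for (i = 0; i < n; i++) {
        svcs[i].pid = -1;
        svcs[i].starts = 0;
        launch(&svcs[i]);
    }

    while (restarts < max_restarts) {
        pid_t dead = wait(&status);

        if (dead < 0)
            break;  /* no children left to wait for */
        for (i = 0; i < n; i++) {
            if (svcs[i].pid == dead) {
                svcs[i].pid = -1;
                if (launch(&svcs[i]) == 0)
                    restarts++;
                break;
            }
        }
        /* A pid that is not in the table is a reparented orphan;
         * the wait() above already reaped it, so nothing more to do. */
    }
    return restarts;
}
```

With a one-entry table whose argv is { "/bin/sh", NULL }, this behaves like the "restart the shell" init; adding entries turns the same loop into the core of a supervisor. The IPC control interface is left out entirely.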
We can treat the ability to log in as a service (or have one service per method of logging in, rather) - so just like sshd is a service, "local login" is a service. On most Unix systems, this service is actually named "login", and one copy of it is run for each virtual TTY the system has.

The login service is quite simple; all it needs to do is accept a username and password from the console, and if that username/password pair is valid, change user IDs (with setresuid(2)) to that user's user ID and launch their shell of choice. The oldest design of the login program involved a file called /etc/passwd, containing all of these values (username, hash of password, user ID, and desired shell); newer designs have many more fancy features, like the ability to authenticate against a network service. Still, the guts of login are quite simple:

* Get username/password
* Look up the username in the user database
* Hash the password, and check if it matches the hash in the database
* If no, print an error message
* If yes, setresuid() to the user's uid, then execve their shell.

Note that because it uses setresuid() and needs access to the password file, this design of login inherently requires root access.

** The Shell

At a minimum, a workable shell should provide the ability to execute commands as subprocesses with arguments supplied to them. Other common shell features include:

* Pipes
* Input and output redirection
* Background processes
* Aliases and functions
* Builtin logic and arithmetic operators

The shell is the user's main interface with a command-line system, so the usual admonition to "do one thing and do it well" is often bent or broken in designing shells; small additional features can be big improvements in the lives of users.

** The C Library

The last component we're likely to really need for a working system is a C library, commonly known as libc.
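Before digging into libc, it's worth seeing how little code the shell's minimum requirement takes, and how much of that code leans on libc. This is a minimal sketch, assuming arguments are separated by plain whitespace with no quoting, pipes, or redirection; split_args, run_command, and MAX_ARGS are hypothetical names made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64

/* Split line in place on spaces, tabs, and newlines. Fills argv with
 * pointers into line, NULL-terminates it, and returns the argument
 * count (at most max_args - 1). */
int split_args(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *save = NULL;
    char *tok = strtok_r(line, " \t\n", &save);

    while (tok && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok_r(NULL, " \t\n", &save);
    }
    argv[argc] = NULL;
    return argc;
}

/* Run one command line as a subprocess and wait for it to finish.
 * Returns the command's exit status, 0 for an empty line, or -1 if
 * the fork or wait failed or the command was killed by a signal. */
int run_command(char *line)
{
    char *argv[MAX_ARGS];
    int status;
    pid_t pid;

    if (split_args(line, argv, MAX_ARGS) == 0)
        return 0;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);  /* searches PATH for the command */
        _exit(127);             /* conventional "command not found" */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Wrapping run_command() in a loop that prints a prompt and reads a line with fgets() gives a usable, if spartan, shell. Everything on the feature list above is built on top of this same fork/exec/wait core.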
The C library provides C function declarations (and implementations of them) to let C programs interface with the system. At a bare minimum, the C library needs to contain program startup code, which arranges the stack and execution environment to conform to what C expects; see the ABI documentation for your system for more details about that. The program startup code is commonly written per-platform in assembly language. Another very common feature of C libraries is C wrappers for useful system calls. Widely-used libcs also often contain:

* Definitions of standard types (size_t, uint32_t, and so on)
* A memory allocator (malloc(3))
* String manipulation functions (strlen(3) and friends)
* Convenience functions (system(3) and similar)
* System service APIs (getpwent(3) and similar)

And so on. We can put almost anything in our C library, but bear in mind that anything included in it should be truly useful in all or almost all C programs in use on our system, since the C library imposes a size and memory use penalty on every single C program we build.

** Putting It Together

So here's the list of parts we need to build a working, useful Unix system:

* init
* a supervisor
* a login program
* a shell
* a C library

All of these are actually quite tractable to write, and none of them exceeds a couple hundred lines of code in its most minimal form. On top of this base we can build just about anything we want. Examples of other programs we might want:

* Tools for working with the filesystem (ls, mv, cp, etc)
* An editor
* An assembler, so we can start developing on our system
* A remote-login system that listens for network connections and invokes login

Go forth and Unix! :)