# Struct Member Naming History

If you've read the man page for stat(2), you've probably noticed something a
bit odd about the fields of struct stat: they are all prefixed with st_, so the
struct looks like this:

        struct stat {
                dev_t     st_dev;
                ino_t     st_ino;
                mode_t    st_mode;
                /* ... and so on */
        };

When I first saw this, I assumed it was a sort of Hungarian-notation-like
reminder of the type of struct you're working with - seeing something like

        s->st_mode

instead of just

        s->mode

would remind the reader that s is a struct stat. That seems a little thin,
though: Unix coding style is generally pretty terse, so it seems weird to put a
redundant prefix on every member name. Still, the pattern keeps cropping up:

        struct timeval {
                time_t tv_sec;
                suseconds_t tv_usec;
        };

        struct addrinfo {
                int ai_flags;
                int ai_family;
                /* ... */
        };

What gives? I think the actual answer has to do with a bit of old C history.

## Archaic Structs

Modern C structs define types, and within each type there is a separate
namespace of struct members, which means one can do this:

        struct foo {
                int id;
                /* ... other stuff ... */
        };

        struct bar {
                int id;
                /* ... other stuff ... */
        };

For any given expression, the compiler uses the type of x to work out what
x->id means, even though there are multiple definitions of id. However, old C
did not work that way - pointer types would convert more freely as needed. I
started wondering whether pointers to non-struct types would convert to struct
pointers if they had to.
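
To make the modern behavior concrete, here is a minimal sketch. The struct
layouts and function names are made up for illustration; the point is only that
the same member name can sit at different offsets in different structs, and the
type of the pointer decides which one is meant.

```c
#include <assert.h>

/* Hypothetical types for illustration; both have a member named id. */
struct foo {
        int  id;
        char tag;
};

struct bar {
        char tag;
        int  id;        /* same name as foo's id, but at a different offset */
};

/* The compiler picks the member from the static type of x. */
int foo_id(const struct foo *x) { return x->id; }
int bar_id(const struct bar *x) { return x->id; }

/* Returns 1 if each lookup reached the right member. */
int separate_namespaces_demo(void)
{
        struct foo f = { 1, 'f' };
        struct bar b = { 'b', 2 };

        assert(foo_id(&f) == 1);
        assert(bar_id(&b) == 2);
        return foo_id(&f) == 1 && bar_id(&b) == 2;
}
```

Calling separate_namespaces_demo() from main should return 1 on any conforming
modern compiler.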

Look at this example code from Unix Version 3, /ken/alloc.c:

        int *cp, *bp;
        int i;

        bp = bread(ROOTDEV, 1);
        cp = getblk(NODEV);
        if(u.u_error)
                panic("iinit");
        for(i=0; i<512; i++)
                cp->b_addr[i] = bp->b_addr[i];

This looks like nonsense - we're using int pointers as though they were struct
pointers. For that to work, they'd have to be implicitly converted to struct
pointers, with nothing but the member name b_addr to say which struct is meant.

Another way of interpreting this, if you like, is that current C compilers keep
one table per struct type, which contains the offset and type of each member. It
seems like early C compilers behaved as though there was one global struct
member table instead, and the -> operator simply always used that table without
caring what the type of its left-hand side was. It's slick, and easy to
implement in a compiler with no type information at all about values (!), which
is something that early C versions only partially had.
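
Here is a rough sketch of that "one global table" idea, written in modern C.
It is not how the old compiler was written: the struct layouts, the members
table, and the arrow() helper are all hypothetical, and the lookup happens at
run time here rather than at compile time. It only shows why such a scheme
needs every member name to be unique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts; the prefixes keep the names unique across structs. */
struct buf   { int b_flags; int b_addr; };
struct inode { int i_mode; int i_nlink; int i_uid; };

/* One table for every member of every struct: name -> byte offset. */
struct member {
        const char *name;
        size_t      offset;
};

static const struct member members[] = {
        { "b_flags", offsetof(struct buf, b_flags) },
        { "b_addr",  offsetof(struct buf, b_addr) },
        { "i_mode",  offsetof(struct inode, i_mode) },
        { "i_nlink", offsetof(struct inode, i_nlink) },
        { "i_uid",   offsetof(struct inode, i_uid) },
};

/*
 * Emulates p->name for int members. The type of p is never consulted;
 * only the name matters. Returns NULL if the name is not in the table.
 */
int *arrow(void *p, const char *name)
{
        size_t i;

        for (i = 0; i < sizeof members / sizeof members[0]; i++)
                if (strcmp(members[i].name, name) == 0)
                        return (int *)((char *)p + members[i].offset);
        return NULL;
}

/* Stores through a plain int pointer, like the V3 code, and reads it back. */
int global_table_demo(void)
{
        struct buf b = { 0, 0 };
        int *bp = (int *)&b;            /* just an int pointer */
        int *slot = arrow(bp, "b_addr");

        assert(slot != NULL);
        *slot = 42;
        return b.b_addr;
}
```

If struct buf and struct inode both had a member called addr, the table could
only hold one offset for it, which is exactly the problem a prefix avoids.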

There's one other weird clue: in the Unix Version 5 compiler source, there are
a handful of structs that are defined anonymously in headers, like this (from
/usr/c/c0h.c):

        struct {                        
                int     op;             
                int     type;           
                char    ssp;            /* subscript list */
                char    lenp;           /* structure length */
        };

In modern C this is a no-op, since it defines an anonymous struct type that
then can't be referenced anywhere. However, in this version of C, you then
see code just do stuff like:

        if ((cs->htype&030) == ARRAY) {                         
                cs->htype =- 020;       /* set ptr */           
                cs->ssp++;              /* pop dims */          
        }

This works even though ssp is not part of any other struct, and cs is a char*!
That makes me lean toward the interpretation where struct member definitions
were actually just filling out a separate namespace that existed only on the
right-hand side of the -> operator. In fact, in this version of the compiler
the . operator seems to be a synonym for ->, since it is used identically with
pointer types on the left-hand side.
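
For comparison, here is roughly what a modern compiler would want for the same
snippet. This is a sketch under assumptions: struct sym is a hypothetical named
stand-in holding just the two members the snippet touches, and the value of
ARRAY is a guess chosen to fit the 030 mask, not something taken from the V5
source. The conversion from char* now has to be spelled out, and the old =-
operator becomes -=.

```c
#include <assert.h>

#define ARRAY 030       /* assumed value, picked to fit the 030 mask */

/* Hypothetical named struct with the two members the snippet uses. */
struct sym {
        int  htype;
        char ssp;
};

/* cs arrives as a char *, as in the original; the cast is now explicit. */
void set_ptr(char *cs)
{
        struct sym *s = (struct sym *)cs;

        if ((s->htype & 030) == ARRAY) {
                s->htype -= 020;        /* old C spelled this =- */
                s->ssp++;
        }
}

/* Returns 1 if set_ptr changed both members as expected. */
int explicit_cast_demo(void)
{
        struct sym s = { ARRAY, 0 };

        set_ptr((char *)&s);
        assert(s.ssp == 1);
        return s.htype == 010 && s.ssp == 1;
}
```

The cast is only safe here because cs really does point at a struct sym; the
old compiler made no such check and did not ask for one.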

Excitingly, the code for the V5 C compiler backs this interpretation up! The V5
compiler used a single word for types, so something was either "a struct"
or "not a struct", and the compiler actually eliminates structs during parsing:
-> turns into . early on, and a.b turns into a "member of struct" reference...
but since a has no type information at this point, there is only a single shared
symbol table used to look up struct members. Actually, I'm pretty sure it is
*the same* symbol table that is used for all other symbols. Wild!

So basically, the kinda hacky way structs are implemented in early C leads to
the necessity for struct field names to be globally unique, which leads to
the weird vestigial prefixes we see on a lot of struct field names in POSIX APIs
today. There you go :)

Anyway, that's it for now - thanks for reading! I found researching this post
extremely interesting and there are a couple of other aspects of early C I'd
like to write about as well, like the old function argument syntax and the
old declaration syntax, which did not use = for initializers, so one simply
wrote:

        int a 1;

That's for the future, though. Bye!

## Addendum: Sources

Major thanks to The Unix Heritage Society; this post references in particular
the version 3 "nsys" tarball:

=> https://www.tuhs.org/Archive/Distributions/Research/Dennis_v3/nsys.tar.gz

and the version 5 C compiler sources:

=> https://www.tuhs.org/Archive/Distributions/Research/Dennis_v5/v5root.tar.gz

$#t Struct Member Naming History
$#s Why do some libc structs have prefixes on all their names?
$#o c, history, unix
$#u 7573b7e2-a564-4301-8e25-fd261a5421a6