The Most Elegant Macro

by Phillip Trudeau-Tavara on March 27th, 2019

Spoiler alert: The juicy bits of this post lie in the final code snippet. :)

When new code introduces minimal complexity, you ease engineering for everyone. That's the value we strive for in clean code, but it's elusive – especially when that code is a C preprocessor macro.

Enumeration Station

Say you want to pretty-print a C or C++ enumeration...

enum Foo { First, Second, Third, };

You have two obvious options, a parallel array of names or a switch:

const char *Foo_names[] = { "First", "Second", "Third", };

// ...or:

const char *Foo_to_string(Foo f) {
    switch (f) {
    case First: return "First";
    case Second: return "Second";
    case Third: return "Third";
    }
    return "(unknown)"; // reached only if f holds a value outside the enum
}

These are certainly workable solutions. Really, this post could end right here. But maintaining multiple correlated lists by hand goes bad quickly: add a value to the enum, forget the matching string, and the two lists silently drift apart. When building, say, an OpenGL procedure loader – where every GL function must be declared, defined, and loaded – a way to keep such lists in sync is indispensable. Thus, for the sake of education, let's say you despise this enum redundancy. Not to fear! The C++ community to the rescue:

// ... in Foo.h:
ENUM_VAL(First)
ENUM_VAL(Second)
ENUM_VAL(Third)

// ... in Code.cpp:
enum Foo {
    #define ENUM_VAL(Name) Name,
    #include "Foo.h"
    #undef ENUM_VAL
};
const char *Foo_names[] = {
    #define ENUM_VAL(Name) # Name,
    #include "Foo.h"
    #undef ENUM_VAL
};

... Hopefully you cringed at this solution. It technically works, but the complexity is sky-high. Foo.h lists each enum value wrapped in ENUM_VAL(), which the file itself never defines (and, since it is included more than once, it must not have an include guard). Later, code that wants to operate on all of Foo's values defines ENUM_VAL appropriately and then includes the file. ENUM_VAL is then un-defined and the process repeats. After preprocessing, the source code would read:

enum Foo {
    
    First,
    Second,
    Third,
    
};
const char *Foo_names[] = {
    
    "First",
    "Second",
    "Third",
    
};

This general pattern is known as the "X macro", since this example's ENUM_VAL is typically shortened to simply X (though any available name works). It's quite useful when the same items must be listed in two or more variations: you write the list once and change only what the macro expands to.

Amelioration Station (... i.e., improving it)

We can simplify this first solution by eliminating the extra file:

#define FOO_ENUM    ENUM_VAL(First) ENUM_VAL(Second) ENUM_VAL(Third)

// ...

enum Foo {
    #define ENUM_VAL(Name) Name,
    FOO_ENUM
    #undef ENUM_VAL
};
const char *Foo_names[] = {
    #define ENUM_VAL(Name) # Name,
    FOO_ENUM
    #undef ENUM_VAL
};

Here, Foo.h simply becomes the macro FOO_ENUM. That difference is non-trivial, especially when our goal is minimal added complexity.
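
Because the list is now an ordinary macro, a third expansion costs only a few lines. Here is a minimal sketch of one: a reverse lookup from name to value. Foo_from_string and its bool-returning signature are made up for this illustration and are not part of any existing API.

```cpp
#include <cassert>
#include <cstring>

#define FOO_ENUM    ENUM_VAL(First) ENUM_VAL(Second) ENUM_VAL(Third)

enum Foo {
    #define ENUM_VAL(Name) Name,
    FOO_ENUM
    #undef ENUM_VAL
};

// Third expansion of the same list: one string comparison per value.
// Foo_from_string is a hypothetical helper invented for this sketch.
// Returns true and writes *out when `name` matches a value of Foo.
bool Foo_from_string(const char *name, Foo *out) {
    assert(name != nullptr && out != nullptr);
    #define ENUM_VAL(Name) \
        if (std::strcmp(name, # Name) == 0) { *out = Name; return true; }
    FOO_ENUM
    #undef ENUM_VAL
    return false; // no value of Foo has this name
}
```

Adding a Fourth value to FOO_ENUM would update the enum and the lookup together, with no second list to keep in sync.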

That is actually the most common version of this pattern I see used in the wild. As mentioned before, one use case I've observed is custom OpenGL loaders:

// ... in the header:

#define GL_FUNCTION_LIST \
    GL_ITEM(void, glEnable, GLenum cap) \
    GL_ITEM(void, glDisable, GLenum cap) \
    GL_ITEM(void, glBegin, GLenum mode) \
    GL_ITEM(void, glEnd, void) \
    // etc...


#define GL_ITEM(ReturnType, Name, Parameter) \
    typedef ReturnType Name ## _func (Parameter); \
    extern Name ## _func *Name;
    
GL_FUNCTION_LIST // Typedef all function types and declare them as extern pointers

#undef GL_ITEM


// ... in the implementation:

#define GL_ITEM(ReturnType, Name, Parameter) Name ## _func *Name; // (Parameter is ignored)

GL_FUNCTION_LIST // Define all function pointers

#undef GL_ITEM

(This is a real example.)
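
The snippet above covers declaring and defining; the loading step can reuse the same list. What follows is only a sketch under stated assumptions: GenericProc, GetProcFn and load_gl_functions are hypothetical names, get_proc stands in for whatever lookup call the platform provides, and GLenum is typedef'd locally so the example compiles without the GL headers.

```cpp
#include <cassert>

typedef unsigned int GLenum; // stand-in so the sketch compiles without GL headers

#define GL_FUNCTION_LIST \
    GL_ITEM(void, glEnable, GLenum cap) \
    GL_ITEM(void, glDisable, GLenum cap) \
    GL_ITEM(void, glBegin, GLenum mode) \
    GL_ITEM(void, glEnd, void)

// Typedef every function type and define every pointer (header and
// implementation are merged here to keep the sketch in one file).
#define GL_ITEM(ReturnType, Name, Parameter) \
    typedef ReturnType Name ## _func (Parameter); \
    Name ## _func *Name = nullptr;
GL_FUNCTION_LIST
#undef GL_ITEM

// Hypothetical lookup interface: takes a function name, returns a generic
// function pointer, or nullptr when the function is unavailable.
typedef void (*GenericProc)(void);
typedef GenericProc GetProcFn(const char *name);

// Fills in every pointer in the list; returns true only if all were found.
bool load_gl_functions(GetProcFn *get_proc) {
    assert(get_proc != nullptr);
    bool all_found = true;
    #define GL_ITEM(ReturnType, Name, Parameter) \
        Name = reinterpret_cast<Name ## _func *>(get_proc(# Name)); \
        if (Name == nullptr) all_found = false;
    GL_FUNCTION_LIST
    #undef GL_ITEM
    return all_found;
}
```

Each entry in GL_FUNCTION_LIST now drives three expansions (typedef, pointer definition, load), so adding a function is still a one-line change.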

But C++ guru and D co-inventor Andrei Alexandrescu suggests a cleaner, more functionally-oriented variant:

#define FOO_ENUM(X)     X(First) X(Second) X(Third)

#define ENUM_MEMBER(Name) Name,
#define ENUM_STRING(Name) # Name,

// ...

enum Foo {
    FOO_ENUM(ENUM_MEMBER)
};
const char *Foo_names[] = {
    FOO_ENUM(ENUM_STRING)
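};

The key difference is "passing the macro" X as a parameter into FOO_ENUM instead of redefining ENUM_VAL. To the preprocessor, macros behave a bit like first-class functions: the name of a function-like macro is only expanded when an opening parenthesis follows it, so it can be handed around as a plain argument and invoked later. So this Just Works™ – we need not redefine anything! It's simpler than #define / #undef, and the farthest I've seen anyone go.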
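
To illustrate how cheap a further "function" is in this style, here is a small sketch that passes a third macro to the same list in order to count its entries. ENUM_COUNT, Foo_count and Foo_name_or are hypothetical names introduced only for this example.

```cpp
#include <cassert>

#define FOO_ENUM(X)     X(First) X(Second) X(Third)

#define ENUM_MEMBER(Name) Name,
#define ENUM_STRING(Name) # Name,
#define ENUM_COUNT(Name)  + 1   // hypothetical third macro for this sketch

enum Foo { FOO_ENUM(ENUM_MEMBER) };
const char *Foo_names[] = { FOO_ENUM(ENUM_STRING) };

// Expands to 0 + 1 + 1 + 1, i.e. the number of values in the list.
enum { Foo_count = 0 FOO_ENUM(ENUM_COUNT) };

static_assert(sizeof(Foo_names) / sizeof(Foo_names[0]) == Foo_count,
              "names array and count are generated from the same list");

// Bounds-checked lookup built on the generated count (hypothetical helper).
const char *Foo_name_or(int value, const char *fallback) {
    assert(fallback != nullptr);
    return (value >= 0 && value < Foo_count) ? Foo_names[value] : fallback;
}
```

Nothing is redefined or undefined along the way; each use site simply names the macro it wants applied to every item.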

The real Reese's moment

We've covered all these variations of the X macro, and each one seems cleaner than the last.

But sadly I've never seen anyone combine the peanut butter with the chocolate, so to speak. Nobody has committed to creating the best version they can make.

So, in the spirit of minimizing added complexity (...hopefully),

and bringing everything together,

I offer to you the following abstraction:

// In the header:

#define EnumMember(Name) Name,
#define EnumString(Name) # Name,
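// static: every file that includes this header gets its own private copy
// of the names array, so the linker never sees duplicate definitions.
#define EnumDef(EnumName) \
    enum EnumName { EnumName(EnumMember) }; \
    static const char *const EnumName ## _names[] = { EnumName(EnumString) };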


// ... Sample usage:

#define Foo(X) X(First) X(Second) X(Third)
EnumDef(Foo);

Foo my_foo = Second;
printf("my_foo is %s.\n", Foo_names[(int)my_foo]);

No duplication. No redefinition. Just a little bit of magic.
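
The magic is that Foo names both a macro and an enum. The sketch below spells out the expansion in comments and reuses EnumDef for a second enum. Color and the name_of overloads are made up for this illustration, and the names array is marked static here so that this copy of the macros could sit in a header included from several files.

```cpp
#include <cassert>

#define EnumMember(Name) Name,
#define EnumString(Name) # Name,
#define EnumDef(EnumName) \
    enum EnumName { EnumName(EnumMember) }; \
    static const char *const EnumName ## _names[] = { EnumName(EnumString) };

#define Foo(X) X(First) X(Second) X(Third)
EnumDef(Foo);
// EnumDef(Foo) expands in two steps:
//   1. enum Foo { Foo(EnumMember) };
//      static const char *const Foo_names[] = { Foo(EnumString) };
//   2. enum Foo { First, Second, Third, };
//      static const char *const Foo_names[] = { "First", "Second", "Third", };
// In `enum Foo {` the name is left alone because no `(` follows it.

// A second enum costs two lines (Color is a made-up example).
#define Color(X) X(Red) X(Green) X(Blue)
EnumDef(Color);

// For the same reason, Foo and Color still work as ordinary type names.
const char *name_of(Foo f) {
    assert(f >= First && f <= Third); // guard against out-of-range casts
    return Foo_names[f];
}
const char *name_of(Color c) {
    assert(c >= Red && c <= Blue);
    return Color_names[c];
}
```

One caveat of sharing the name: writing Foo followed directly by an opening parenthesis, as in a function-style cast, would invoke the macro instead of naming the type.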

