04 September, 2018

C++ - Generating Lookup Tables at Compile-Time

There are many ways of creating lookup tables in C++. Let's assume that we have a set of keys which are compile-time constants and we want to associate some value to each. However, when accessing the value the key won't be a constant expression. Today's compilers are smart and can generate nice code that will do a binary search or can even create a lookup table, but you should never assume what they may do as each compiler has different heuristics and could take a different path to solve the problem.

What I will discuss today is creating arrays of values at compile-time that can be used anyhow you want as the format of the data will be defined by you. These arrays can be then used as lookup tables or regular arrays if that's what you need. So let's describe the problem first:

#include <stdint.h>
#include <stdlib.h>

enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

size_t objectSize(uint32_t tid) {
  return tid == TID_NULL    ?  0 :
         tid == TID_BOOLEAN ?  1 :
         tid == TID_INT8    ?  1 :
         tid == TID_INT16   ?  2 :
         tid == TID_INT32   ?  4 :
         tid == TID_INT64   ?  8 :
         tid == TID_UINT8   ?  1 :
         tid == TID_UINT16  ?  2 :
         tid == TID_UINT32  ?  4 :
         tid == TID_UINT64  ?  8 :
         tid == TID_FLOAT32 ?  4 :
         tid == TID_FLOAT64 ?  8 :
         tid == TID_STRING  ? 16 :
         tid == TID_ARRAY   ? 24 :
         tid == TID_MAP     ? 32 : SIZE_MAX;
}

[Compiler Explorer]

GCC and MSVC would take 'compare and return' approach while clang will use a single lookup table in our case. No approach is wrong in this case as compilers are free to do anything that would return a valid result.

Approach #1 - Naive LUT

We can use really naive approach to turn such code into a lookup table by creating an array and filling it:

#include <stdint.h>
#include <stdlib.h>

enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

static const uint8_t ObjectSizeArray[] = {
  /* TID_NULL    */  0 ,
  /* TID_BOOLEAN */  1 ,
  /* TID_INT8    */  1 ,
  /* TID_INT16   */  2 ,
  /* TID_INT32   */  4 ,
  /* TID_INT64   */  8 ,
  /* TID_UINT8   */  1 ,
  /* TID_UINT16  */  2 ,
  /* TID_UINT32  */  4 ,
  /* TID_UINT64  */  8 ,
  /* TID_FLOAT32 */  4 ,
  /* TID_FLOAT64 */  8 ,
  /* TID_STRING  */ 16 ,
  /* TID_ARRAY   */ 24 ,
  /* TID_MAP     */ 32
};

size_t objectSize(uint32_t tid) {
  if (tid >= TID_COUNT)
    return SIZE_MAX;
  else
    return ObjectSizeArray[tid];
}

[Compiler Explorer]

There is not much to add as the code is explicitly creating the lookup table and then using it. This is one of the simplest approaches that you will see in many projects to date. It's fine until you start messing with keys. Imagine somebody would reorder TID_UINT8 to follow TID_INT8 including the TID value. The table would immediately be broken, but the code would compile just fine - the runtime is where it will fail and such bugs can be really time consuming to debug as everything looks just perfect.

Approach #2 - Key/Value as a Macro

The idea is to create a macro that will be used N times to populate the table:

#include <stdint.h>
#include <stdlib.h>

enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

#define V(tid) (              \
  tid == TID_NULL    ?  0 :   \
  tid == TID_BOOLEAN ?  1 :   \
  tid == TID_INT8    ?  1 :   \
  tid == TID_INT16   ?  2 :   \
  tid == TID_INT32   ?  4 :   \
  tid == TID_INT64   ?  8 :   \
  tid == TID_UINT8   ?  1 :   \
  tid == TID_UINT16  ?  2 :   \
  tid == TID_UINT32  ?  4 :   \
  tid == TID_UINT64  ?  8 :   \
  tid == TID_FLOAT32 ?  4 :   \
  tid == TID_FLOAT64 ?  8 :   \
  tid == TID_STRING  ? 16 :   \
  tid == TID_ARRAY   ? 24 :   \
  tid == TID_MAP     ? 32 : 0 \
)

static const uint8_t ObjectSizeArray[] = {
  V(0),
  V(1),
  V(2),
  V(3),
  V(4),
  V(5),
  V(6),
  V(7),
  V(8),
  V(9),
  V(10),
  V(11),
  V(12),
  V(13),
  V(14)
};

#undef V

size_t objectSize(uint32_t tid) {
  if (tid >= TID_COUNT)
    return SIZE_MAX;
  else
    return ObjectSizeArray[tid];
}

[Compiler Explorer]

The result should be compiled to the same code as the previous example. We fixed one thing and designed a solution which would work for simple cases in which a key is changed, however, what we have done is not really a nice solution and requires a lot of additional lines compared to the first one.

Approach #3 - Key/Value as a Template

#include <stdint.h>
#include <stdlib.h>

enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

template<uint32_t tid>
struct KeyMapper {
  enum : uint8_t {
    value =
      tid == TID_NULL    ?  0 :
      tid == TID_BOOLEAN ?  1 :
      tid == TID_INT8    ?  1 :
      tid == TID_INT16   ?  2 :
      tid == TID_INT32   ?  4 :
      tid == TID_INT64   ?  8 :
      tid == TID_UINT8   ?  1 :
      tid == TID_UINT16  ?  2 :
      tid == TID_UINT32  ?  4 :
      tid == TID_UINT64  ?  8 :
      tid == TID_FLOAT32 ?  4 :
      tid == TID_FLOAT64 ?  8 :
      tid == TID_STRING  ? 16 :
      tid == TID_ARRAY   ? 24 :
      tid == TID_MAP     ? 32 : 0
  };
};

static const uint8_t ObjectSizeArray[] = {
  KeyMapper<0>::value,
  KeyMapper<1>::value,
  KeyMapper<2>::value,
  KeyMapper<3>::value,
  KeyMapper<4>::value,
  KeyMapper<5>::value,
  KeyMapper<6>::value,
  KeyMapper<7>::value,
  KeyMapper<8>::value,
  KeyMapper<9>::value,
  KeyMapper<10>::value,
  KeyMapper<11>::value,
  KeyMapper<12>::value,
  KeyMapper<13>::value,
  KeyMapper<14>::value
};

size_t objectSize(uint32_t tid) {
  if (tid >= TID_COUNT)
    return SIZE_MAX;
  else
    return ObjectSizeArray[tid];
}

[Compiler Explorer]

Okay, the nasty macro was replaced by a nasty template, but overall it's the same solution that is still annoying to write / maintain.

Approach #4 - std::index_sequence and constexpr (C++14)

#include <stdint.h>
#include <stdlib.h>
#include <utility>

// ------------------------------------------------------------------
// Normally you would put these helpers somewhere else, not into the
// the .cpp file, however, let's keep it simple.
// ------------------------------------------------------------------

template<typename T, size_t N>
struct LookupTable {
  T data[N];

  constexpr size_t size() const noexcept { return N; }
  constexpr const T& operator[](size_t i) const noexcept { return data[i]; }
};

template<typename T, size_t N, class Func, size_t... Indexes>
static constexpr LookupTable<T, N> MakeLookupTableImpl(Func f, std::index_sequence<Indexes...>) noexcept {
  return {{ T(f(Indexes))... }};
}

template<typename T, size_t N, class Func>
static constexpr LookupTable<T, N> MakeLookupTable(Func f) noexcept {
  return MakeLookupTableImpl<T, N>(f, std::make_index_sequence{});
}

// The original code...
enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

// And the original function turned into a constexpr.
static constexpr uint8_t ObjectSizeFunc(uint32_t tid) {
  return tid == TID_NULL    ?  0 :
         tid == TID_BOOLEAN ?  1 :
         tid == TID_INT8    ?  1 :
         tid == TID_INT16   ?  2 :
         tid == TID_INT32   ?  4 :
         tid == TID_INT64   ?  8 :
         tid == TID_UINT8   ?  1 :
         tid == TID_UINT16  ?  2 :
         tid == TID_UINT32  ?  4 :
         tid == TID_UINT64  ?  8 :
         tid == TID_FLOAT32 ?  4 :
         tid == TID_FLOAT64 ?  8 :
         tid == TID_STRING  ? 16 :
         tid == TID_ARRAY   ? 24 :
         tid == TID_MAP     ? 32 : 0;
}

static constexpr const auto ObjectSizeArray =
  MakeLookupTable<uint8_t, TID_COUNT>(ObjectSizeFunc);

size_t objectSize(uint32_t tid) {
  if (tid >= TID_COUNT)
    return SIZE_MAX;
  else
    return ObjectSizeArray[tid];
}

[Compiler Explorer]

I think this is so far the cleanest approach that can be maintained very well. Adding a new value still requires to update the ObjectSizeFunc function, but changing keys would do no harm.

Approach #5 - Minor update of #4

#include <stdint.h>
#include <stdlib.h>
#include <utility>

template<typename T, size_t N>
struct LookupTable {
  T data[N];

  constexpr size_t size() const noexcept { return N; }
  constexpr const T& operator[](size_t i) const noexcept { return data[i]; }
};

template<typename T, size_t N, class Generator, size_t... Indexes>
static constexpr LookupTable<T, N> MakeLookupTableImpl(std::index_sequence<Indexes...>) noexcept {
  constexpr LookupTable<T, N> table {{ T(Generator::value(Indexes))... }};
  return table;
}

template<typename T, size_t N, class Generator>
static constexpr LookupTable<T, N> MakeLookupTable() noexcept {
  return MakeLookupTableImpl<T, N, Generator>(std::make_index_sequence<N>{});
}

enum TypeId : uint32_t {
  TID_NULL    = 0,
  TID_BOOLEAN = 1,
  TID_INT8    = 2,
  TID_INT16   = 3,
  TID_INT32   = 4,
  TID_INT64   = 5,
  TID_UINT8   = 6,
  TID_UINT16  = 7,
  TID_UINT32  = 8,
  TID_UINT64  = 9,
  TID_FLOAT32 = 10,
  TID_FLOAT64 = 11,
  TID_STRING  = 12,
  TID_ARRAY   = 13,
  TID_MAP     = 14,

  TID_COUNT   = 15
};

struct ObjectSizeGenerator {
  static constexpr uint8_t value(size_t tid) noexcept {
    return tid == TID_NULL    ?  0 :
           tid == TID_BOOLEAN ?  1 :
           tid == TID_INT8    ?  1 :
           tid == TID_INT16   ?  2 :
           tid == TID_INT32   ?  4 :
           tid == TID_INT64   ?  8 :
           tid == TID_UINT8   ?  1 :
           tid == TID_UINT16  ?  2 :
           tid == TID_UINT32  ?  4 :
           tid == TID_UINT64  ?  8 :
           tid == TID_FLOAT32 ?  4 :
           tid == TID_FLOAT64 ?  8 :
           tid == TID_STRING  ? 16 :
           tid == TID_ARRAY   ? 24 :
           tid == TID_MAP     ? 32 : 0;
  }
};

static constexpr const auto ObjectSizeArray =
  MakeLookupTable<uint8_t, TID_COUNT, ObjectSizeGenerator>();

size_t objectSize(uint32_t tid) {
  if (tid >= TID_COUNT)
    return SIZE_MAX;
  else
    return ObjectSizeArray[tid];
}

[Compiler Explorer]

This is an updated approach in which I have moved the generator into a helper struct/class as I noticed that in some cases MSVC would emit runtime initializers of the lookup table if its are part of a larger structure.

Conclusion

I just wanted to demonstrate different ways of creating lookup tables at compile-time. If you explore enough projects you will probably find all approaches mentioned here with some minor modifications (like more macros, more clever macros, different templates, etc...). I'm sure there is maybe even a better solution than the last one, but I'm currently happily switching to it as it provides great flexibility to me. The function that returns the value can be actually pretty complex and the constexpr [ObjectSizeArray] would still guarantee that the compiler will calculate everything at compile-time or fail. I have some tables in Blend2D that are used during glyph decoding (of both TrueType and OpenType fonts) and they work really well and are really easy to maintain. I plan to use such approach in AsmJit as well, but Blend2D still has a higher priority at the moment.

I would also note that the last approach also provides a great flexibility to define tables that may be a bit sparse. Like a table that has 256 values [mapping a byte key to a value], but only 10 of them are let's say non-zero [or keys of interest]

Use comments if you have anything to add or if you think it should be done differently!

2 comments:

  1. The solution I have usually used for this is x-macros.

    First, define something like this, as the x-macro for your TID stuff:

    #define X_TID(f) \
    f(TID_NULL, 0, 0) \
    f(TID_BOOLEAN, 1, 1) \
    f(TID_INT8, 2, 1) \
    f(TID_INT16, 3, 2) \
    // etc ...

    This x-macro is the one "source of truth" for the TID definitions, and in this case includes the two independent bit of data that are associated with each element: it's ID and the size. If you just wanted an incrementing ID (i.e., you didn't care about the specific value, you could work that into the deifnition as well), so you'd only need to specify the size.

    Then for each of the definitions you just need to invoke the x-macro with a different f function. Here's the enum:

    #define ENUM_HELPER(name, id, size) name = id,

    enum TypeId : uint32_t {
    X_TID(ENUM_HELPER)
    };

    Here's the LUT:

    #define LUT_HELPER(name, id, size) tid == name ? size :

    size_t objectSize(uint32_t tid) {
    return
    X_TID(LUT_HELPER)
    : 0;
    }



    ReplyDelete
  2. The formatting kind of sucks, but take my word for it that I had lined everything up nicely in the original post...

    ReplyDelete