04 June, 2017

Idio[ma]tic CMake

Welcome to Trial and Error Scripting

If I added up the time I have spent figuring out why CMake doesn't work as expected, it would probably come to weeks. I don't think CMake itself is a bad tool; however, I think the CMake scripting language is the most idiotic language ever invented, seriously. The other problem is the documentation: I have never solved anything by reading the CMake docs (I probably wasted more time reading them, actually).

Consider a very small example that illustrates an issue I was dealing with. We have some files that are always compiled as part of a target, and some files that require specific compiler flags, for example `-mavx2`, plus some compile-time constants:

project(simple_app C CXX)          # CMake project.

set(SOURCE_FILES main.cpp)         # Source files that are always compiled.
set(CUSTOM_FLAGS -DCUSTOM_IMPL=1)  # Custom compiler flags we want to append to a specific file.

if (BUILD_CUSTOM_FILES)
  set(CUSTOM_FILES impl_avx2.cpp)
  set(AVX2_FLAGS ${CUSTOM_FLAGS} -DAVX2_AVAILABLE=1 -mavx2)
  set_property(SOURCE ${CUSTOM_FILES} APPEND PROPERTY COMPILE_FLAGS ${AVX2_FLAGS})

  # Add all arch-specific files to SOURCE_FILES...
  list(APPEND SOURCE_FILES ${CUSTOM_FILES})
endif()

add_executable(test_app ${SOURCE_FILES})

The problem is that it will not work, and you will have a hard time figuring out why. The compiler command CMake generates for impl_avx2.cpp looks like this:

/usr/bin/c++ -DCUSTOM_IMPL=1;-DAVX2_AVAILABLE=1;-mavx2 -o impl_avx2.o -c impl_avx2.cpp

This is of course completely broken: it contains semicolons instead of spaces. The reason is that the CMake language doesn't really support arrays; every list is just a string whose items are separated by semicolons. In fact, these two lines are equivalent:

set(SOMETHING A B)
set(SOMETHING "A;B")

And there is no way to distinguish between the two. To make it clearer what is happening, I wrote a simple test script:

function(my_func PREFIX FIRST)
  message("${PREFIX} FIRST=${FIRST}")
  set(ARG_INDEX 0)
  foreach(ARG_VA ${ARGN})
    message("${PREFIX} #${ARG_INDEX} ${ARG_VA}")
    math(EXPR ARG_INDEX "${ARG_INDEX}+1")
  endforeach()
endfunction()

my_func("1:" arg)
my_func("2:" arg second)
my_func("3:" arg second third)
my_func("4:" arg "second;third")
my_func("5:" arg "second third")

Which outputs:

1: FIRST=arg
2: FIRST=arg
2: #0 second
3: FIRST=arg
3: #0 second
3: #1 third
4: FIRST=arg
4: #0 second
4: #1 third
5: FIRST=arg
5: #0 second third
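
The same equivalence can be checked without a function call. The following is a minimal sketch meant to be run in script mode (`cmake -P`); the variable names are only illustrative. It compares the two spellings of `set()` and asks CMake how many list items each one holds:

```cmake
# Illustrative script; run with: cmake -P <file>
set(UNQUOTED A B)    # two arguments, stored as the single string "A;B"
set(QUOTED "A;B")    # one argument that already contains a semicolon
set(SPACED "A B")    # one argument with a space, so a single item

if("${UNQUOTED}" STREQUAL "${QUOTED}")
  message("same value: ${UNQUOTED}")              # prints: same value: A;B
endif()

list(LENGTH UNQUOTED unquoted_len)
list(LENGTH QUOTED quoted_len)
list(LENGTH SPACED spaced_len)
message("lengths: ${unquoted_len} ${quoted_len} ${spaced_len}")  # prints: lengths: 2 2 1
```

This matches cases 3, 4 and 5 above: the semicolon is the only thing that separates list items, regardless of how the value was written.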

Okay, so we know that CMake treats semicolons as separators. What we can do is simply foreach() over the flags and append each one separately, so let's modify the first example:

project(simple_app C CXX)

set(SOURCE_FILES main.cpp)
set(CUSTOM_FLAGS -DCUSTOM_IMPL=1)

if (BUILD_CUSTOM_FILES)
  set(CUSTOM_FILES impl_avx2.cpp)
  set(AVX2_FLAGS ${CUSTOM_FLAGS} -DAVX2_AVAILABLE=1 -mavx2)
  foreach(flag ${AVX2_FLAGS})
    set_property(SOURCE ${CUSTOM_FILES} APPEND PROPERTY COMPILE_FLAGS ${flag})
  endforeach()

  # Add all arch-specific files to SOURCE_FILES...
  list(APPEND SOURCE_FILES ${CUSTOM_FILES})
endif()

add_executable(test_app ${SOURCE_FILES})

Well, the output is the same as before, just try it:

/usr/bin/c++ -DCUSTOM_IMPL=1;-DAVX2_AVAILABLE=1;-mavx2 -o impl_avx2.o -c impl_avx2.cpp

Would you expect this? CMake developers are actually aware of it, and to make things even more confusing there are both APPEND and APPEND_STRING options. APPEND appends the value as a new list item, so the property becomes a list, which is then stringified with semicolons, and we are back where we started. APPEND_STRING appends to the raw string instead:

project(simple_app C CXX)

set(SOURCE_FILES main.cpp)
set(CUSTOM_FLAGS -DCUSTOM_IMPL=1)

if (BUILD_CUSTOM_FILES)
  set(CUSTOM_FILES impl_avx2.cpp)
  set(AVX2_FLAGS ${CUSTOM_FLAGS} -DAVX2_AVAILABLE=1 -mavx2)
  foreach(flag ${AVX2_FLAGS})
    set_property(SOURCE ${CUSTOM_FILES} APPEND_STRING PROPERTY COMPILE_FLAGS ${flag})
  endforeach()

  list(APPEND SOURCE_FILES ${CUSTOM_FILES})
endif()

add_executable(test_app ${SOURCE_FILES})

Which yields:

/usr/bin/c++ -DCUSTOM_IMPL=1-DAVX2_AVAILABLE=1-mavx2 -o impl_avx2.o -c impl_avx2.cpp

Cool, we got rid of the semicolons, but as a side effect there are now no spaces between our flags. The problem is that CMake's COMPILE_FLAGS is in fact a string, not a list, so to append a flag properly we must put a space in front of it, which will of course leave a leading space if the property was empty:

project(simple_app C CXX)

set(SOURCE_FILES main.cpp)
set(CUSTOM_FLAGS -DCUSTOM_IMPL=1)

if (BUILD_CUSTOM_FILES)
  set(CUSTOM_FILES impl_avx2.cpp)
  set(AVX2_FLAGS ${CUSTOM_FLAGS} -DAVX2_AVAILABLE=1 -mavx2)
  foreach(flag ${AVX2_FLAGS})
    set_property(SOURCE ${CUSTOM_FILES} APPEND_STRING PROPERTY COMPILE_FLAGS " ${flag}")
  endforeach()

  list(APPEND SOURCE_FILES ${CUSTOM_FILES})
endif()

add_executable(test_app ${SOURCE_FILES})

Which is quasi working:

/usr/bin/c++  -DCUSTOM_IMPL=1 -DAVX2_AVAILABLE=1 -mavx2 -o impl_avx2.o -c impl_avx2.cpp
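
To see the three append variants side by side without compiling anything, here is a small sketch that applies them to global properties instead of source-file properties. The property names `DEMO_LIST`, `DEMO_STRING` and `DEMO_SPACED` are made up for this illustration; the brackets in the messages make the leading space visible:

```cmake
# DEMO_* are hypothetical property names used only for this comparison.
foreach(flag -DCUSTOM_IMPL=1 -DAVX2_AVAILABLE=1 -mavx2)
  set_property(GLOBAL APPEND        PROPERTY DEMO_LIST   ${flag})     # new list item
  set_property(GLOBAL APPEND_STRING PROPERTY DEMO_STRING ${flag})     # raw concatenation
  set_property(GLOBAL APPEND_STRING PROPERTY DEMO_SPACED " ${flag}")  # space + flag
endforeach()

foreach(name DEMO_LIST DEMO_STRING DEMO_SPACED)
  get_property(value GLOBAL PROPERTY ${name})
  message("${name}=[${value}]")
endforeach()
# DEMO_LIST=[-DCUSTOM_IMPL=1;-DAVX2_AVAILABLE=1;-mavx2]
# DEMO_STRING=[-DCUSTOM_IMPL=1-DAVX2_AVAILABLE=1-mavx2]
# DEMO_SPACED=[ -DCUSTOM_IMPL=1 -DAVX2_AVAILABLE=1 -mavx2]
```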

Now I would like to ask you: would you have written the working version at the beginning? For me this was simply trial and error until I found a solution that worked, and I personally don't like this approach to solving problems.
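
For what it's worth, the loop and the leading space can both be avoided by joining the list into a space-separated string once and only then touching the property. The helper below is a hypothetical sketch (`append_source_flags` is not a CMake command) and it assumes that no individual flag contains a semicolon:

```cmake
# Hypothetical helper: appends the given flags to COMPILE_FLAGS of one source file.
function(append_source_flags SRC)
  # ARGN holds the remaining arguments as a ';'-separated list; turn it into
  # the space-separated string that COMPILE_FLAGS expects.
  string(REPLACE ";" " " new_flags "${ARGN}")

  # get_source_file_property() yields NOTFOUND (a false value) when unset.
  get_source_file_property(old_flags ${SRC} COMPILE_FLAGS)
  if(old_flags)
    set(new_flags "${old_flags} ${new_flags}")
  endif()

  set_source_files_properties(${SRC} PROPERTIES COMPILE_FLAGS "${new_flags}")
endfunction()

# Usage with the variables from the example above:
append_source_flags(impl_avx2.cpp ${AVX2_FLAGS})
```

Reading the old value first means a space is inserted only when there is something to separate, at the cost of one extra property lookup per file.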

Time to Migrate Away?

CMake should really switch to a sane language; otherwise I can't see myself using it in the future. I have already looked at Meson, as it was mentioned on several sites that I visit. Is it better? It probably is, but it's another tool that employs a home-grown language that you probably cannot debug and that forces you to write weird shell scripts as part of your project definition. I mean, why invent a language that cannot do the task and requires running a shell script to list files in a directory?

I'm looking for a project generator that uses embedded JavaScript, or something really close to it, and can be debugged like a normal programming language. It would be syntactically similar to C/C++ and could be linted by existing tools. I don't see a reason to invent a new language for something like a project generator. It's kind of a paradox that all C/C++ project generators use languages that are not even close to C and require you to write 5 lines to implement a simple if/else construct.

10 comments:

  1. Petr, you might take a look at https://github.com/apenwarr/redo, a pretty cool and simple concept.

  2. You might like Premake. It's scripted using Lua, a well established and much better designed scripting language.

  3. Why not simply use double quotes:

    set(AVX2_FLAGS "${CUSTOM_FLAGS} -DAVX2_AVAILABLE=1 -mavx2")

    CMake expands variables inside double-quoted expressions and does not convert them to a list with semicolons.
    It works just as it does in bash or Makefiles.

    Replies
    1. Yes, I was going to point out that there is a much simpler way of achieving the desired effect. Otherwise, I agree that the language seems inadequate in many cases. Still, the biggest impediment to CMake is that the moment you move to more advanced stuff you will hit a wall for sure, due to the lack of advanced (more complete) examples. The book is a joke: only the basics + 2/3 print of the API :D

    2. I would like to note that the use case I showed in this blog post is part of something much more advanced and cross-platform: no hardcoded flags, only a lot of variables holding flags and rules for applying them on a per-source-file basis. This is actually used in my Blend2D project, where I apply additional cflags based on a file suffix. So, for example, if a file is named like `sourcefile_avx2.cpp`, flags that enable AVX2 are automatically added to it so it can compile. Also, if the compiler doesn't support these intrinsics, the file is not added to the list of files to be compiled. As I said, it's much more complex and I just wanted to demonstrate my problem on a simpler example, not on a 1k-line CMake script nobody would understand.

      Thanks for your feedback!

  4. > I'm Looking for a project generator that uses embedded JavaScript
    Okay, you are looking for QBS.

  5. What about COMPILE_DEFINITIONS/COMPILE_OPTIONS, which actually are lists and should work as expected, or even better, calling target_compile_definitions/target_compile_options? AFAIK COMPILE_FLAGS has been deprecated since at least CMake 3.0 (3.8 being the latest release version). I hope I'm not sounding too harsh, but your example doesn't look like idiomatic CMake to me; rather, it looks severely outdated.

    Replies
    1. I'm not adding cflags to the target, but to individual files instead; and according to the documentation:

      https://cmake.org/cmake/help/v3.9/manual/cmake-properties.7.html#properties-on-source-files

      only `COMPILE_FLAGS` can be used to change the cflags of individual files (`COMPILE_OPTIONS` is not a source-file property, and believe me, I would have used it if it were).

      I don't want to be harsh, but read the damn thing first :)

    2. Right... My bad. :) I think my CMake Stockholm syndrome was activated, and my rationality was turned off, sorry for that. You have a good point about writing a working version from the beginning being difficult.
      COMPILE_OPTIONS not being a source file property looks like a gross oversight and is one of the many inconsistencies of CMake, IMHO.

      One way around this could be to create an OBJECT target with the files you need extra flags for, but that seems like a huge hassle if you only need it for single files. Also, having to 'work around' something like this really doesn't feel right.

    3. It's fine :)

      I also think there is no justification for `COMPILE_OPTIONS` not existing for source files. CMake has slowly been transforming from strings to lists, but it seems that the transformation is not yet complete.

      I hadn't thought about creating an OBJECT target, to be honest; it would probably be overkill, but it's good to know this possibility exists.
