0xStubs

System Administration, Programming and Reconfigurable Computing

Using the C++23 std Module with Clang 18

Modules were introduced in C++20. With C++23, the standard library is additionally supposed to provide a std module that can be imported instead of all the #includes that we are used to. This allows code like the following:

import std;

int main() {
  std::print("Hello world!\n");
}

While (partial) support for modules has been included in GCC since version 11 and in Clang since version 8, support for the std module is currently still lacking in libstdc++ and has only been included in libc++ since version 17 (see https://en.cppreference.com/w/cpp/compiler_support). But since I have Clang 18 and libc++ installed, I wanted to give it a try.

Simply trying to compile the above code unfortunately leads to an error:

% clang++ -std=c++23 -stdlib=libc++ -o test test.cpp 
test.cpp:1:8: fatal error: module 'std' not found
    1 | import std;
      | ~~~~~~~^~~
1 error generated.

One might be tempted to try Clang’s -fmodules flag, but this flag does not refer to modules as defined by the C++ standard (it enables Clang’s own, older module system), and using it actually causes more issues:

% clang++ -std=c++23 -stdlib=libc++ -fmodules -o test test.cpp
test.cpp:1:1: error: import of module 'std' imported non C++20 importable modules
    1 | import std;
      | ^
test.cpp:4:8: error: no member named 'print' in namespace 'std'
    4 |   std::print("Hello world!\n");
      |   ~~~~~^
2 errors generated.

Most people experimenting with modules use CMake as their build system, and this is what the libc++ documentation suggests. However, that page also describes in general terms that we need to generate a BMI (Built Module Interface) file (.pcm) from the module sources (.cppm). A script posted in the LLVM Discourse was very helpful in determining the necessary steps. First, the module interface (std.cppm) needs to be precompiled:

clang++ -std=c++23 -stdlib=libc++ \
    -Wno-reserved-identifier -Wno-reserved-module-identifier \
    --precompile -o std.pcm /usr/share/libc++/v1/std.cppm

The location of the cppm file may of course differ between systems. Also, installing the module sources currently has to be enabled explicitly when building libc++. After precompiling the module, we can compile our program as follows:

clang++ -std=c++23 -stdlib=libc++ \
    -fmodule-file=std=std.pcm -o test std.pcm test.cpp

Note that we need to tell the compiler where the std module can be found via -fmodule-file and also need to pass std.pcm as an input file. All of this can of course be automated in a Makefile:

CXX = clang++
CXXFLAGS += -Weverything -Wno-c++98-compat -Wno-pre-c++20-compat
CXXFLAGS += -std=c++23 -stdlib=libc++

test: std.pcm test.cpp
	$(CXX) $(CXXFLAGS) -fmodule-file=std=std.pcm -o $@ $^

std.pcm: /usr/share/libc++/v1/std.cppm
	$(CXX) $(CXXFLAGS) -Wno-reserved-identifier -Wno-reserved-module-identifier --precompile -o $@ $^

Using the C++ bool type in HPC codes

When writing C++ code, you are probably inclined to use the data types that are closest to what you are trying to express, expecting that this allows the compiler to provide the most efficient implementation possible. So, when processing boolean values, i.e., single-bit information distinguishing between true and false, you will likely want to use the bool type in C++. However, this bool type may behave in unexpected ways, in particular if you are working on multi-threaded codes that have to be both fast and correct.

Read More

libc’s random number generators and what to be aware of when seeding them

When you need pseudo-random numbers in C code, commonly used routines are random(), rand() and rand_r(unsigned int*). First and foremost, it is important to know that none of these routines produces high-quality random numbers suitable for cryptographic use. But there are cases where you just need some random-looking numbers. Let’s focus on these use cases in this blog post.

For example, let’s assume you have a multi-threaded OpenMP program where you need to generate different pseudo-random numbers in each thread. A natural approach to this would be the following:

#pragma omp parallel
{
    unsigned seed = omp_get_thread_num();
    #pragma omp for
    for (...) {
        int rnd = rand_r(&seed);
        ...
    }
}

Each thread stores its own state of the random number generator and initializes (seeds) it with its thread number. Therefore, each thread should obtain different random numbers, although they will be identical on each run due to the deterministically chosen initial values. So, what’s wrong with this code? Let’s have a look at the very first random numbers generated for each thread on macOS 12:

Read More

Modifying environment variables in the Atom editor

If you are using the Atom editor, you may at some point need to set or modify certain environment variables within your editor, e.g., to allow packages to locate binaries that are not located within your normal $PATH. A solution you often find on the internet is to add a small snippet to your init.coffee script:

process.env.PATH = [
  '/my/special/path/for/atom'
  process.env.PATH
].join(':')

However, this does not always work. Even worse: whether it works may change each time you launch Atom. So, what is the problem here?

During launch, Atom runs a routine called updateProcessEnv() that configures the environment. However, to not delay startup unnecessarily, this is an asynchronous function. During launch, it is called, then the user’s init.coffee script is run, and only after that is the completion of updateProcessEnv() awaited. So if you modify the environment within init.coffee, there is a good chance that Atom’s startup procedure will overwrite it again a couple of milliseconds later.

So, how can we deal with this and reliably modify the environment? Luckily, Atom emits an event as soon as the environment has been set up. So we can use a construct like the following in init.coffee:

modEnv = ->
  return unless atom.shellEnvironmentLoaded
  process.env.PATH = [
    '/my/special/path/for/atom'
    process.env.PATH
  ].join(':')
modEnv()
atom.emitter.on 'loaded-shell-environment', modEnv