When you need pseudo-random numbers in C code, commonly used routines are random()
, rand()
and rand_r(unsigned int*)
. First and foremost, it is important to know that none of these routines produce high quality random numbers suitable for cryptographic use. But there are cases where you just need some random looking numbers. Let’s focus on these use-cases in this blog post.
For example, let’s assume you have a multi-threaded OpenMP program where you need to generate different pseudo-random numbers in each thread. A natural approach to this would be the following:
#pragma omp parallel { unsigned seed = omp_get_thread_num(); #pragma omp for for (...) { int rnd = rand_r(&seed); ... } }
Each thread stores its own state of the random number generator and initializes (seeds) it with its thread number. Therefore, each thread should obtain different random numbers, although they will be idential on each run due to the deterministically chosen initial values. So, what’s wrong with this code? Let’s have a look at the very first random numbers generated for each thread on macOS 12:
Thread 0 (seed=0): 520932930 Thread 1 (seed=1): 16807 Thread 2 (seed=2): 33614 Thread 3 (seed=3): 50421 Thread 4 (seed=4): 67228
For seed=1,2,3,...
, the generated numbers look not at all random. Even in scenarios where you only need random looking values, this is probably not what you want. The reason for this behavior is the pseudo random number generator (PRNG) used to implement rand_r(unsigned int*)
on macOS. It’s MINSTD, a so-called Lehmer RNG which could be impemented as follows:
#define RAND_MAX 0x7fffffff int rand_r(unsigned *seed) { if (*seed == 0) *seed = 123459876; *seed = ((unsigned long long)*seed * 16807) % RAND_MAX; return *seed; }
So basically, the seed is just multiplied by 16807 and since seed=0
would never generate any other number than 0, it needs to be mapped to some other value. In this case it’s behaving like seed=123459876
. I took this observation as a reason to look at the generated random numbers for seed=0..4
for different libc implementations.
rand_r(unsigned int*)
The PRNG behind rand_r(unsigned int*)
is inherently limited by the API to have an internal state consisting of only a single unsigned integer. This makes generating good pseudo-random numbers difficult.
macOS 12.1
Seed Generated number 0 520932930 1 16807 2 33614 3 50421 4 67228
- MINSTD with
seed=0
being equal toseed=123459876
. - Values ramp up for small seeds, so you probably want to seed it with numbers significantly larger than 1.
- The implementation seems to originate from earlier versions of FreeBSD.
glibc 2.31
Seed Generated number 0 1012484 1 476707713 2 952403967 3 1430195325 4 1905891579
- Values seem to ramp up for small seeds, so you probably want to seed it with numbers significantly larger than 1.
- Looking at the implementation, this behavior is not immediately obvious.
FreeBSD 13.0
Seed Generated number 0 16806 1 33613 2 50420 3 67227 4 84034
- Looks like MINSTD with a constant value of 16806 added to each random number. This allows
seed=0
to generate a number different to 0. - Values seem to ramp up for small seeds, so you probably want to seed it with numbers significantly larger than 1.
- Link to the implementation
random()
random()
is a suggested replacement for rand()
and rand_r(unsigned int*)
and supposed to provide better pseudo-random numbers.
macOS 12.1
Seed Generated number 0 577655601 1 1804289383 2 1505335290 3 1205554746 4 1968078301
glibc 2.31
Seed Generated number 0 1804289383 1 1804289383 2 1505335290 3 1205554746 4 1968078301
- Looks like the macOS implementation with different handling of
seed=0
. More specifically:seed=0
is handled likeseed=1
.
FreeBSD 13.0
Seed Generated number 0 1707946985 1 1408992891 2 1109212348 3 1871735903 4 493669277
rand()
macOS 12.1
Seed Generated number 0 520932930 1 16807 2 33614 3 50421 4 67228
- Identical to macOS’s
rand_r(unsigned int*)
implementation with all of its issues.
glibc 2.31
Seed Generated number 0 1804289383 1 1804289383 2 1505335290 3 1205554746 4 1968078301
- Identical to glibc’s
random()
implementation. Be aware ofseed=0
andseed=1
behaving identically.
FreeBSD 13.0
Seed Generated number 0 1707946985 1 1408992891 2 1109212348 3 1871735903 4 493669277
What about C++?
C++ implements different PRNGs in the STL that can be selected by the user and whose behavior is defined in the standard, so they should behave identically on all platforms. For example, there is the Mersenne Twister RNG std::mt19937
:
Seed Generated number 0 2357136044 1 1791095845 2 1872583848 3 2365658986 4 4153361530
If you really want to, you can also get MINSTD as std::minstd_rand0 with seed=0
being equal to seed=1
:
Seed Generated number 0 16807 1 16807 2 33614 3 50421 4 67228
Bottom line
- If you really want to stick to libc in a single-threaded scenario: Use
random()
and make sure to never useseed=0
. - If you use glibc and need different pseudo-random numbers in different threads: There is
random_r(struct random_data *restrict, int32_t *restrict)
which I have not tried but which should provide the quality ofrandom()
with the possibility of having thread-local RNG states. - If you really need to use
rand_r(unsigned int*)
, make sure to use seeds significantly larger than 1.
Final remarks
All of the above only takes into account PRNGs from the C and C++ standard library. There are dozens of better choices, both performance and quality wise:
- A very fast family of PRNGs is the Xoshiro/Xoroshiro. For example, on an Intel Core i5-5287U I am able to generate 1.25 billion single-precision floats per second using the Xoshiro128+ PRNG and doing a bit of math with them. There are reference implementations in C by the authors and a nice C++ library available. Very recently, a scientific article on these PRNGs has been published in ACM TOMS (preprint).
- Another modern family of PRNGs is PCG. There are a comprehensive C++ implementation and a basic C implementation available directly from the authors.
- To obtain hardware-generated, cryptographically secure pseudo-random numbers, modern Intel and AMD processors provide specialized hardware instructions (
rdrand
/rdseed
). For Intel processors, Intel provides a corresponding C library. - There are also multiple PRNGs implementened in the GNU Scientific Library (GSL), Intel MKL and probably many other libraries.