# Higher quality random floats

Much has been written about how to generate random 64-bit integers; people might argue over which *particular* (CS)PRNG is best suited for a given task, but there is agreement on the general shape of the solution: keep some state around, and then extract 64 bits of randomness at a time, where each bit is equally likely to be either zero or one. This gives nice uniformly-distributed integers over the range [0, 2^{64}).
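For concreteness, here is one minimal example of that shape in C: splitmix64 (the same generator the Go snippets later in this post lean on), with 64 bits of state and 64 uniformly-distributed bits extracted per call. The constants are the standard splitmix64 ones; this is a sketch, not an endorsement of any particular generator.

```c
#include <stdint.h>

/* splitmix64: keep some state around (one 64-bit word), and extract
   64 bits of randomness at a time by advancing and mixing the state. */
uint64_t splitmix64(uint64_t *state) {
    uint64_t z = (*state += 0x9e3779b97f4a7c15ull);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ull;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebull;
    return z ^ (z >> 31);
}
```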

There is less agreement on how to generate random floating-point values. It is tempting to aim for what looks like a simple transformation of an integer generator: uniformly-distributed floats over the range [0, 1). There are various ways of doing such a transformation; one exposition does it as:

```
double pcg64_random_double(pcg64_t *rng) {
    return (double)(pcg64_random(rng) >> 11) * 0x1.0p-53;
}
```

Whereas LuaJIT does it as:

```
uint64_t lj_prng_u64d(PRNGState *rs) {
  uint64_t z, r = 0;
  TW223_STEP(rs, z, r)
  /* Returns a double bit pattern in the range 1.0 <= d < 2.0. */
  return (r & 0x000fffffffffffffull) | 0x3ff0000000000000ull;
}
/* Then to give d in [0, 1) range: */
U64double u;
double d;
u.u64 = lj_prng_u64d(rs);
d = u.d - 1.0;
```

And a Go version by Lemire does it as:

```
// toFloat64 -> [0,1)
func toFloat64(seed *uint64) float64 {
    x := splitmix64(seed)
    x &= 0x1fffffffffffff // %2**53
    return float64(x) / float64(0x1fffffffffffff)
}
```

The [0, 1) output range is less useful than it might initially appear. It is often stated that [0, 1) can be transformed to [A, B) by multiplying by `B-A` then adding `A`, but this is only true for *some* values of A and B; for other values, a number just less than 1 can be transformed and end up *equal* to B due to floating-point rounding, i.e. [0, 1) is transformed to [A, B]. The possibility of a zero output is also unhelpful if the result is going to be log-transformed (for example as part of a Box-Muller transform), as log(0) explodes.

An output range of [0, 1] would be better behaved under most additive and multiplicative transforms. After such a transform, a range like [A, B] can be easily turned into [A, B) via rejection sampling, should the half-open range be desired. It will also turn out to be very easy (on either theoretical or practical grounds) to exclude 0 as a possible output, giving the log-friendly output range (0, 1].
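That rejection step can be sketched as follows. `rand_in_closed_range` is a hypothetical stand-in for any uniform generator over the closed range [a, b] that can actually produce b; the `rand()`-based body here is purely illustrative.

```c
#include <stdlib.h>

/* Illustrative stand-in: a uniform double over the closed range [a, b].
   rand() / RAND_MAX hits both endpoints, so b is a possible output. */
static double rand_in_closed_range(double a, double b) {
    return a + (b - a) * ((double)rand() / (double)RAND_MAX);
}

/* Rejection sampling: resample whenever the excluded endpoint b appears,
   turning the closed range [a, b] into the half-open range [a, b). */
double rand_in_half_open_range(double a, double b) {
    double r;
    do {
        r = rand_in_closed_range(a, b);
    } while (r == b);
    return r;
}
```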

Before trying to generate a random float in the range [0, 1], we should instead consider the easier problem of generating a random float in the range [0.5, 1]. There are 2^{52}+1 different IEEE-754 doubles in this range. The "rounding basin" of some double `d` in that range can be defined as the range of infinite-precision real numbers in [0.5, 1] that would round to `d` (assuming round-to-nearest). Taking ε=2^{-54}, most of these doubles have a basin of `d-ε` through `d+ε`. The exceptions are the extremes of the range; 0.5 has a basin of `0.5` through `0.5+ε`, and 1 has a basin of `1-ε` through `1`. In other words, there are 2^{52}-1 values in the range whose basin is `2ε` wide, and 2 values in the range whose basin is only `ε` wide. For a uniform distribution, the probability of seeing `d` should equal the width of the rounding basin of `d`. Given all this, there is a fairly simple monotonic transform from the uniform integer range [0, 2^{53}) to the uniform float range [0.5, 1]:

Integer(s) | Resultant floating-point double (ε=2^{-54}) |
---|---|
0 | 0.5 |
1, 2 | 0.5 + 2ε |
3, 4 | 0.5 + 4ε |
5, 6 | 0.5 + 6ε |
⋮ | ⋮ |
2^{53}-7, 2^{53}-6 | 1 - 6ε |
2^{53}-5, 2^{53}-4 | 1 - 4ε |
2^{53}-3, 2^{53}-2 | 1 - 2ε |
2^{53}-1 | 1 |

This transform can be expressed in code like so:

```
double rand_between_half_and_one() {
    double d;
    uint64_t x = rand_u64() >> 11; /* 53-bit uniform integer */
    x = ((x + 1) >> 1) + (1022ull << 52);
    memcpy(&d, &x, sizeof(d));
    return d;
}
```

The range [0.25, 0.5] looks a lot like the range [0.5, 1], except that ε reduces to 2^{-55}. That is, there is a fairly simple monotonic transform from the uniform integer range [0, 2^{53}) to the uniform float range [0.25, 0.5]:

Integer(s) | Resultant floating-point double (ε=2^{-55}) |
---|---|
0 | 0.25 |
1, 2 | 0.25 + 2ε |
3, 4 | 0.25 + 4ε |
5, 6 | 0.25 + 6ε |
⋮ | ⋮ |
2^{53}-7, 2^{53}-6 | 0.5 - 6ε |
2^{53}-5, 2^{53}-4 | 0.5 - 4ε |
2^{53}-3, 2^{53}-2 | 0.5 - 2ε |
2^{53}-1 | 0.5 |

Or in code:

```
double rand_between_quarter_and_half() {
    double d;
    uint64_t x = rand_u64() >> 11; /* 53-bit uniform integer */
    x = ((x + 1) >> 1) + (1021ull << 52);
    memcpy(&d, &x, sizeof(d));
    return d;
}
```

In turn, the range [0.125, 0.25] looks a lot like [0.25, 0.5], and [0.0625, 0.125] looks a lot like [0.125, 0.25], and so on. To generate a float in [0, 1], we need to choose one of these ranges, and then choose a value within that range. For a uniform distribution, the probability of a range should be proportional to the width of the range, so we have:

Output range | Width | Probability |
---|---|---|
[0.5, 1] | 2^{-1} | 50% |
[0.25, 0.5] | 2^{-2} | 25% |
[0.125, 0.25] | 2^{-3} | 12.5% |
[0.0625, 0.125] | 2^{-4} | 6.25% |
⋮ | ⋮ | ⋮ |

As the width of the range approaches zero, so does the probability of the range. Once the width is less than 2^{-75}, the probabilities are so small as to be effectively impossible for most practical purposes. By making them *actually* impossible, we gain a nice guarantee of algorithm termination, and avoid having to think about denormals, and get the log-friendly output range (0, 1]. One way of implementing this in code is:

```
double rand_between_zero_and_one() {
    double d;
    uint64_t x = rand_u64() >> 11; /* 53-bit uniform integer */
    uint64_t e = 1022;
    do {
        if (rand_u64() & 1) break; /* 1-bit uniform integer */
        e -= 1;
    } while (e > 1022-75);
    x = ((x + 1) >> 1) + (e << 52);
    memcpy(&d, &x, sizeof(d));
    return d;
}
```

The above throws away lots of random bits, whereas it would be more economical to remember and use them later:

```
double rand_between_zero_and_one() {
    double d;
    uint64_t x = rand_u64(); /* 53-bit and 11-bit uniform integers */
    uint64_t bits = x & 0x7ff, nbits = 11;
    uint64_t e = 1022;
    do {
        if (bits & 1) break; /* 1-bit uniform integer */
        bits >>= 1;
        if (--nbits == 0) bits = rand_u64(), nbits = 64;
        e -= 1;
    } while (e > 1022-75);
    x = (((x >> 11) + 1) >> 1) + (e << 52);
    memcpy(&d, &x, sizeof(d));
    return d;
}
```

Finally, the loop can be eliminated using bit-counting intrinsics:

```
double rand_between_zero_and_one() {
    double d;
    uint64_t x = rand_u64();
    uint64_t e = __builtin_ctzll(x) - 11ull;
    if ((int64_t)e >= 0) e = __builtin_ctzll(rand_u64());
    x = (((x >> 11) + 1) >> 1) - ((e - 1011ull) << 52);
    memcpy(&d, &x, sizeof(d));
    return d;
}
```

Or in Go rather than C:

```
// toFloat64 -> (0,1]
func toFloat64(seed *uint64) float64 {
    x := splitmix64(seed)
    e := bits.TrailingZeros64(x) - 11
    if e >= 0 {
        e = bits.TrailingZeros64(splitmix64(seed))
    }
    x = (((x >> 11) + 1) >> 1) - ((uint64(int64(e)) - 1011) << 52)
    return math.Float64frombits(x)
}
```

With that exposition done, I can now present my criteria for assessing the quality of functions that claim to generate uniformly-distributed IEEE-754 double-precision floating-point values over (0, 1] or [0, 1] or [0, 1):

- What is the probability of seeing exactly zero? The rounding basin of 0 is only 2^{-1075} wide, so the probability should be as close as possible to this. Note that probability zero (i.e. impossible) is *extremely* close to this.
- What is the smallest non-zero value that can be seen? Smaller is better.
- What is the largest double-precision value less than 1 that *cannot* be seen? Smaller is better (and negative is best).
- How many distinct double-precision values can be seen? Larger is better.

Criteria | Most functions | Presented here | Ideal |
---|---|---|---|
Probability of zero | 2^{-52} or 2^{-53} | 0 | 2^{-1075} |
Smallest non-zero | 2^{-52} or 2^{-53} | 2^{-76} | 2^{-1074} |
Largest non-seeable | 1 - 2^{-53} or 0.5 - 2^{-54} | 2^{-76} - 2^{-129} | -0 |
Distinct values | 2^{52} or 2^{53} | 2^{58.2} | 2^{62}±1 |
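As a sanity check on the 2^{58.2} figure: the scheme presented here draws from 76 binades (from [0.5, 1] down to [2^{-76}, 2^{-75}]), each containing 2^{52}+1 doubles, with adjacent binades sharing an endpoint, so the distinct-value count works out as:

```latex
\log_2\bigl(76 \cdot 2^{52} + 1\bigr) \approx 52 + \log_2 76 \approx 58.2
```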

If you've read this far, then you might be interested in similar commentary from Taylor R. Campbell or Allen B. Downey or @moon_chilled.