Using random numbers for simulations: Random.nextGaussian()
The random numbers that we've been generating so far have all
had a uniform distribution. In other words, if we call, say, nextInt(1000),
every integer between 0 and 999 inclusive will have a roughly equal chance of being
returned. In many cases, this is precisely what we want. But in some cases,
we want to use a random number generator to simulate an event where the possible values/outcomes
don't have equal probability of occurring. This situation crops up frequently
in testing and simulation applications.
For example, imagine we have some server program that receives packets (byte arrays)
over the network and passes them off to different threads, which stick them together into
a "command" and execute it. Networks being what they are, we don't always get a whole "command"
in one go. So, threads will sometimes be woken up with a packet, add it
to their queue, then go back to sleep; at other times, they'll be woken up with the "last"
packet in their queue and will execute the function in question, thus hogging a processor for
longer. In other words, we have an essentially "complex" situation with some randomness
in what happens. Now, our problem is that we want to find out how our server will scale
in the future: say, if it received twice or ten times the volume of packets per second.
One way we can do this is to run a simulation: we write a method that,
with some frequency that we want to test, makes
up random packets of data and injects them into our server's receivePacket()
method. We know
that in real life, the packets are "random" in size and arrive at "random" intervals,
so we also want to simulate this randomness. As
a first attempt, we could measure the average length of a packet and average number of milliseconds
between packets as they occur in real life. Then we can write a simulation that picks
random numbers centred on these averages, but (say) halves the average number of milliseconds
between packets.
To pick the random numbers, we could use nextInt().
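As a rough sketch of this first attempt, the code below picks gaps between packets with
nextInt(). The class name, the method name and the figure of 250 milliseconds (a measured
average of 500, halved) are assumptions made purely for illustration, and the sketch only
prints the gaps rather than calling the server's receivePacket() method.

```java
import java.util.Random;

class UniformGapSketch {
    // Hypothetical helper: returns a gap in milliseconds chosen with equal
    // probability from 0 to (2 * averageMs - 1), so that the long-run
    // average is close to averageMs.
    static int nextUniformGap(Random rnd, int averageMs) {
        return rnd.nextInt(2 * averageMs);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int halvedAverageMs = 250; // assumed: measured average of 500 ms, halved
        for (int i = 0; i < 5; i++) {
            System.out.println(nextUniformGap(rnd, halvedAverageMs) + " ms until next packet");
        }
    }
}
```

Every gap from 0 to 499 milliseconds is equally likely here, so the average comes out as
intended but nothing favours values close to it.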
The main problem with this approach is that nextInt() does not simulate
how values differ from the average. For example, we might measure the average
number of milliseconds between packets to be around 500. But calling nextInt(1000) to
simulate this average duration is unrealistic. The sequences (100, 800, 200, 900) and
(450, 550, 580, 420) both have an average duration of 500; but common sense (or measurement
of the network behaviour) tells us that the second sequence is much more likely in practice:
observed durations tend to cluster around the average. In other words, durations
don't occur with equal likelihood.
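One way to put a number on this difference is the standard deviation, which measures how far
values typically lie from their average. The sketch below computes it for the two sequences
above; the class and method names are made up for illustration.

```java
class SpreadSketch {
    // Arithmetic mean of the values.
    static double mean(int[] values) {
        double sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Population standard deviation: square root of the mean squared
    // distance from the average.
    static double stdDev(int[] values) {
        double m = mean(values);
        double sumOfSquares = 0;
        for (int v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    public static void main(String[] args) {
        int[] spreadOut = {100, 800, 200, 900};
        int[] clustered = {450, 550, 580, 420};
        System.out.printf("spread out: mean %.1f, std dev %.1f%n", mean(spreadOut), stdDev(spreadOut));
        System.out.printf("clustered:  mean %.1f, std dev %.1f%n", mean(clustered), stdDev(clustered));
    }
}
```

Both sequences have a mean of 500, but the standard deviation is roughly 354 for the first and
roughly 67 for the second. A realistic simulation needs to reproduce the spread as well as the
average.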
This is where the Random class's nextGaussian method comes in. It generates
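random numbers that "cluster naturally" around an average. Mathematically, it creates
random numbers with a normal (Gaussian) distribution: each call returns a double drawn from
a distribution with a mean of 0 and a standard deviation of 1, which we can then scale and
shift to match our own measurements.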
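Since nextGaussian() returns values with a mean of 0 and a standard deviation of 1, a common
way to use it is to multiply the result by the standard deviation we want and add the average
we want. The sketch below does this for packet gaps. The class and method names are
hypothetical, and the standard deviation of 50 milliseconds is an assumed figure, not a
measurement; in practice it would be measured from real traffic.

```java
import java.util.Random;

class GaussianGapSketch {
    // Hypothetical helper: returns a gap in milliseconds that clusters
    // around meanMs with the given standard deviation.
    static long nextGaussianGap(Random rnd, double meanMs, double stdDevMs) {
        // nextGaussian() has mean 0 and standard deviation 1, so scale, then shift.
        double gap = meanMs + stdDevMs * rnd.nextGaussian();
        // A normal distribution has no lower bound, so clamp the rare
        // negative value to zero; a gap cannot be negative.
        return Math.max(0L, Math.round(gap));
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        double meanMs = 500;   // average gap from the example above
        double stdDevMs = 50;  // assumed spread, for illustration only
        for (int i = 0; i < 5; i++) {
            System.out.println(nextGaussianGap(rnd, meanMs, stdDevMs) + " ms until next packet");
        }
    }
}
```

With these figures, roughly two thirds of the gaps fall between 450 and 550 milliseconds, which
is much closer to the clustered sequence than nextInt(1000) would produce. Note that clamping at
zero distorts the distribution slightly if the standard deviation is large relative to the mean.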