Pseudo-random number: methods of obtaining, advantages and disadvantages

Anonim

A pseudo-random number is a number produced by a deterministic algorithm rather than by a physical random process. A pseudo-random number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate those of sequences of random numbers. The generated sequence is not truly random, because it is completely determined by an initial value called the PRNG's seed, which may itself include truly random values. Although sequences that are closer to truly random can be generated using hardware random number generators, pseudo-random number generators are important in practice for their speed of number generation and their reproducibility.

Number randomization

Application

PRNGs are central to applications such as simulation (e.g. for the Monte Carlo method), electronic games (e.g. for procedural generation), and cryptography. Cryptographic applications require that the output not be predictable from earlier outputs, so more elaborate algorithms are needed that do not inherit the linearity of simpler PRNGs.

Requirements

Good statistical properties are a central requirement for the output of a PRNG. In general, careful mathematical analysis is needed to be confident that a PRNG generates numbers that are close enough to random to suit the intended use.

John von Neumann warned against mistaking a PRNG for a truly random generator and joked that "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."

Use

A PRNG can be started from an arbitrary initial state, and it will always generate the same sequence when initialized with that state. The period of a PRNG is defined as the maximum, over all initial states, of the length of the repetition-free prefix of the sequence. The period is bounded by the number of states, which is usually measured in bits. Because the period length potentially doubles with each bit of state added, it is easy to build PRNGs with periods large enough for many practical applications.
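
The reproducibility described above can be sketched with Python's standard `random` module. This is only an illustration: the seed values 12345 and 54321 are arbitrary choices, not values from the text.

```python
import random

# Two generators initialized with the same state (seed).
first = random.Random(12345)
second = random.Random(12345)

run_a = [first.randrange(100) for _ in range(5)]
run_b = [second.randrange(100) for _ in range(5)]

# Same seed, same sequence: this is what makes a PRNG run reproducible.
print(run_a == run_b)

# A generator with a different seed follows its own, equally fixed, sequence.
third = random.Random(54321)
run_c = [third.randrange(100) for _ in range(5)]
print(run_c)
```

Re-running the script gives the same lists every time, which is why simulations usually record the seed they used.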

Large randomization plots

If the internal state of a PRNG contains n bits, its period can be no longer than 2^n results, and may be much shorter. For some PRNGs, the period length can be calculated without walking through the whole period. Linear feedback shift registers (LFSRs) are usually chosen to have periods of exactly 2^n − 1.
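
As a small illustration of a maximal-length LFSR, the sketch below steps a 4-bit register and counts how many steps it takes to return to its starting state. The feedback polynomial x^4 + x^3 + 1 and the function names are choices made for this example, not details from the text.

```python
def lfsr4_step(state):
    """Advance a 4-bit Fibonacci LFSR (feedback polynomial x^4 + x^3 + 1)."""
    feedback = (state ^ (state >> 1)) & 1      # XOR of the two tapped bits
    return (state >> 1) | (feedback << 3)      # shift right, insert feedback on top


def lfsr4_period(start=0b0001):
    """Count the steps until the register returns to its starting state."""
    state = lfsr4_step(start)
    steps = 1
    while state != start:
        state = lfsr4_step(state)
        steps += 1
    return steps


print(lfsr4_period())  # 15, i.e. 2^4 - 1
```

The all-zero state is excluded because it maps to itself, which is why the maximum is 2^n − 1 rather than 2^n.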

Linear congruential generators have periods that can be calculated by factoring. Although a PRNG will repeat its results after it reaches the end of its period, a repeated result does not imply that the end of the period has been reached, since the internal state may be larger than the output; this is particularly obvious for PRNGs with a one-bit output.
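
For a toy linear congruential generator the period can simply be measured by brute force. The sketch below assumes the usual form x -> (a * x + c) mod m; the tiny parameter sets are illustrative only and are not taken from any real library.

```python
def lcg_cycle_length(a, c, m, seed):
    """Length of the cycle that x -> (a * x + c) % m eventually enters."""
    seen = {}              # state -> step at which it first appeared
    x = seed
    step = 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]


# A parameter choice that visits all 16 states before repeating.
print(lcg_cycle_length(a=5, c=3, m=16, seed=0))  # 16

# A poor choice with the same modulus: the cycle has only 4 states.
print(lcg_cycle_length(a=3, c=0, m=16, seed=1))  # 4
```

With realistic moduli such as 2^32 or 2^48 this brute-force walk is impractical, which is where the analytic approach mentioned above becomes useful.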

Possible errors

The defects exhibited by flawed PRNGs range from unnoticeable (and unknown) to very obvious. An example is the RANDU random number algorithm, which was used on mainframe computers for decades. It was seriously flawed, but its inadequacy went undetected for a long time.
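
RANDU's flaw can be shown in a few lines. The text does not give the formula; the sketch below uses the commonly documented definition x -> 65539 * x mod 2^31 (with an odd seed) and checks the well-known consequence that every value is an exact linear combination of the two before it.

```python
RANDU_MULTIPLIER = 65539        # 2**16 + 3
RANDU_MODULUS = 2 ** 31


def randu(seed, count):
    """Return `count` successive RANDU values starting from an odd seed."""
    values = []
    x = seed
    for _ in range(count):
        x = (RANDU_MULTIPLIER * x) % RANDU_MODULUS
        values.append(x)
    return values


values = randu(seed=1, count=1000)

# Each value equals 6 * previous - 9 * the one before that (mod 2**31).
defect_holds = all(
    values[k + 2] == (6 * values[k + 1] - 9 * values[k]) % RANDU_MODULUS
    for k in range(len(values) - 2)
)
print(defect_holds)  # True
```

Because of this relation, consecutive triples of RANDU outputs fall on a small number of planes in three dimensions, a structure that a statistically sound generator should not show.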

The operation of the number generator

In many fields, research that relied on random selection, Monte Carlo simulations, or other PRNG-based methods was much less reliable than it could have been, as a result of using poor-quality PRNGs. Even today, caution is sometimes required, as illustrated by a warning in the International Encyclopedia of Statistical Science (2010).

Successful case study

As an illustration, consider the widely used programming language Java. As of 2017, Java still relied on a linear congruential generator (LCG) for its PRNG.
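
To make the Java example concrete, here is a rough Python re-implementation of the LCG step that the `java.util.Random` documentation describes. The constants (multiplier 0x5DEECE66D, increment 0xB, 48-bit state) come from that documentation rather than from this article, and the class name `JavaLcg` is hypothetical.

```python
MULTIPLIER = 0x5DEECE66D
INCREMENT = 0xB
MASK48 = (1 << 48) - 1          # the state is kept to 48 bits


class JavaLcg:
    """Sketch of the linear congruential step documented for java.util.Random."""

    def __init__(self, seed):
        # The documented constructor scrambles the seed with the multiplier.
        self.state = (seed ^ MULTIPLIER) & MASK48

    def next_bits(self, bits):
        self.state = (self.state * MULTIPLIER + INCREMENT) & MASK48
        value = self.state >> (48 - bits)
        # Reinterpret as a signed 32-bit integer, as Java's int would be.
        if value >= 1 << 31:
            value -= 1 << 32
        return value

    def next_int(self):
        return self.next_bits(32)


print(JavaLcg(0).next_int())  # -1155484576
```

Only the upper bits of the 48-bit state are returned, because the low-order bits of a power-of-two-modulus LCG have very short periods.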

History

The first PRNG to avoid serious problems while still running fairly quickly was the Mersenne Twister (discussed below), which was published in 1998. Since then, other high-quality PRNGs have been developed.

Generation Description

The history of pseudo-random numbers does not end there. In the second half of the 20th century, the standard class of algorithms used for PRNGs was the linear congruential generator. The quality of LCGs was known to be inadequate, but better methods were not available. Press et al. (2007) described the result as follows: "If all scientific papers whose results are in doubt because of [LCGs and related] were to disappear from library shelves, there would be a gap on each shelf about as big as your fist."

A major advance in the construction of pseudo-random generators was the introduction of techniques based on linear recurrences over the two-element field; such generators are related to linear feedback shift registers. They became the basis for later pseudo-random number generators.

In particular, the 1997 invention of the Mersenne Twister avoided many of the problems of earlier generators. The Mersenne Twister has a period of 2^19937 − 1 iterations (≈ 4.3 × 10^6001). It has been proven to be equidistributed in (up to) 623 dimensions (for 32-bit values), and at the time of its introduction it was faster than other statistically sound generators.
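
The size of the Mersenne Twister period, 2^19937 − 1, is easy to sanity-check with arbitrary-precision integers. The sketch below is only arithmetic; the 624-word, 32-bit state layout it mentions comes from the published Mersenne Twister design, not from this article.

```python
import math

period = 2 ** 19937 - 1

# Split log10(period) into a decimal exponent and a leading factor.
log10_period = math.log10(period)
exponent = math.floor(log10_period)
leading = 10 ** (log10_period - exponent)

print(exponent)            # 6001
print(round(leading, 1))   # 4.3

# The exponent 19937 matches a state of 624 32-bit words with 31 bits unused.
state_bits = 624 * 32 - 31
print(state_bits)          # 19937
```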

In 2003, George Marsaglia introduced the family of xorshift generators, which are also based on a linear recurrence. These generators are extremely fast and, when combined with a non-linear operation, pass rigorous statistical tests.

In 2006, the WELL family of generators was developed. WELL generators in some ways improve on the quality of the Mersenne Twister, which has an overly large state space and recovers very slowly from states that contain a large number of zeros.
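
A minimal sketch of one xorshift step is shown below, using the 32-bit variant with the shift triple (13, 17, 5) from Marsaglia's paper. It shows only the linear core; the non-linear output step mentioned above is not included, and the function name is chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(state):
    """One step of a 32-bit xorshift generator; state must be non-zero."""
    state ^= (state << 13) & MASK32    # keep the result within 32 bits
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


state = 1
state = xorshift32(state)
print(state)  # 270369
```

Each step is three shifts and three XORs, which is why these generators are so fast. A zero state must be avoided because it maps to itself.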

Characterization of random numbers

Cryptography

A PRNG suitable for cryptographic applications is called a cryptographically secure PRNG (CSPRNG). The requirement for a CSPRNG is that an adversary who does not know the seed has only a negligible advantage in distinguishing the generator's output sequence from a random sequence. In other words, while a PRNG is only required to pass certain statistical tests, a CSPRNG must pass all statistical tests that run in time polynomial in the size of the seed.

Although proving this property is beyond the current state of computational complexity theory, strong evidence can be provided by showing that breaking the CSPRNG would solve a problem that is assumed to be hard, such as integer factorization. In general, years of review may be required before an algorithm can be certified as a CSPRNG.
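
In practice, application code usually asks the operating system's CSPRNG for values rather than implementing one. A short sketch using Python's standard `secrets` module follows; the sizes and variable names are arbitrary choices for this example.

```python
import secrets

# 16 unpredictable bytes, e.g. for use as a key or nonce.
key = secrets.token_bytes(16)

# A uniformly chosen integer in the range 1..6.
roll = secrets.randbelow(6) + 1

# A hex string built from 8 random bytes (16 hex characters).
token = secrets.token_hex(8)

print(len(key), roll, len(token))
```

Unlike a seeded general-purpose PRNG, these calls are deliberately not reproducible.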

It has been shown that it is likely that the NSA inserted an asymmetric back door into the NIST-certified Dual_EC_DRBG pseudo-random number generator.

BBS generator

Pseudo-random algorithms

Most PRNG algorithms produce sequences that are uniformly distributed according to any of several tests. It is an open question, and one central to the theory and practice of cryptography, whether there is any way to distinguish the output of a high-quality PRNG from a truly random sequence. In this setting, the distinguisher knows that either the known PRNG algorithm was used (but not the state with which it was initialized) or a truly random source was used, and must tell the two apart.

The security of most cryptographic algorithms and protocols that use PRNGs rests on the assumption that it is infeasible to distinguish the use of a suitable PRNG from the use of a truly random sequence. The simplest examples of this dependency are stream ciphers, which most often work by XORing the plaintext of a message with the output of a PRNG to produce the ciphertext. Designing cryptographically adequate PRNGs is extremely difficult because they must meet additional criteria. The size of its period is an important factor in the cryptographic suitability of a PRNG, but not the only one.
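
The stream-cipher idea can be sketched as follows. The keystream here is built by hashing a key together with a counter using SHA-256; that construction is an assumption made purely for illustration, and it is not a vetted cipher or a method described in the text.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key || counter, repeated until long enough."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the same keystream is its own inverse."""
    stream = keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


message = b"attack at dawn"
ciphertext = xor_with_keystream(b"example key", message)
recovered = xor_with_keystream(b"example key", ciphertext)
print(recovered == message)  # True
```

If an attacker could predict the keystream, XORing it with the ciphertext would reveal the message, which is why the generator's unpredictability carries the whole security of the scheme.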

Pseudo-random numbers

An early computer-based PRNG, proposed by John von Neumann in 1946, is known as the middle-square method. The algorithm is as follows: take any number, square it, take the middle digits of the resulting number as the "random number", then use that number as the seed for the next iteration. For example, squaring the number 1111 gives 1234321, which can be written as 01234321, an 8-digit number being the square of a 4-digit number. This gives 2343 as the "random" number. Repeating the procedure gives 4896 as the next result, and so on. Von Neumann used 10-digit numbers, but the process was the same.
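
The procedure just described can be written in a few lines. This is a sketch for 4-digit values; the function name and the `digits` parameter are choices made for the example.

```python
def middle_square(value, digits=4):
    """One middle-square step: square, zero-pad to 2*digits, keep the middle."""
    squared = str(value * value).zfill(2 * digits)
    start = digits // 2
    return int(squared[start:start + digits])


print(middle_square(1111))  # 2343
print(middle_square(2343))  # 4896
```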

Disadvantages of the "middle square"

The problem with the "middle square" method is that all sequences eventually repeat themselves, some very quickly, such as "0000". Von Neumann was aware of this, but he found the approach sufficient for his purposes and was worried that mathematical "fixes" would simply hide errors rather than remove them.
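
How quickly a 4-digit middle-square sequence starts repeating can be checked directly. The helper below counts the distinct values produced before the first repeat; the seeds are illustrative picks, chosen because some of them map straight back to themselves.

```python
def middle_square(value, digits=4):
    """One middle-square step: square, zero-pad to 2*digits, keep the middle."""
    squared = str(value * value).zfill(2 * digits)
    start = digits // 2
    return int(squared[start:start + digits])


def values_before_repeat(seed, digits=4):
    """Number of distinct values generated before the sequence repeats."""
    seen = set()
    value = seed
    while value not in seen:
        seen.add(value)
        value = middle_square(value, digits)
    return len(seen)


print(values_before_repeat(0))     # 1: 0000 maps to itself
print(values_before_repeat(2500))  # 1: 2500 squared is 06250000
print(values_before_repeat(3792))  # 1: 3792 squared is 14379264
print(values_before_repeat(1111))
```

With only 10,000 possible 4-digit states, no seed can produce more than 10,000 values before repeating, and many seeds do far worse.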

The essence of the generator

Von Neumann judged hardware random number generators unsuitable: if they did not record the output they generated, they could not later be checked for errors. If they did record their output, they would exhaust the limited computer memory then available, and with it the computer's ability to read and write numbers. If the numbers were written to cards, they would take much longer to write and read. On the ENIAC computer he was using, the "middle square" method generated pseudo-random numbers several hundred times faster than reading numbers in from punched cards.

The middle-square method has since been superseded by more elaborate generators.

Innovative method

A recent innovation is to combine the middle-square method with a Weyl sequence. This approach produces high-quality output over a long period.
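
One published form of this idea is Bernard Widynski's Middle Square Weyl Sequence generator. The sketch below ports its three-line update to Python with explicit 64-bit masking; the constant `s` is the example value from that publication, not something stated in this article, and the class name is hypothetical.

```python
MASK64 = (1 << 64) - 1


class MiddleSquareWeyl:
    """Sketch of a middle-square generator driven by a Weyl sequence."""

    def __init__(self, s=0xB5AD4ECEDA1CE2A9):
        self.x = 0      # value that gets squared
        self.w = 0      # Weyl sequence: grows by s on every step
        self.s = s      # odd 64-bit constant

    def next(self):
        self.x = (self.x * self.x) & MASK64
        self.w = (self.w + self.s) & MASK64
        self.x = (self.x + self.w) & MASK64
        # Swapping the 32-bit halves brings the "middle" of the square forward.
        self.x = ((self.x >> 32) | (self.x << 32)) & MASK64
        return self.x & 0xFFFFFFFF


gen = MiddleSquareWeyl()
print(hex(gen.next()))  # 0xb5ad4ece
```

Adding the Weyl sequence on every step prevents the state from collapsing to zero or falling into the short cycles that affect the plain middle-square method.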
