Random numbers are a sequence of numbers chosen such that the occurrence of any number has the probability specified by some distribution; such a sequence is a realization of a random variable following that distribution. When a large number of these values is generated, the empirical frequencies reproduce the underlying distribution. Random numbers are very important in simulating physical situations. The numbers generated by computers, however, are not truly random: they are produced by a deterministic procedure, and are therefore called pseudo random numbers. Generating true random numbers is not an easy task and takes much time. Moreover, we often need to regenerate exactly the same numbers again and again to check whether the model or situation we developed behaves as expected.
The general structure of a PRNG consists of two functions. There is a finite set of states \(S\), a transition function \(f:S\rightarrow S\) and an output function \(g:S \rightarrow U\) where \(U =(0,1)\). The generator starts with an initial state \(s_0\), called the seed, which is normally supplied by the user. The seed is very important for the reproducibility of a random number sequence: the sequence generated by a PRNG is the same for the same seed, no matter how many times or when it is repeated. The random numbers are then generated by defining
\[\begin{eqnarray} s_n=f(s_{n-1}), \; u_n=g(s_n),\; n=1,2,3,\cdots \end{eqnarray}\] Since the state space is finite, the sequence will start repeating after some time. The smallest integer \(p\) such that the random number series starts repeating after every \(p\) steps is called the period of the generator. Obviously, longer periods are better than short ones. The following are some of the commonly used PRNGs.
The general form of a linear congruential generator is given by \[\begin{eqnarray} s_n=(a\, s_{n-1}+c) \mod m, \; n=1,2,3,\cdots \end{eqnarray}\] where \(a\) (the multiplier) and \(c\) (the increment) are constants. The state space \(S\) is given by \(\{0,1,2,\cdots,m-1\}\). The output function is given by \[\begin{equation} u_n=\frac{s_n}{m}, \; n=1,2,3,\cdots \end{equation}\] The value we choose to trigger the generation, \(s_0\), is called the seed. The choice of \(a\) and \(c\) decides the quality of the random numbers generated, and different methods use different values of \(a\) and \(c\).
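The recursion can be sketched directly in Python. The constants below (\(a=1664525\), \(c=1013904223\), \(m=2^{32}\)) are one well-known choice, used here only for illustration; any other valid choice of \(a\), \(c\) and \(m\) fits the same skeleton.

```python
def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
    """Generate n uniform(0,1) pseudo random numbers with a linear congruential generator."""
    s = seed
    u = []
    for _ in range(n):
        s = (a * s + c) % m   # state update: s_n = (a*s_{n-1} + c) mod m
        u.append(s / m)       # output: u_n = s_n / m
    return u

print(lcg(seed=1, n=5))
```

Running the generator twice with the same seed reproduces the same sequence, which illustrates the reproducibility property described above.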
The state space is given by \(\{0,1,2,\cdots,m-1\}^k\) and the transition function \(f\) is defined by \[\begin{eqnarray} s_n= (a_1\,s_{n-1}+a_2\,s_{n-2}+\cdots+a_k\,s_{n-k}) \mod m \end{eqnarray}\] Then the output is given by \[\begin{equation} u_n=\frac{s_n}{m}, \; n=1,2,3,\cdots \end{equation}\]
Another way to bring more randomness into the model is to combine several random number generators. For example, the Wichmann-Hill generator combines three linear congruential generators. The state space is given by \(\{0,1,\cdots, m_1-1\} \times \{0,1,\cdots, m_2-1\} \times \{0,1,\cdots, m_3-1\}\). The state at the \(n^{th}\) step is the triplet \((x_n,y_n,z_n)\) and the components are given by \[\begin{eqnarray} x_n&=&171\;x_{n-1} \mod m_1\\ y_n&=&172\;y_{n-1} \mod m_2\\ z_n&=&170\;z_{n-1} \mod m_3 \end{eqnarray}\] The output function is given by \[\begin{equation*} u_n=\left(\frac{x_n}{m_1}+\frac{y_n}{m_2}+\frac{z_n}{m_3}\right) \mod 1 \end{equation*}\]
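A sketch of the Wichmann-Hill combination, using the moduli \(m_1=30269\), \(m_2=30307\), \(m_3=30323\) of the original AS 183 algorithm:

```python
def wichmann_hill(x, y, z, n):
    """Wichmann-Hill generator: combines three small LCGs (AS 183 constants)."""
    m1, m2, m3 = 30269, 30307, 30323
    u = []
    for _ in range(n):
        x = (171 * x) % m1
        y = (172 * y) % m2
        z = (170 * z) % m3
        u.append((x / m1 + y / m2 + z / m3) % 1.0)  # combine and wrap into (0,1)
    return u

print(wichmann_hill(1, 2, 3, 5))
```

The seed here is the triplet of initial states \((x_0, y_0, z_0)\); the same triplet always reproduces the same sequence.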
The \(n^{th}\) term in the Fibonacci series is defined as the sum of the previous two terms. The lagged Fibonacci PRNG uses the function \[\begin{eqnarray} s_n&=&(s_{n-1} \otimes s_{n-2}) \mod m\\ u_n&=& \frac{s_n}{m} \end{eqnarray}\] where \(\otimes\) denotes some binary operation such as addition, subtraction, multiplication or XOR. In general, lags other than \(1\) and \(2\) may be used, i.e., \(s_n=(s_{n-j}\otimes s_{n-k}) \mod m\) for lags \(j<k\).
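A minimal sketch of an additive lagged Fibonacci generator. The choice of addition as the operation and of the lags \((24, 55)\) is a common convention assumed here for illustration, as is seeding the initial states from a small LCG:

```python
def lagged_fibonacci(seed_state, n, j=24, k=55, m=2**32):
    """Additive lagged Fibonacci generator: s_n = (s_{n-j} + s_{n-k}) mod m."""
    s = list(seed_state)          # must hold at least k initial integers
    u = []
    for _ in range(n):
        new = (s[-j] + s[-k]) % m
        s.append(new)
        u.append(new / m)
    return u

# seed the k = 55 initial states with arbitrary integers, e.g. from an LCG
state = [(1664525 * i + 1013904223) % 2**32 for i in range(55)]
print(lagged_fibonacci(state, 5))
```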
The functions that generate random numbers according to the Blum Blum Shub PRNG are \[\begin{eqnarray} s_n&=&s_{n-1}^2 \mod m\\ u_n&=& \frac{s_n}{m} \end{eqnarray}\] where \(m=pq\) is the product of two large primes \(p\) and \(q\). The seed \(s_0\) should be an integer that is co-prime to \(m\).
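A toy sketch of the recursion; the small primes \(p=11\) and \(q=23\) (both \(\equiv 3 \pmod 4\)) are chosen only to keep the arithmetic visible, whereas real use requires very large primes:

```python
def blum_blum_shub(seed, n, p=11, q=23):
    """Blum Blum Shub sketch with toy primes; real use needs very large p and q."""
    m = p * q                     # m = 253 here
    s = seed % m                  # seed must be co-prime to m
    u = []
    for _ in range(n):
        s = (s * s) % m           # s_n = s_{n-1}^2 mod m
        u.append(s / m)
    return u

print(blum_blum_shub(seed=3, n=5))
```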
PRNGs normally generate uniform random numbers, but in real life situations we need random numbers which follow some specific statistical distribution. The following are some of the commonly used methods for generating random numbers from specific statistical distributions.
The inverse transformation method is probably the easiest method to generate non-uniform random numbers. It can be used very effectively when the distribution function of the random variable has a compact closed form. The method is based on the following result: for any continuous random variable \(X\) with distribution function \(F(\cdot)\), \(F(X) \sim U(0,1)\). To see this, let \(Y=F(X)\). The transformation \(y= F(x)\) is one-to-one, so we can use the Jacobian transformation to find the density function of \(Y\): \[\begin{eqnarray} \frac{dy}{dx}&=& \frac{dF(x)}{dx}=f_X(x)\\ f_Y(y)&=&f_X(x) \left|\frac{dx}{dy}\right|=f_X(x)\times \frac{1}{f_X(x)}=1,\; 0\leq y\leq 1 \end{eqnarray}\] Hence, if \(u\) is a standard uniform random number, \(x=F^{-1}(u)\) is a random number from the distribution with distribution function \(F(\cdot)\). Random numbers for any distribution having a compact form for the distribution function can be generated in this way.
Let \(X\) be a random variable following the uniform distribution with parameters \(a\) and \(b\), i.e., \(X \sim U(a,b)\). Then the distribution function is given by \[ F(x)=\begin{cases} 0 & x \leq a\\\frac{x-a}{b-a}, &a \leq x \leq b\\ 1 & x\geq b \end{cases}\]
Then the inverse over the range \(a\leq x\leq b\) is given by \[\begin{eqnarray*} u=F(x) &\implies& u= \frac{x-a}{b-a} \implies x-a= u(b-a) \implies x=a+u(b-a) \end{eqnarray*}\]
import numpy as np
# Generate five uniform random numbers between 0 and 1
u=np.random.rand(5)
#a=eval(input("Enter the lower limit of the uniform distribution: "))
a=5
#b=eval(input("Enter the upper limit of the uniform distribution: "))
b=10
if b>a:
    u=a+u*(b-a)
    print(u)
## [5.83579203 6.83911934 7.76779047 7.68911381 6.8919588 ]
We can compare the generated random numbers with the theoretical \(U(a,b)\) distribution, for example by comparing the relative frequencies of the generated values with the theoretical density.
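One way to sketch such a comparison: bin a large generated sample and set the empirical bin probabilities against the theoretical ones (the sample size of 10000 and the 10 bins below are arbitrary choices for illustration).

```python
import numpy as np

np.random.seed(1)                         # seed for reproducibility
a, b = 5, 10
u = a + np.random.rand(10000) * (b - a)   # U(a, b) via the inverse transform

counts, edges = np.histogram(u, bins=10, range=(a, b))
observed = counts / len(u)                        # empirical bin probabilities
expected = (edges[1:] - edges[:-1]) / (b - a)     # theoretical: 1/10 per bin

print(np.round(observed, 3))
print(np.round(expected, 3))
print("sample mean:", u.mean(), " theoretical mean:", (a + b) / 2)
```

With a large sample the empirical probabilities should all be close to \(0.1\) and the sample mean close to \((a+b)/2\).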
Let \(X\) be a random variable following the exponential distribution with mean \(\frac{1}{\lambda}\), i.e., \(f(x)=\lambda e^{-\lambda x}, x \geq 0\). Then the distribution function for \(x\geq 0\) is given by \[\begin{eqnarray*} F(x)=\int_{0}^x \lambda e^{-\lambda t} dt = \lambda \;\left[\frac{ e^{-\lambda t}}{-\lambda}\right]_0^x= 1-e^{-\lambda x} \end{eqnarray*}\] Then the inverse over the range \(x\geq 0\) is given by \[\begin{eqnarray*} u=F(x) &\implies& u= 1-e^{-\lambda x}\implies e^{-\lambda x}= 1-u \\&\implies &-\lambda x= \log(1-u) \implies x= \frac{- \log(1-u)}{\lambda} \end{eqnarray*}\] If \(U\sim U(0,1)\), then \(1-U\) also follows \(U(0,1)\); hence the inverse can be rewritten as \(x= \frac{- \log u}{\lambda}\).
import numpy as np
import matplotlib.pyplot as plt
u=np.random.rand(5)
#lamb=eval(input("Enter the rate of the exponential distribution: "))
lamb=3.3
exp=(-1)*np.log(u)/lamb
print(exp)
## [0.19065244 0.49804398 0.04211192 0.56063118 0.80668344]
In the case of a discrete random variable, we do not have the basic result that the distribution function follows the uniform distribution. Hence we use the following algorithm to generate random numbers for discrete distributions.
Let \(X\) be a discrete random variable with pdf given by \[ f(x)=\begin{cases} 0.2 & \text{ when } x=0\\ 0.3 & \text{ when } x=2 \\ 0.4 & \text{ when } x=6\\ 0.1 & \text{ when } x=30 \end{cases}\] The distribution function corresponding to the random variable is given as \[ F(x)=\begin{cases} 0 & \text{ when } x<0\\ 0.2 & \text{ when } 0 \leq x<2\\ 0.5 & \text{ when } 2 \leq x<6 \\ 0.9 & \text{ when } 6 \leq x<30\\ 1 & \text{ when } x\geq 30 \end{cases}\] Now we generate a uniform random number, say 0.33896. The generated random number is the value of the random variable at which the distribution function first crosses the uniform value; here \(F(x)\) first exceeds 0.33896 at \(x=2\), so the generated random number is 2.
import numpy as np
x=[0,2,6,30] # Possible values of the original random variable
pdf=[0.2,0.3,0.4,0.1]
# pdf of the original random variable
rn=[] # list for the RNs of the objective distribution
if len(x)==len(pdf) and abs(sum(pdf)-1)<1e-9: # tolerance allows for floating point error
    df=[pdf[0]]
    for i in range(1,len(pdf)):
        df.append(df[i-1]+pdf[i])
    # df will contain the distribution function of the objective distribution
    u=np.random.rand(5) # standard uniform random numbers
    for i in range(0,len(u)):
        if u[i]<=df[0]:
            rn.append(x[0])
        else:
            for j in range(0,len(df)-1):
                if u[i]>df[j] and u[i]<=df[j+1]:
                    rn.append(x[j+1])
                    break
    print("x=", x,"\n Distribution function=",df)
    print("Uniform RN =",end=" ")
    for k in u:
        print("{:6.4f}".format(k),end=",")
    print("\nObjective RNs=",rn)
else:
    print("Invalid pdf or invalid length for pdf")
## x= [0, 2, 6, 30]
## Distribution function= [0.2, 0.5, 0.9, 1.0]
## Uniform RN = 0.9441,0.6951,0.5671,0.2386,0.3045,
## Objective RNs= [30, 6, 6, 2, 2]
The pdf of the Bernoulli distribution is given by \[ f(x)=\begin{cases} 1-p & \text{ when } x=0\\ p & \text{ when } x=1 \end{cases}\] The distribution function corresponding to the random variable is given as \[ F(x)=\begin{cases} 0 & \text{ when } x<0\\ 1-p & \text{ when } 0 \leq x<1\\ 1 & \text{ when } x\geq 1 \end{cases}\]
Now generate the uniform random numbers \(u_i\). Then the Bernoulli random numbers are given by \(x_i=\begin{cases} 0 & \text{ if }u_i \leq 1-p\\ 1 & \text{ if }1-p<u_i \leq 1 \end{cases}\) The Python program to generate Bernoulli random numbers is given below.
import numpy as np
u=np.random.rand(5)
rn=[]
#p=eval(input("Enter the probability of success: "))
p=0.38
if 0<=p<=1:
    for i in u:
        if i<=1-p:
            rn.append(0)
        else:
            rn.append(1)
    print("Uniform RN =", end=" ")
    for k in u:
        print("{:6.4f}".format(k), end=",")
    print("\nObjective RNs=", rn)
else:
    print("Enter valid probability of success")
## Uniform RN = 0.8564,0.8088,0.1534,0.6661,0.3866,
## Objective RNs= [1, 1, 0, 1, 0]
Consider \(G(x) =1-e^{-\lambda x}\). Then \[\begin{eqnarray*} G(x+1)-G(x)&=&\left[ 1- e^{- \lambda (x+1)}\right]- \left[1- e^{- \lambda x}\right] = e^{-\lambda x} - e^{-\lambda (x+1)}\\ &=& e^{-\lambda x} \left[1- e^{-\lambda}\right]\end{eqnarray*}\] The RHS can be written in the form \((1-p)^x\;p\) where \(p = 1- e^{-\lambda}\), which is the pdf of the geometric distribution. Hence \(G(x)\), restricted to integer values of \(x\), is the distribution function of a geometric random variable with parameter \(p=1- e^{-\lambda}\). Since a distribution function lies between \(0\) and \(1\), we can equate it to a standard uniform random number \(u\): \[u = 1-e^{-\lambda x} \implies e^{-\lambda x} = 1-u \implies x=\frac{-\log (1-u)}{\lambda}\] and since \(1-U \sim U(0,1)\), we may use \(x=\frac{-\log u}{\lambda}\).
Since a geometric random variable takes integer values only and we are dealing with the distribution function, we round the value up to the next integer. Hence \(\left\lceil\frac{ -\log u }{\lambda}\right\rceil\) follows the geometric distribution with parameter \(p=1-e^{-\lambda}\). Since \(p=1-e^{-\lambda} \implies e^{-\lambda}=1-p \implies \lambda = -\log(1-p)\), this can be rewritten as \(\left\lceil\frac{ \log u }{\log (1-p)}\right\rceil\), which follows the geometric distribution with parameter \(p\).
import numpy as np
#p=eval(input("Enter the probability of success: "))
p=0.3
if 0<p<1:
    u=np.random.rand(5)
    geo=np.ceil(np.log(u)/np.log(1-p))
    print(geo)
else:
    print("Enter a valid probability!")
## [1. 1. 2. 5. 3.]
This method is applied when a random variable can be expressed as a sum of independent random variables for which random numbers can be generated. Let \(X\) be a random variable which can be expressed as \(X=Y_1+Y_2+\cdots+Y_n\), where random numbers of each \(Y_i, i=1,2,\cdots, n\) can be generated. The steps in the method are as follows:

1. Generate the required uniform random numbers
2. Generate independent random numbers of the random variables \(Y_i,i=1,2,\cdots, n\)
3. Obtain the random numbers of \(X\) as \(x=y_1+y_2+\cdots+y_n\)
The Erlang distribution is defined as the sum of an integer number of independent exponential random variables; equivalently, it is the Gamma distribution with an integer shape parameter. Let \(X\) be a random variable following the Erlang distribution with shape parameter \(k\) and rate parameter \(\lambda\), with density \(f(x) =\frac{\lambda^k}{\Gamma(k)}\; x^{k-1}\, e^{-\lambda x},\; x\geq 0\). Random numbers from the Erlang distribution can therefore be generated by summing \(k\) independent exponential random numbers with rate \(\lambda\).
The Python code for generating Erlang random numbers is as follows.
import numpy as np
#lamb=eval(input("Enter the value of the rate parameter: "))
#shape=eval(input("Enter the value of the shape parameter: "))
lamb=4
shape=6
if shape==int(shape):
    u=np.random.rand(5,shape)
    # we are generating 5 rows of 6 uniform random numbers
    exp=(-1)*np.log(u)/lamb
    # exp will be 5 rows of 6 exponential random numbers
    erla=exp.sum(axis=1)
    # sum is taken over the columns
    # axis=0 can be used to take the sum over rows
    print(erla)
## [0.85805126 1.2477499 1.41597485 1.09134391 2.87800036]
We know that a binomial random variable is the sum of independent Bernoulli random variables. Hence we use Bernoulli random numbers to generate binomial random numbers.
import numpy as np
#p=eval(input("Enter the value of the probability of success: "))
#n=eval(input("Enter the number of repetitions of the Bernoulli experiment: "))
#N=eval(input("Enter the number of random numbers to be generated: "))
N=5
p=0.6
n=10
ber=np.empty(shape=(N,n))
if 0<p<1 and n==int(n):
    u=np.random.rand(N,n)
    for i in range(0,N):
        for j in range(0,n):
            if u[i,j]<=1-p:
                ber[i,j]=0
            else:
                ber[i,j]=1
    bino=ber.sum(axis=1)
    print(bino)
## [8. 3. 4. 6. 8.]
Let \(X\) be the random variable with density \(f(\cdot)\) for which we want to generate random numbers. In order to generate random numbers using the acceptance rejection method, we consider another random variable \(Y\) with density function \(g(\cdot)\) such that \(f(x) \leq c\,g(x) \;\forall \; x \in \mathbf{R}\) for some constant \(c\). The density \(g(\cdot)\) is selected in such a way that its random numbers can be generated easily. The acceptance rejection algorithm for the generation of random numbers is as follows:

1. Generate a random number \(y\) from the density \(g(\cdot)\)
2. Generate a standard uniform random number \(u\)
3. If \(u \leq \frac{f(y)}{c\,g(y)}\), accept \(y\) as a random number of \(X\); otherwise reject it and return to step 1
Theorem 1: The random variable \(X\) generated by this algorithm has density \(f(\cdot)\). \[\begin{eqnarray*} P(X\leq x)&=& P(Y\leq x \vert Y \text{ is accepted}) \\ P(Y\leq x, Y \text{ is accepted}) &=& P\left(Y\leq x, U \leq \frac{f(Y)}{c\,g(Y)}\right) \\ &=& \int_{-\infty}^{x} \int_{0}^{f(y)/c\,g(y)}1\, du\; g(y)\, dy \\&=& \int_{-\infty}^x \frac{f(y)}{c\,g(y)}\, g(y)\, dy= \frac{1}{c} \int_{-\infty}^x f(y)\,dy\\ P(Y \text{ is accepted}) &=& P\left(U \leq \frac{f(Y)}{c\,g(Y)}\right) = \int_{-\infty}^{\infty} \int_{0}^{f(y)/c\,g(y)}1\, du\; g(y)\, dy \\&=& \int_{-\infty}^{\infty} \frac{f(y)}{c\,g(y)}\, g(y)\, dy= \frac{1}{c} \int_{-\infty}^{\infty} f(y)\,dy = \frac{1}{c} \\ \therefore P(X\leq x)&=& \dfrac{\frac{1}{c} \int_{-\infty}^x f(y)\,dy}{\frac{1}{c}}= \int_{-\infty}^x f(y)\,dy \end{eqnarray*}\] Hence the density function of \(X\) is \(f(\cdot)\).
The following points are worth noting.
Since \(f(\cdot)\) and \(g(\cdot)\) are proper density functions, total area under both curves will be one.
\[\begin{eqnarray*} f(x)\leq c g(x) &\implies& \int_{-\infty}^{\infty} f(x) dx \leq \int_{-\infty}^{\infty} c g(x) dx\\ &\implies& \int_{-\infty}^{\infty} f(x) dx \leq c \int_{-\infty}^{\infty} g(x) dx\\ &\implies& 1 \leq c \times 1 \implies c\geq 1 \end{eqnarray*}\]
Every random number generated for the random variable \(Y\) need not be accepted as a random number for \(X\); a random number of \(Y\) is accepted as a random number for \(X\) with probability \(\frac{1}{c}\).
The number of trials required to get the first random number following the density function \(f(\cdot)\) is geometrically distributed with probability \(p=\frac{1}{c}\).
The number of random numbers generated for the random variable \(Y\) in order to get \(n\) random numbers of the random variable \(X\) is a negative binomial random variable.
One of the factors which affects the efficiency of the method is \(P(\text{Acceptance})=\frac{1}{c}\).
Efficiency is maximum when \(c\) is close to \(1\). Hence the selection of \(g(\cdot)\) is important.
Another factor which affects the efficiency is the speed of generating \(Y\) from \(g(\cdot)\).
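The acceptance probability \(\frac{1}{c}\) can be checked empirically. The target below, \(f(x)=6x(1-x)\) (a Beta(2,2) density) with uniform proposals and \(c=\max f = 1.5\), is an example of our own choosing, not one from the text:

```python
import numpy as np

np.random.seed(0)
c = 1.5                                   # max of f(x) = 6x(1-x), attained at x = 1/2
N = 100000
y = np.random.rand(N)                     # proposals from g = U(0,1)
u = np.random.rand(N)
accepted = u <= 6 * y * (1 - y) / c       # accept when u <= f(y) / (c g(y))
print("acceptance rate:", accepted.mean(), " 1/c =", 1 / c)
```

With a large number of proposals, the observed acceptance rate settles near \(1/c = 2/3\), in agreement with the theorem above.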
Our objective is to generate normal random numbers with mean \(\mu\) and variance \(\sigma^2\). The method we consider here generates the normal random numbers in the following steps.
import numpy as np
import matplotlib.pyplot as plt
x=np.linspace(0,15,500)
y=np.sqrt(2/np.pi)*np.exp((-1)*(x**2)/2)
z=np.exp((-1)*x)
#fig=plt.subplots(ncols=1, nrows=1)
plt.plot(x,y,label="Absolute Standard Normal Distribution")
plt.plot(x,z,label="Standard Exponential Distribution")
plt.ylabel("Density")
#plt.ylim((0,1))
plt.legend()
plt.show()
We can see that the two distributions are similar in shape, but over certain intervals the density of the absolute standard normal distribution exceeds that of the standard exponential distribution. Hence we will find the value of the constant \(c\) such that \(f(x) \leq c\; g(x),\; \forall\; x \geq 0\). Let \(h(x)=\frac{f(x)}{g(x)}= \sqrt{\frac{2}{\pi}}\, e^{x-\frac{x^2}{2}}\). The maximum value can be found by solving \(h'(x)=0\) and checking that \(h''(x)<0\) at that point. It can be seen that \(h(x)\) is maximum at \(x=1\) and the maximum value is \(c=\sqrt{2e/\pi}\). Hence \(\frac{f(x)}{g(x)} \leq \sqrt{2e/\pi} \implies f(x) \leq \sqrt{2e/\pi} \; g(x)\), and the acceptance ratio is \(\frac{f(y)}{c\,g(y)}=e^{-\frac{(y-1)^2}{2}}\).
import numpy as np
import matplotlib.pyplot as plt
x=np.linspace(0,15,500)
y=np.sqrt(2/np.pi)*np.exp((-1)*(x**2)/2)
z=np.sqrt(2*np.exp(1)/np.pi)*np.exp((-1)*x)
#fig=plt.subplots(ncols=1, nrows=1)
plt.plot(x,y,label="Absolute Standard Normal Distribution")
plt.plot(x,z,label=r'$\sqrt{2e/\pi}e^{-x}$')
#plt.ylim((0,1))
plt.title("Acceptance Rejection Method")
plt.legend()
plt.show()
Now we generate a standard uniform random number; if it is less than \(e^{-\frac{(y-1)^2}{2}}\), we accept the corresponding standard exponential random number as a random number from the absolute standard normal distribution.
The following is the python program to generate absolute standard normal random numbers
import numpy as np
i=0
np.random.seed(123)
#N=eval(input("Enter the number of Random Numbers: "))
N=5 # for the time being we are generating 5 random numbers
abz=[] # list of the absolute standard normal random numbers generated
y=[] # list of the exponential random numbers generated
n=N # number of random numbers to generate
# Here we have to use a while loop, not a for loop, as the number of exponential
# random numbers needed to obtain n absolute standard normal random numbers
# cannot be fixed in advance
while (i<n):
    u=np.random.rand()
    expo=(-1)*np.log(u)
    u2=np.random.rand()
    y.append(expo)
    # next we check whether the acceptance condition is satisfied
    if u2<=np.exp((-1)*((expo-1)**2)/2):
        abz.append(expo)
        i=i+1
print(abz)
## [0.3617317285243736, 1.4834598652920619, 0.3292418810544136, 0.7320295958549297, 1.069505968924431]
print(y)
## [0.3617317285243736, 1.4834598652920619, 0.3292418810544136, 0.01942321692908894, 0.7320295958549297, 1.069505968924431]
To generate the random sign, we first generate random numbers from the standard uniform distribution. If the uniform random number is less than or equal to \(0.5\), the associated random number is \(-1\); otherwise it is \(+1\).
import numpy as np
u=np.random.rand(n)
s=[]
for i in range(0,n):
    if u[i]<=0.5:
        s.append(-1)
    else:
        s.append(1)
print(s)
## [-1, -1, -1, 1, -1]
norm=[abz[i]*s[i] for i in range(0,len(abz))]
print(norm)
## [-0.3617317285243736, -1.4834598652920619, -0.3292418810544136, 0.7320295958549297, -1.069505968924431]
#mu=eval(input("Enter the mean of the distribution: "))
#sigma=eval(input("Enter the standard deviation of the distribution: "))
mu=10
sigma=6
normrn=[sigma*norm[i]+mu for i in range(0,len(norm))]
print(normrn)
## [7.829609628853758, 1.0992408082476288, 8.024548713673518, 14.392177575129578, 3.582964186453413]
The beta distribution is included as a special case of situations where the random variable has a finite range. Consider the case where the random variable \(X\) has density \(f(x)=\dfrac{1}{\beta(p,q)}x^{p-1}\;(1-x)^{q-1},\; 0\leq x\leq 1\). We shall take the standard uniform distribution as the alternate distribution, i.e., \(g(x)=1,\; 0\leq x\leq 1\), and consider the case \(p,q>1\). Then the mode of the beta distribution is \(\frac{p-1}{p+q-2}\). Substituting this value in the density function we have \[\begin{eqnarray}f(x)&\leq& \dfrac{1}{\beta(p,q)}\left(\frac{p-1}{p+q-2}\right)^{p-1}\;\left(1-\frac{p-1}{p+q-2}\right)^{q-1}\\ &= & \dfrac{1}{\beta(p,q)}\left(p-1\right)^{p-1}\;\left(q-1\right)^{q-1}\frac{1}{(p+q-2)^{p+q-2}}\end{eqnarray}\]
Assuming \(c=\dfrac{1}{\beta(p,q)}\left(p-1\right)^{p-1}\;\left(q-1\right)^{q-1}\frac{1}{(p+q-2)^{p+q-2}}\), we have \(f(y)\leq c\;g(y)\)
Hence the steps in the generation of beta random numbers can be summarised as: generate a candidate \(y\) from \(U(0,1)\), generate another standard uniform random number \(u\), and accept \(y\) as a beta random number if \(u \leq \frac{f(y)}{c\,g(y)}\).
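These steps can be sketched in Python. The choices \(p=2\), \(q=3\), the seed, and the sample size of 5000 are illustrative assumptions; note that with \(g(x)=1\) on \([0,1]\) the bound \(c\) is simply \(f\) evaluated at the mode.

```python
import numpy as np
from math import gamma

np.random.seed(7)
p, q = 2.0, 3.0
B = gamma(p) * gamma(q) / gamma(p + q)          # beta function value beta(p, q)
mode = (p - 1) / (p + q - 2)
c = mode**(p - 1) * (1 - mode)**(q - 1) / B     # c = f(mode), since g(x) = 1

rn = []
while len(rn) < 5000:
    y = np.random.rand()                        # candidate from g = U(0, 1)
    u = np.random.rand()
    f = y**(p - 1) * (1 - y)**(q - 1) / B       # target density at the candidate
    if u <= f / c:                              # accept with probability f(y)/(c g(y))
        rn.append(y)
print(rn[:5])
```

For Beta(2, 3) the mean of the generated sample should be close to the theoretical mean \(p/(p+q)=0.4\).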