
Working With Random Numbers in Python: Random Probability Distributions

  • Lalit Salunkhe
  • Jul 20, 2021

Introduction

 

Randomness is at the heart of statistics, and statistics plays a central role in data science and machine learning. For example, we draw random samples, assign random initial weights to artificial neural networks, and split data randomly into training and test sets. Many other data science techniques also rely on random numbers and random samples.


 

In this article, we walk through generating random samples from different probability distributions and working with them. By the end, you will understand how to draw random samples from discrete and continuous distributions, and how to plot the sampled values to inspect their distribution.

 


 

 

The random and scipy modules for generating random samples

 

As my previous article explained, the random module generates random numbers and random samples from several probability distributions, mostly continuous ones. You can read the article Working with Random Numbers in Python to connect the dots with this one.

 

In addition, we introduce the scipy.stats module, which we use to generate random samples from discrete distributions such as the Poisson and binomial distributions. Learn all types of data distribution models by following the link.

 

Importing these two modules along with the pyplot from matplotlib is simple and as shown below:

 

#importing random module in python environment

import random as rnd



#Importing scipy module in python environment

import scipy.stats as scpy



#Importing matplotlib module in python environment

import matplotlib.pyplot as plt

 

The matplotlib.pyplot will help us in visualizing the distributions of random samples we are going to take.

 


 

 

Generating random sample from binomial distribution

 

To generate a random sample from a binomial distribution, we can use the binom.rvs() method from the scipy.stats module. This method takes n (the number of trials) and p (the probability of success on each trial) as parameters, along with size.

 

The size parameter sets how many random values are drawn.

 

The syntax for the binom.rvs() method is as shown below:

binom.rvs(n, p, size)



Where,

n - specifies the number of trials,

p - specifies the probability of success on each trial,

size - specifies the sample size (the default is 1).

 

Now, let us take a simple example where we try to generate a random binomial sample of size 5, with parameters n = 12 and p = 0.6. Code is as shown below:

 

#importing the binom module from scipy.stats in python environment

from scipy.stats import binom



#Generating five random binomial numbers from a given distribution



for i in range(5):

    rnd_binom = binom.rvs(n = 12, p = 0.6)

    print(rnd_binom)

 

Now, if we run the code above, we see five values printed, one per line. These five numbers are random samples from a binomial distribution with parameters n = 12 and p = 0.6.

Figure: A random sample of five numbers from the binomial distribution


Note that we could have passed size = 5 to generate the five values in a single call. In that case, binom.rvs() returns a NumPy array of five values rather than one value per loop iteration.
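As a minimal sketch of the size argument, the snippet below draws all five values in one call. The random_state value of 42 is an arbitrary choice made here only so that repeated runs give the same numbers; it is not part of the original example.

```python
from scipy.stats import binom

# Draw five binomial variates in a single call instead of looping.
# random_state fixes the seed so the run is repeatable (42 is arbitrary).
sample = binom.rvs(n=12, p=0.6, size=5, random_state=42)

print(type(sample))     # a NumPy array, not a Python list
print(sample.shape)     # (5,)
print(sample.tolist())  # convert to a plain list if one is needed
```

If a plain Python list is required downstream, tolist() converts the array as shown on the last line.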

 

Now let us try to generate a random sample of 10,000 items and plot it using the pyplot module to see the distribution of the binomial variate.

 

#importing the binom module from scipy.stats in python environment

from scipy.stats import binom



#importing pyplot module as plt from matplotlib in python environment

import matplotlib.pyplot as plt



#Generating a random sample of size 10000 from binomial distribution with n = 12 and p = 0.6

binom_rnd_sample = binom.rvs(n = 12, p = 0.6, size = 10000)



#Plotting the distribution using plt.hist method

plt.hist(binom_rnd_sample, bins = 50)

plt.show()

 

Here, we are generating a random sample of size 10,000 from a binomial distribution with n = 12 and p = 0.6. Then, the plt.hist() method is used to generate a histogram out of the sample created. See the output as shown below:


Figure: Histogram of a random binomial sample of size 10,000 with n = 12 and p = 0.6, plotted using the plt.hist() method.
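Because a binomial variate only takes the integer values 0 to n, the histogram can also be read as a table of counts. The sketch below, which assumes NumPy is available and uses an arbitrary seed, tabulates the relative frequency of each outcome and sets it beside the theoretical probability from binom.pmf().

```python
import numpy as np
from scipy.stats import binom

n, p = 12, 0.6
sample = binom.rvs(n=n, p=p, size=10000, random_state=0)

# Count how often each outcome 0..n occurs; minlength keeps unseen outcomes as zero.
counts = np.bincount(sample, minlength=n + 1)
rel_freq = counts / sample.size

# Theoretical probabilities from the binomial probability mass function.
pmf = binom.pmf(np.arange(n + 1), n, p)

for k in range(n + 1):
    print(f"k={k:2d}  observed={rel_freq[k]:.4f}  expected={pmf[k]:.4f}")
```

When plotting integer data like this, passing bin edges centred on the integers, for example bins = np.arange(-0.5, n + 1.5), to plt.hist() gives one bar per outcome and avoids the empty gaps that 50 equal-width bins produce.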


The shape of the histogram changes if you change the values of n and p.
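One way to see the effect of n and p without a plot is to compare the sample mean and variance with the theoretical values n * p and n * p * (1 - p). The parameter pairs and the seed below are illustrative choices, not values from the article.

```python
from scipy.stats import binom

# Parameter pairs chosen only for illustration.
settings = [(12, 0.6), (12, 0.1), (50, 0.5)]

results = []
for n, p in settings:
    sample = binom.rvs(n=n, p=p, size=10000, random_state=1)
    results.append((n, p, sample.mean(), sample.var()))
    print(f"n={n:2d} p={p:.1f}  "
          f"mean={sample.mean():.3f} (theory {n * p:.3f})  "
          f"var={sample.var():.3f} (theory {n * p * (1 - p):.3f})")
```

With 10,000 draws, the sample values should land close to the theoretical ones, and a small p such as 0.1 pushes the bulk of the distribution towards zero.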

 


 

 

Generating random sample from Poisson distribution

 

The Poisson distribution is one of the important distributions in statistics and is often called the distribution of rare events. It models the number of events that occur in a fixed time span when events happen independently at a constant average rate.

 

The poisson.rvs() method from the scipy.stats module generates a Poisson random sample. It takes the average number of events in the given time span (mu), and, as before, size specifies how many random variates to draw.

 

Let us see how to draw and plot a random sample from a Poisson distribution in Python.


#importing the poisson module from scipy.stats in python environment

from scipy.stats import poisson



#importing pyplot module as plt from matplotlib in python environment

import matplotlib.pyplot as plt



#Generating a random sample of size 10000 from poisson distribution with mean 4

pois_rnd_sample = poisson.rvs(mu = 4, size = 10000)



#Plotting the distribution using plt.hist method

plt.hist(pois_rnd_sample, bins = 50)

plt.show()

Here, we generate a sample of 10,000 Poisson random variates with a mean of 4 and plot the values to see whether the sample follows the Poisson properties. See the graph below:


Figure: A plot of 10,000 Poisson random variates with mean value 4 (mu = 4).
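A quick numerical check of the Poisson properties is that the mean and the variance are both equal to mu. The sketch below tests this on a sample like the one plotted above; the seed is an arbitrary value added for repeatability.

```python
from scipy.stats import poisson

mu = 4
sample = poisson.rvs(mu=mu, size=10000, random_state=7)

# For a Poisson distribution, mean and variance both equal mu.
sample_mean = sample.mean()
sample_var = sample.var()

print(f"sample mean     = {sample_mean:.3f} (theory {mu})")
print(f"sample variance = {sample_var:.3f} (theory {mu})")
print(f"smallest value  = {sample.min()}")  # counts are never negative
```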


Generating random sample from normal distribution

 

We can use the standard random module to generate a random sample from the normal distribution, using the normalvariate() function. It takes the mean (mu) and the standard deviation (sigma) as arguments. Recent Python versions (3.11 and later) give these arguments default values of 0 and 1, but passing them explicitly keeps the code clear and portable.

 

Let us generate a random sample of size 5 with mean zero and standard deviation 5. See the code below:


#Importing python module random to generate random numbers

import random as rnd



#Generating a random sample of 5 from normal distribution

for i in range(5):

    rnd_norm = rnd.normalvariate(mu = 0, sigma = 5)

    print(rnd_norm)

 

The output is as shown below:


Figure: Random sample of 5 from the normal distribution with mean 0 and standard deviation 5, generated with the normalvariate() function.
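Since normalvariate() returns one value per call, a larger sample can be collected with a list comprehension. The sketch below uses a separate random.Random instance with an arbitrary seed (an assumption made here for repeatability) and the standard statistics module to summarise the result.

```python
import random
import statistics

# A private generator with a fixed seed keeps the example repeatable.
rng = random.Random(123)

# Collect 1,000 draws from a normal distribution with mean 0 and sigma 5.
sample = [rng.normalvariate(0, 5) for _ in range(1000)]

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

print(f"size = {len(sample)}")
print(f"mean = {sample_mean:.3f} (theory 0)")
print(f"sd   = {sample_sd:.3f} (theory 5)")
```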


Interestingly, we can also draw a normal random sample through the scipy.stats module. The module has a norm.rvs() method that generates a random sample from the normal distribution. Its loc parameter specifies the mean, and its scale parameter specifies the standard deviation (sigma). Let us generate a random sample of size 10,000 and plot it. The code is as below:

 

#importing the norm module from scipy.stats in python environment

from scipy.stats import norm



#importing pyplot module as plt from matplotlib in python environment

import matplotlib.pyplot as plt



#Generating a random sample of size 10000 from normal distribution with mean 0 and standard deviation 5

normal_rnd_sample = norm.rvs(loc = 0, scale = 5, size = 10000)



#Plotting the distribution using plt.hist method

plt.hist(normal_rnd_sample, bins = 50)

plt.show()

 

The output plot of this code is as shown below:


Figure: Plot of 10,000 random normal variates with mean 0 and standard deviation (sigma) 5.
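The bell shape can also be checked numerically: for a normal distribution, roughly 68% of values fall within one standard deviation of the mean. The following sketch, assuming NumPy is available and using an arbitrary seed, compares that fraction in the sample with the theoretical value from norm.cdf().

```python
import numpy as np
from scipy.stats import norm

loc, scale = 0, 5
sample = norm.rvs(loc=loc, scale=scale, size=10000, random_state=3)

# Fraction of the sample within one standard deviation of the mean.
within_one_sd = np.mean(np.abs(sample - loc) <= scale)

# Theoretical fraction from the standard normal CDF.
expected = norm.cdf(1) - norm.cdf(-1)

print(f"sample mean = {sample.mean():.3f}, sample sd = {sample.std():.3f}")
print(f"within one sd: observed {within_one_sd:.4f}, expected {expected:.4f}")
```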


This is all we have for you in this article. If you have not yet read our article on working with Python JSON objects, you can find it here: Working With Python JSON Objects. We close this article with some summary points.


 

Summary

 

  1. The scipy.stats module is a rich source of statistical functions. We can use it to generate random samples from many statistical distributions, both continuous and discrete.

  2. The binom.rvs() method from the scipy.stats module generates a random sample of any size from a binomial distribution.

  3. The poisson.rvs() method from the scipy.stats module generates a random sample of any size from a Poisson distribution.

  4. The normalvariate() function from the random module returns one normally distributed value per call; calling it repeatedly builds a random sample of any size.

  5. The norm.rvs() method from the scipy.stats module generates a random sample of any size from a normal distribution.
