Definition of Statistics: The science of producing unreliable facts from reliable figures. - Evan Esar
A sound understanding of statistical functions is demanding, yet it is of great benefit in everyday life. One example is the concept of a data distribution, which describes how the values in a population are spread out.
A data distribution is a function that lists the possible values of a variable and quantifies their relative frequency; it turns raw data into a graphical form that yields useful information. Knowing what kind of distribution a population follows is essential, because it helps in choosing the proper statistical techniques.
When statisticians or data experts analyze a dataset, the first step is usually exploratory data analysis (EDA): learning the characteristics of specific features in the dataset, which helps reveal any patterns in the data distributions.
In this way, they can tailor machine learning models to particular case studies, since ML models are designed under certain assumptions about the data distribution.
Therefore, understanding certain types of statistical data distributions helps in identifying which models are appropriate to use, and this is the main topic of this blog.
Bernoulli's distribution is one of the simplest distributions and can be used as a starting point for deriving more complex ones. It has a single trial with two possible outcomes (success or failure).
For example, when tossing a coin, if the probability of heads is p, then the probability of tails is (1-p). Bernoulli's distribution is the special case of the binomial distribution with a single trial.
The probability mass function can be given as
$$f(x) = p^{x} (1-p)^{1-x}, \quad x \in \{0, 1\}$$
It can also be written as
$$f(x) = \begin{cases} p, & x = 1 \\ 1-p, & x = 0 \end{cases}$$
The graph of Bernoulli's distribution is shown below where the probability of success is less than probability of failure.
The distribution has the following characteristics:
The number of trials to be performed must be predefined for a single experiment; for Bernoulli's distribution it is exactly one.
Each trial has only two possible outcomes: success or failure.
The probability of success must be the same each time the experiment is run.
Each experiment must be independent of the others.
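As a rough illustration, the Bernoulli probability function p^x (1-p)^(1-x) can be sketched in Python using only the standard library. The function names and the value p = 0.3 are assumptions chosen for this example, not part of the original discussion.

```python
import random


def bernoulli_pmf(x: int, p: float) -> float:
    """Probability of outcome x (1 = success, 0 = failure) in a single trial."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p) ** (1 - x)


def bernoulli_sample(p: float, rng: random.Random) -> int:
    """Draw one trial: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.3                 # assumed success probability, purely illustrative
draws = [bernoulli_sample(p, rng) for _ in range(10_000)]

print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))  # p and 1 - p
print(sum(draws) / len(draws))                   # empirical success rate, close to p
```

With many simulated trials, the share of successes settles near p, which is what the formula predicts for x = 1.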
The binomial distribution applies to events with binary outcomes where the probability of success stays the same across all successive trials. An example is tossing a coin, biased or unbiased, a repeated number of times.
The distribution takes two parameters as input and is thus called a bi-parametric distribution. The two parameters are:
The number of trials, n, and
The probability, p, assigned to one of the two classes
For n trials and success probability p, the probability of exactly x successes within the n trials is given by the following formula:
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, \dots, n$$
The graph of binomial distribution is shown below when the probability of success is equal to probability of failure.
The binomial distribution has the following properties:
When multiple trials are performed, each trial is independent of the others, i.e., the result of one trial cannot influence the other trials.
Each trial has two possible outcomes, success or failure, with probabilities p and (1-p).
A total of n identical trials are conducted, and the probability of success (and hence of failure) does not change from trial to trial.
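The binomial probability of x successes in n trials, C(n, x) p^x (1-p)^(n-x), can be sketched as follows. This is a minimal example; the function name and the choice of 10 fair coin tosses are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.5  # assumed example: 10 tosses of an unbiased coin
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(pmf[5])    # probability of exactly 5 heads
print(sum(pmf))  # probabilities over x = 0..n add up to 1
```

Setting n = 1 in the same function gives back the Bernoulli case described earlier.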
The normal distribution is a continuous distribution and the one most commonly used in data science. Many everyday quantities are commonly modeled with this distribution: income distribution, average employee reports, the average weight of a population, etc.
The formula for the normal distribution is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
Where μ = mean value,
σ = standard deviation,
x = random variable
The distribution is said to be standard normal if mean (μ) = 0 and standard deviation (σ) = 1
The graph of normal distribution is shown below which is symmetric about the centre (mean).
Normal distribution has the following properties;
Mean, mode and median coincide with each other.
The distribution has a bell-shaped distribution curve.
The distribution curve is symmetrical about the centre.
The area under the curve is equal to 1.
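The normal density can be written out by hand and compared with `statistics.NormalDist` from the Python standard library. This is a sketch under the usual parameterization by mean μ and standard deviation σ; the helper name `normal_pdf` is an assumption for this example.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


std = NormalDist(mu=0.0, sigma=1.0)  # standard normal: mean 0, standard deviation 1

print(normal_pdf(0.0), std.pdf(0.0))      # hand-written and library values agree
print(normal_pdf(-1.0), normal_pdf(1.0))  # symmetric about the mean
print(std.cdf(1.0) - std.cdf(-1.0))       # share of values within one standard deviation
```

The symmetry check mirrors the property that the curve is symmetric about its centre.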
The Poisson distribution is a discrete probability distribution that gives the probability of a given number of events occurring in a fixed period of time or in another specified interval such as distance, area, or volume.
For example, the insurance and banking industries use it in risk analysis, such as anticipating the number of car accidents in a particular time interval and in a specific area.
The Poisson distribution makes the following assumptions:
The rate at which successes occur is the same for a short span as for a long period of time, so the success probability is proportional to the length of the interval.
The success probability in an interval approaches zero as the interval becomes smaller.
A successful event can't affect the outcome of another successful event.
A Poisson distribution can be modeled using the formula below,
$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \dots$$
Where λ represents the expected number of events in a fixed period of time, and x is the number of events observed in that time period.
The graph of the Poisson distribution is shown below;
The Poisson distribution has the following characteristics:
The events are independent of each other, i.e., if an event occurs, it doesn't affect the probability of another event occurring.
An event can occur any number of times in a defined period of time.
No two events can occur at exactly the same time.
The average rate at which events take place is constant.
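A short sketch of the Poisson probability λ^x e^(-λ) / x! follows. The rate of 3 events per interval is a made-up value (for instance, a hypothetical count of accidents per week), and the function name is an assumption for this example.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of observing exactly x events when lam events are expected."""
    return lam**x * exp(-lam) / factorial(x)


lam = 3.0  # hypothetical average number of events per interval
probs = [poisson_pmf(x, lam) for x in range(50)]  # tail beyond 49 is negligible here

print(poisson_pmf(0, lam))                        # chance of no event at all
print(sum(probs))                                 # total probability, very close to 1
print(sum(x * q for x, q in enumerate(probs)))    # mean count, very close to lam
```

The last line reflects the role of λ as the expected number of events in the interval.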
Like the Poisson distribution, the exponential distribution has a time element; it gives the probability of the length of time that passes before an event takes place.
The exponential distribution is used in survival analysis, for example for the life of an air conditioner, the expected life of a machine, and the length of time between metro arrivals.
A variable X is said to possess an exponential distribution when its probability density function is
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
Where λ stands for the rate and always has a value greater than zero.
The graph of exponential distribution is shown below;
The exponential distribution has the following characteristics:
As shown in the graph, the higher the rate, the faster the curve drops, and the lower the rate, the flatter the curve.
In survival analysis, λ is termed the failure rate of a machine at any time t, under the assumption that the machine has survived up to time t.
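The exponential density λ e^(-λx) and its cumulative form can be sketched as below, together with simulated waiting times from `random.expovariate`. The rate of 0.5 events per minute is an assumed value for illustration, and the function names are hypothetical.

```python
import random
from math import exp


def exponential_pdf(x: float, lam: float) -> float:
    """Density of the waiting time x for an event that occurs at rate lam."""
    return lam * exp(-lam * x) if x >= 0 else 0.0


def exponential_cdf(x: float, lam: float) -> float:
    """Probability that the event has happened by time x."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


lam = 0.5  # assumed rate, e.g. one arrival every 2 minutes on average
rng = random.Random(1)  # fixed seed so the run is repeatable
waits = [rng.expovariate(lam) for _ in range(20_000)]

print(sum(waits) / len(waits))                               # average wait, close to 1 / lam
print(exponential_cdf(2.0, lam))                             # chance the wait is at most 2
print(exponential_pdf(1.0, 2.0), exponential_pdf(1.0, 0.5))  # higher rate drops faster
```

The mean waiting time of 1/λ is a standard property of this distribution, and the simulated average lands close to it.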
The multinomial distribution is used to measure the outcomes of experiments that have two or more possible outcomes. The binomial distribution is its special case in which there are exactly two possible outcomes, such as true/false or success/failure.
The distribution is commonly used in biological, geological and financial applications.
A well-known example is Mendel's experiment, in which two strains of peas (one with green, wrinkled seeds and the other with yellow, round seeds) were hybridized, producing four different strains of seeds: green and wrinkled, green and round, yellow and round, and yellow and wrinkled. The counts of the four strains followed a multinomial distribution, and the experiment led to the discovery of the basic principles of genetics.
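The probability mass function for the multinomial distribution is
$$P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}, \quad \sum_{i=1}^{k} x_i = n$$
Where n = number of trials,
x_i = number of times outcome i occurs,
p_i = probability of outcome i in a single trial.
The graph of multinomial distribution is shown below;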
The following are properties of the multinomial distribution:
An experiment can have a repeated number of trials, for example, rolling a die multiple times.
Each trial is independent of the others.
The probability of each outcome must be the same (constant) for all trials of an experiment.
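The multinomial probability n! / (x_1! ... x_k!) multiplied by p_1^x_1 ... p_k^x_k can be sketched as follows. The 9:3:3:1 ratio used for the four pea strains is the classic textbook ratio and is an assumption for this example (it is not stated in the text above), as is the sample of 16 seeds.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of seeing counts[i] occurrences of outcome i in sum(counts) trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)  # stays an exact integer at every step
    return coefficient * prod(p**c for p, c in zip(probs, counts))


# Assumed outcome probabilities for four seed strains (9:3:3:1 ratio).
strain_probs = (9 / 16, 3 / 16, 3 / 16, 1 / 16)
observed = (9, 3, 3, 1)  # hypothetical counts in a sample of 16 seeds

print(multinomial_pmf(observed, strain_probs))
print(multinomial_pmf((3, 7), (0.5, 0.5)))  # two outcomes: same as binomial, n = 10, x = 3
```

With only two outcomes the function reproduces the binomial probability, which matches the special-case relationship between the two distributions.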
The beta distribution is a continuous probability distribution on the interval [0,1] with two shape parameters, alpha (α) and beta (β). These two parameters appear as exponents of the random variable and of its complement, and control the shape of the distribution.
The distribution describes a family of probabilities and is a suitable model for the random behaviour of percentages or proportions. It is used in data models where the success probability of a random experiment is itself uncertain.
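The probability density function for the beta distribution is
$$f(x; \alpha, \beta) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)}, \quad 0 \le x \le 1$$
Where α is the first shape parameter, β is the second shape parameter, and B(α, β) is the normalizing constant that makes sure the area under the curve is one.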
The graph of beta distribution is shown below;
This general formulation is also known as the beta distribution of the first kind; the beta distribution of the second kind is another name for the beta prime distribution.
The beta distribution has many applications: the statistical description of allele frequencies in population genetics, time allocation in project management, sunshine data, proportions of minerals in rocks, etc.
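A minimal sketch of the beta density x^(α-1) (1-x)^(β-1) / B(α, β) is given below, with B(α, β) computed from the gamma function. The shape values α = 2 and β = 5 are arbitrary, and the simple midpoint-rule integrator is a helper written for this example only.

```python
from math import gamma


def beta_function(a: float, b: float) -> float:
    """Normalizing constant B(a, b) expressed through the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of the beta distribution at x for shape parameters a and b."""
    if not 0.0 < x < 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)


def integrate(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


a, b = 2.0, 5.0  # arbitrary shape parameters for illustration
area = integrate(lambda x: beta_pdf(x, a, b), 0.0, 1.0)
mean = integrate(lambda x: x * beta_pdf(x, a, b), 0.0, 1.0)

print(area)               # area under the curve, approximately 1
print(mean, a / (a + b))  # numerical mean next to the closed form a / (a + b)
```

The area check shows the role of B(α, β) as the constant that normalizes the curve.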
A data distribution is said to be beta-binomial if
The probability of success, p, is greater than zero and is itself drawn from a beta distribution.
The shape parameters of that beta distribution satisfy α > 0 and β > 0.
Being the simplest form of Bayesian model, the beta-binomial distribution has extensive applications in intelligence testing, epidemiology, and marketing.
The graph of beta-binomial distribution looks as below;
The shape parameters determine how the distribution behaves, such that
The distribution tends to a binomial distribution as α and β grow large.
The distribution reduces to the discrete uniform distribution on 0 to n if α = β = 1.
For n = 1, the beta-binomial distribution is the same as a Bernoulli distribution with success probability α/(α+β).
The key difference between a binomial distribution and a beta-binomial distribution is that the success probability, p, is fixed for a set of trials in the former, whereas in the latter it is not fixed and varies randomly from one set of trials to the next.
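The three limiting cases listed above can be checked numerically. The sketch below uses the standard beta-binomial probability C(n, x) B(x + α, n - x + β) / B(α, β), a formula the text does not spell out, so treat it and the function names as assumptions of this example. Log-gamma is used to keep large shape parameters numerically stable.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x: int, n: int, a: float, b: float) -> float:
    """Probability of x successes in n trials when p follows Beta(a, b)."""
    return comb(n, x) * exp(log_beta(x + a, n - x + b) - log_beta(a, b))


n = 10
uniform_case = [beta_binomial_pmf(x, n, 1.0, 1.0) for x in range(n + 1)]
bernoulli_case = beta_binomial_pmf(1, 1, 2.0, 3.0)
large_shape_case = beta_binomial_pmf(5, n, 5000.0, 5000.0)
binomial_value = comb(n, 5) * 0.5**n  # binomial with p = 0.5 for comparison

print(uniform_case)                      # every value close to 1 / (n + 1)
print(bernoulli_case)                    # close to a / (a + b) = 0.4
print(large_shape_case, binomial_value)  # nearly equal for large a and b
```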
In statistics, the t-distribution, also known as Student's t-distribution, is one of the most important distributions. It is employed to estimate population parameters when the sample size is small and the standard deviation is unknown.
It is widely used for hypothesis testing and for building confidence intervals for mean values. The graph of the t-distribution is shown below;
T-distribution has the following properties;
Similar to the normal distribution, the t-distribution has a bell-shaped curve and is symmetric about its mean of zero.
The shape of the distribution changes with the degrees of freedom (fewer degrees of freedom give heavier tails), and its range is -∞ to ∞.
The variance is always more than one: it equals ν/(ν-2) for ν > 2 degrees of freedom.
As the sample size, n, increases, the t-distribution approaches the normal distribution; a sample size greater than 30 is the usual rule of thumb.
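These properties can be explored with a small sketch of the Student's t density. The density formula used here is the standard one, Γ((ν+1)/2) / (√(νπ) Γ(ν/2)) multiplied by (1 + t²/ν)^(-(ν+1)/2), which the text above does not state, so it is labeled as an assumption of this example along with the function names.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(t: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))


def t_variance(df: float) -> float:
    """Variance df / (df - 2), defined only for more than 2 degrees of freedom."""
    if df <= 2:
        raise ValueError("variance is finite only for df > 2")
    return df / (df - 2)


normal = NormalDist()  # standard normal for comparison
for df in (3, 10, 30, 100):
    # Peak height rises toward the normal peak and the variance falls toward 1.
    print(df, t_pdf(0.0, df), t_variance(df))
print(normal.pdf(0.0))

print(t_pdf(3.0, 3), normal.pdf(3.0))  # heavier tail than the normal at t = 3
```

As the degrees of freedom grow, the printed peak heights move toward the normal value, in line with the rule of thumb about larger samples.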
A uniform distribution can be either discrete or continuous, and in both cases each outcome is equally likely to occur. Its probability is constant, which gives the distribution a rectangular shape.
In this type of distribution, any number of outcomes is possible and all of them have the same probability, much like a Bernoulli distribution with p = 0.5 has two equally likely outcomes. For example, when rolling a die, the outcomes 1 to 6 each have a probability of ⅙ and form a uniform distribution.
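A variable X is said to have a uniform distribution on the interval [a, b] if its probability density function is
$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$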
The graph of a uniform distribution looks as below
The uniform distribution has the following properties;
The probability density function integrates to one.
Every outcome carries equal weight.
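Both flavours of the uniform distribution can be sketched in a few lines. The die example follows the text above; the continuous interval [0, 10] and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def discrete_uniform_pmf(x: int, lo: int, hi: int) -> Fraction:
    """Probability of integer outcome x when lo..hi are all equally likely."""
    return Fraction(1, hi - lo + 1) if lo <= x <= hi else Fraction(0)


def continuous_uniform_pdf(x: float, a: float, b: float) -> float:
    """Constant density 1 / (b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


die = {face: discrete_uniform_pmf(face, 1, 6) for face in range(1, 7)}
print(die[3])             # probability of any single face
print(sum(die.values()))  # all six faces together

a, b = 0.0, 10.0          # assumed interval for the continuous case
height = continuous_uniform_pdf(2.5, a, b)
print(height * (b - a))   # rectangle area: height times width
```

Exact fractions are used for the die so the equal weights can be compared without rounding.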
To sum up, we have seen various types of statistical data distributions along with their probability functions, graphical representations and common properties.