Hypothesis testing is the procedure of using sample data to test a hypothesis, or supposition, about a statistical parameter. Analysts use hypothesis testing to judge whether a hypothesis is plausible or not.
In data science and statistics, hypothesis testing is an important step because it checks an assumption about a statistical parameter against data. For instance, a researcher might establish a hypothesis that the average of all odd numbers is an even number.
To find out how plausible this hypothesis is, the researcher has to test it using hypothesis testing methods. Unlike a hypothesis, which is ‘supposed’ to stand true on the basis of little or no evidence, hypothesis testing requires plausible evidence before a statistical hypothesis can be regarded as supported.
This is where statistics plays an important role. A number of components are involved in this process. But before looking at the process of hypothesis testing in research methodology, we shall first look at the types of hypotheses involved. Let us get started!
Types of Hypotheses
In data sampling, different types of hypotheses are involved in deciding whether the sampled data support a hypothesis or not. In this segment, we shall look at these types and the role each one plays in hypothesis testing.
The Alternative Hypothesis (H1), or the research hypothesis, states that there is a relationship between two variables (where one variable affects the other). The alternative hypothesis is the main driving force for hypothesis testing.
It implies that the two variables are related to each other and that the relationship between them is not due to chance or coincidence.
When hypothesis testing is carried out, the alternative hypothesis is the main subject of the testing process. The analyst sets out to test the alternative hypothesis and assess its plausibility.
The Null Hypothesis (H0) is the counterpart of the alternative hypothesis: it states that no relation exists between the two variables. Under the null hypothesis, any apparent effect of one variable on the other is solely due to chance, with no empirical cause behind it.
The null hypothesis is established alongside the alternative hypothesis and is just as important. It plays a major role in hypothesis testing because the test is carried out against it: the data are examined to see whether they are consistent with the null hypothesis or give reason to reject it in favour of the alternative.
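As an illustration, the pair of hypotheses can be written symbolically. The sketch below assumes the relationship between the two variables is measured by a population correlation coefficient, written here as \rho. That symbol and that choice of measure are assumptions made for this example, not something the discussion above prescribes.

```latex
H_0 : \rho = 0 \qquad \text{versus} \qquad H_1 : \rho \neq 0
```

Here H_0 says the two variables are unrelated, and H_1 says that some relationship exists.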
The Non-directional hypothesis states that a relation exists between two variables without specifying its direction.
Simply put, it asserts that the two variables are related, but does not say whether the effect is positive or negative, that is, whether an increase in variable A goes with an increase or a decrease in variable B.
The Directional hypothesis, on the other hand, specifies the direction of the effect in the relationship between two variables.
Herein, the hypothesis clearly states which way the effect goes, for example that an increase in variable A leads to an increase in variable B.
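In testing terms, a directional hypothesis is usually checked with a one-tailed test and a non-directional hypothesis with a two-tailed test. The following is a minimal sketch of that difference, assuming the test statistic is a z score that follows a standard normal distribution under the null hypothesis. The function name `p_values` and the value z = 1.8 are made up for illustration.

```python
from statistics import NormalDist


def p_values(z: float) -> dict:
    """Return one-tailed and two-tailed P-values for a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)      # directional: H1 says "greater than"
    lower = std_normal.cdf(z)            # directional: H1 says "less than"
    two_sided = 2.0 * min(upper, lower)  # non-directional: H1 says "different from"
    return {"greater": upper, "less": lower, "two_sided": two_sided}


# Hypothetical z statistic, chosen only to illustrate the difference.
result = p_values(1.8)
print(result)
```

For z = 1.8 the one-tailed P-value is about 0.036 and the two-tailed P-value is about 0.072, so at a 0.05 significance level only the directional test would reject the null hypothesis.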
A statistical hypothesis is a hypothesis that can be tested on the basis of statistics.
By using data sampling and statistical knowledge, one can assess the plausibility of a statistical hypothesis and find out whether the data support it or not.
Performing Hypothesis Testing
Now that we have understood the types of hypotheses and the role they play in hypothesis testing, let us move on to the process itself.
In hypothesis testing, a researcher is first required to establish two hypotheses - the alternative hypothesis and the null hypothesis - in order to begin the procedure.
To establish these two hypotheses, one is required to study data samples, find a plausible pattern among the samples, and pen down a statistical hypothesis that they wish to test.
To begin hypothesis testing, a random sample can be drawn from the population. Of the two hypotheses, alternative and null, only one can be true, and both must be stated for the process to work.
At the end of the hypothesis testing procedure, the null hypothesis is either rejected, which supports the alternative hypothesis, or not rejected, which means the data do not give enough evidence for the alternative. Even when the data favour one of the two hypotheses, no hypothesis can ever be verified 100%.
Therefore, a hypothesis can only be supported based on the statistical samples and verified data. Here is a step-by-step guide for hypothesis testing.
Establish the hypotheses
First things first, one is required to establish two hypotheses - alternative and null, that will set the foundation for hypothesis testing.
These hypotheses initiate the testing process that involves the researcher working on data samples in order to either support the alternative hypothesis or the null hypothesis.
Generate a testing plan
Once the hypotheses have been formulated, it is now time to generate a testing plan. A testing plan or an analysis plan involves the accumulation of data samples, determining which statistic is to be considered and laying out the sample size.
All these factors are very important while one is working on hypothesis testing.
Analyze data samples
As soon as a testing plan is ready, it is time to move on to the analysis part. Analysis of data samples involves computing statistical values from the samples, bringing them together, and deriving a pattern from them.
While analyzing the data samples, a researcher needs to determine a set of things, such as the significance level, the test statistic, and the P-value.
Infer the results
The analysis of data samples leads to the inference of results, which establishes whether the alternative hypothesis is supported or not. When the P-value is less than the significance level, the null hypothesis is rejected and the alternative hypothesis is considered plausible. Otherwise, the null hypothesis is not rejected.
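The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-sided one-sample z-test, assumes the population standard deviation is known, and runs on made-up sample values. The function name `one_sample_z_test` and every number below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test with a known population standard deviation."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))      # test statistic
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided P-value
    return z, p_value, p_value < alpha                # True means "reject H0"


# Step 1: establish the hypotheses (illustrative values).
#   H0: population mean == 100, H1: population mean != 100
# Step 2: testing plan: z statistic, alpha = 0.05, sigma assumed known (15).
# Step 3: analyze the data samples (made-up measurements).
sample = [112, 105, 98, 120, 109, 101, 117, 108, 111, 104]
z, p_value, reject_h0 = one_sample_z_test(sample, mu0=100, sigma=15)

# Step 4: infer the result by comparing the P-value with the significance level.
print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

When the population standard deviation is not known, a t-test is the usual substitute for the z-test used in this sketch.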
Methods of Hypothesis Testing
As we have already looked into different aspects of hypothesis testing, we shall now look into the methods of hypothesis testing. The two most common methods are as follows -
Frequentist Hypothesis Testing
The frequentist approach, or the traditional approach to hypothesis testing, is a method that draws its conclusions from the current data alone.
The assumptions are based on the current data, and a set of 2 hypotheses is formulated. A very popular subtype of the frequentist approach is Null Hypothesis Significance Testing (NHST).
The NHST approach (involving the null and alternative hypotheses) has been one of the most widely used methods of hypothesis testing in the field of statistics. It grew out of the work of Ronald Fisher in the 1920s and of Jerzy Neyman and Egon Pearson in the 1930s.
Bayesian Hypothesis Testing
A less conventional method of hypothesis testing, Bayesian Hypothesis Testing evaluates a hypothesis by combining prior knowledge, expressed as a prior probability and often informed by past data samples, with the current data.
The result obtained is the posterior probability of the hypothesis. In this method, the researcher relies on ‘prior probability and posterior probability’ to conduct the hypothesis testing at hand.
Starting from the prior probability, the Bayesian approach assesses how probable it is that a hypothesis is true given the data. The Bayes factor, a major component of this method, is the ratio of the likelihood of the data under one hypothesis to its likelihood under the other, for example the alternative hypothesis against the null hypothesis.
The Bayes factor indicates which of the two hypotheses established for hypothesis testing is better supported by the data.
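A minimal sketch of the Bayes factor as a likelihood ratio, and of how it turns a prior probability into a posterior probability, is given below. It assumes binomial data and two simple point hypotheses about a success probability (0.5 under the null, 0.7 under the alternative). The function names and all numbers are made up for illustration. Hypotheses that cover a range of values would need marginal likelihoods instead of the plain likelihoods used here.

```python
from math import comb


def binomial_likelihood(k: int, n: int, p: float) -> float:
    """Probability of k successes in n trials when the success probability is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_factor_10(k: int, n: int, p_null: float, p_alt: float) -> float:
    """Likelihood of the data under H1 divided by its likelihood under H0."""
    return binomial_likelihood(k, n, p_alt) / binomial_likelihood(k, n, p_null)


def posterior_prob_alt(bf10: float, prior_alt: float = 0.5) -> float:
    """Posterior probability of H1: posterior odds = Bayes factor * prior odds."""
    prior_odds = prior_alt / (1 - prior_alt)
    posterior_odds = bf10 * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Hypothetical data: 14 successes in 20 trials.
bf10 = bayes_factor_10(k=14, n=20, p_null=0.5, p_alt=0.7)
posterior = posterior_prob_alt(bf10, prior_alt=0.5)
print(f"Bayes factor (H1 vs H0) = {bf10:.2f}, posterior P(H1) = {posterior:.3f}")
```

A Bayes factor above 1 means the data favour the alternative hypothesis, and a value below 1 means they favour the null hypothesis.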
To conclude, hypothesis testing, a way to verify the plausibility of an assumption, can be done through different methods - the Bayesian approach or the frequentist approach.
While the Bayesian approach relies on prior probabilities, the frequentist approach assigns no prior probability to a hypothesis and bases its conclusions on the current data alone. The elements involved in hypothesis testing include the significance level, the P-value, the test statistic, and the method of hypothesis testing.
In the end, deciding whether a hypothesis is supported comes down to examining the data samples and identifying which of the two hypotheses, null or alternative, is the more plausible.