Are the observed changes in mean statistically significant?
This is perhaps a major consideration while making a critical hypothesis that gives a perfect analysis for a condition. Such analysis are the excellent candidates for hypothesis testing, or in other words, significance testing.
For testing the hypotheses various test statistics are performed, such as t-test and z-test, and that will be the main course of discussion during the blog.
We will cover main topics as;
About Hypothesis Testing
What is Z-test?
What is T-test?
Z-test vs T-test
About Hypothesis Testing
Let’s start with a simple situation: you are a company, monitoring the daily clicks on blogs and want to analyze whether the outcomes of the current month are different from the previous month’s outcomes.
For example, are they different due to a particular marketing campaign, or any other reason.
In order to check this piece of activity, hypothesis testing is performed in terms of null hypothesis and alternative hypothesis.
Hypotheses are the predictive statements that are capable of being tested in order to give connections between an independent variable and some dependent variables.
Here, the question to be researched for is converted into;
Null hypothesis (H0), it states that there is “no difference,” and
Alternative hypothesis (H1), it states that there is “the difference in population”.
Assuming that average clicks on blogs is 2000 per day before marketing campaign, you believe that population has now higher average clicks due to this campaign, such that
Here the observed mean is >2000, and expected population mean is 2000. Next step would be to run test statistics that compare the value of both means.
(Related blog: What is Confusion Matrix?)
What is p-value?
The calculated value of the test statistic is converted into a p-value that explains whether the outcome is statistically significant or not.
For a brief, a p-value is the probability that the outcomes, from sample data, have occurred by chance, and varies from 0% to 100%. In general, these values are written in decimal format, like a p-value of 5% is written as 0.05.
Lower p-values are considered to be favorable, as they indicate that data didn’t happen by chance.
For example, if p-value is 0.01, it means that there is 1% probability that, from an event, the results have appeared by chance. However, a p-value of 0.05 is ideally acceptable, signifying that data is valid.
Here, the test statistic is a numerical summary of the data which is compared to what would be expected under null hypothesis.
It can take many forms such as t-test (usually used when the dataset is small) or z-test etc (preferred when the dataset is large), or ANOVA test, etc.
Level of significance is the amount of some percentage that is required to reject a null hypothesis when it is true, it is denoted by 𝝰 (alpha). In general, alpha is taken as 1%, 5% and 10%.
Confidence level: (1-𝝰) is accounted as confidence level in which null hypothesis exists when it is true.
For instance, assuming the level of significance as 0.05, then smaller the p-value (generally p≤ 0.05), rejecting the null hypothesis. As this is a substantial confirmation against the null hypothesis that proves it is invalid.
Also, if the p-value is greater than 0.05, accepting the null hypothesis. As this gives evidence that alternate hypothesis is weak therefore null hypothesis can be accepted.
(Suggested blog: Mean, median, & mode)
Significance of p-value
The p-value is only a piece of information that signifies the null hypothesis is valid or not.
Ideally, following rules are used in determining whether to support or reject the null hypothesis;
If p > 0.10 : the observed difference is “not significant”
If p ≤ 0.10 : the observed difference is “marginally significant”
If p ≤ 0.05 : the observed difference is “significant”
If p ≤ 0.01 : the observed difference is “highly significant.”
(Must read: What is Precision, Recall & F1 Score in Statistics?)
At the level of significance as 0.05, a one-tailed test allows the alpha to test the statistical significance in one single direction of interest, this simply implies that alpha = 0.05 is at the one tail of distribution of test statistics.
A test is one-tailed when the alternative hypothesis is stated in terms of “less than” or “greater than”, but not both. A direction must be selected before testing.
It tells the effect of changes in one direction only, not in another direction.
One- tailed test can be performed in two forms, i.e.,
Left tailed test:
It is used when
Left tailed test
Right tailed test:
It is used when;
Right tailed test
Two tailed Test
While taking the significance level as 0.05, a two-tailed test allows half of the alpha level to test statistical significance at one single direction and half alpha level in another direction such that significance level of 0.025 in each tail of the distribution of test statistics.
Two tailed test
In two tailed tests, we test the hypothesis when the alternate hypothesis is not in the form of greater than or less than. When an alternate hypothesis is defined as there is difference in values (such as means of the sample), or observed value is not equal to the expected value.
Where a specific direction needs not to be defined before testing, a two-tailed test also takes into consideration the chances of both a positive and a negative effect.
(Suggested blog: Conditional Probability)
What is Z-test?
Z-test is the statistical test, used to analyze whether two population means are different or not when the variances are known and the sample size is large.
This test statistic is assumed to have a normal distribution, and standard deviation must be known to perform an accurate z-test.
A z-statistic, or z-score, is a number representing the value’s relationship to the mean of a group of values, it is measured with population parameters such as population standard deviation and used to validate a hypothesis.
For example, the null hypothesis is “sample mean is the same as the population mean”, and alternative hypothesis is “the sample mean is not the same as the population mean”.
(Also check: Importance of Statistics and Probability in Data Science)
The z-statistics refers to the statistics computed for testing hypotheses, such that,
Given: From normally distributed population, a random sample of size n is selected with population mean μ and variance σ2, and
A sample mean X with sample size is greater than 30.
The above formula is used for one sample z-test, if you want to run two sample z-test, the formula for z-statistic is
(Read blog: Data Types in Statistics)
What is T-test?
In order to know how significant the difference between two groups are, T-test is used, basically it tells that difference (measured in means) between two separate groups could have occurred by chance.
This test assumes to have a normal distribution while based on t-distribution, and population parameters such as mean, or standard deviation are unknown.
The ratio between the difference between two groups and the difference within the group is known as T-score. Greater is the t-score, more is the difference between groups, and smaller is the t-score, more similarities are there among groups.
For example, a t-score value of 2 indicates that the groups are two times as different from each other as they are with each other.
(Must read: What is A/B Testing?)
Also, after running t-test, if the larger t-value is obtained, it is highly likely that the outcomes are more repeatable, such that
Mainly, there are three types of t-test:
An Independent Sample t-test, compare the means for two groups.
A Paired Sample t-test, compare means from the same group but at different times, such as six months apart.
A One Sample t-test, test a mean of a group against the known mean.
One sample T-test
The t-statistics refers to the statistics computed for hypothesis testing when
Population variance is unknown with sample size is smaller than 30.
Sample standard deviation is used at place of population standard deviation, and,
The sample distribution must either be normal or approximately normal.
T-test vs Z-test
It is certainly a tricky choice that a particular test statistics would be selected in what conditons, in the below diagram, a comparison is demonstrated between z-test and t-test relying on specific conditions.
Comparing T-test and Z-test
As the sample size differs from analysis to analysis, a suitable test for hypothesis testing can be adopted for any sample size. For example, z-test is used for it when sample size is large, generally n >30.
Whereas t-test is used for hypothesis testing when sample size is small, usually n < 30 where n is used to quantify the sample size.
The t-test is the statistical test that can be deployed to measure and analyze whether the means of two different populations are different or not when the standard deviation is not known.
The z-test is the parametric test, implemented to determine if the means of two different datasets are different from each other, when the standard deviation is known.
Types of distribution
Both t-test and z-test employ the different use of distribution to correlate values and make conclusions in terms of hypothesis testing.
Notably, t-test is based on the Student’s t-distribution, and the z-test counts on Normal Distribution.
(Related blog: What is Statistics?)
Implementing both tests in testing of hypothesis, population variance is significant in obtaining the t-score and z-score.
While the population variance in the z-test is known, it is unknown in the t-test.
Some major assumptions are considered while conducting either t-test or z-test.
In a t-test,
All data points are assumed to be not dependent.
Samples values are taken and recorded accurately.
Work on smaller sample size, n should not exceed thirty but also shouldn't be less than five.
In the z-test,
All data points are independent,
Sample size is assumed to be large, n should have exceeded thirty.
Normal distribution for z with mean zero and variance as one.
The t-test and z-test are the substantive tests in determining the significance difference between sample and population. While the formulas are similar, the selection of a particular test relies on sample size and the standard deviation of population.
From the above discussion, we can conclude that t-test and z-test are relatively similar, but their applicability is different such as the fundamental difference is that the t-test is applicable when sample size is less than 30 units, and z-test is practically conducted when size of the sample crosses the 30 units.
(Must read: Clustering Methods and Applications)
Similarly, there are other essential differences as well which have been seen in the blog. We hope this made a clear understanding of the differences between the both z-test and t-test.