In statistical data analysis, an independent variable is studied for its effect on a dependent variable. For example, lack of exercise can lead to weight gain; here lack of exercise is the independent variable and weight gain is the dependent variable. Other factors may also influence the dependent variable, and these are known as “confounding variables”.
Whenever an association between an experiment and an outcome is observed, we need to consider whether the effect is actually due to the experiment or whether alternative explanations are possible.
Therefore, to validate a research study, factors that might distort the actual relationship or affect its interpretation need to be taken into account carefully.
In practice, this means evaluating how large a role bias plays and accounting for the statistical accuracy of the research study.
Bias introduces systematic sources of error that need to be dealt with, because the internal validity of a research study depends on the extent to which biases are accounted for and steps are taken to limit their impact.
Bias can be divided into three major categories:
Selection bias,
Information bias, and
Confounding.
In this blog, we focus on confounding: recognizing it and controlling its effects.
Confounding variables, or confounders, are often defined as variables that correlate (positively or negatively) with both the dependent variable and the independent variable. A confounder is an extraneous variable whose presence affects the variables being studied, so that the results do not reflect the actual relationship between the variables under study. (Research Paper, 2012)
Cause-effect relationship with confounders
These variables are called confounding because they confuse and complicate both the findings from the data and the inferences drawn from the study. As a result, it becomes unclear whether the experiment caused the effect or whether confounding variables influenced the conclusions.
For a variable to be confounding:
It must be associated with the independent variable of interest, and
It must be directly related to the outcome, or dependent variable.
More precisely, confounding refers to a mixing of effects: the effect of the experiment under study on a given outcome is mixed with the effect of an additional factor (or set of factors), which distorts the true relationship.
Confounding factors can mask an actual relationship, or falsely suggest a relationship between the experiment and the outcome when no true association exists between them.
Consider, for example, a study that asks whether coffee drinkers have more heart disease than non-coffee drinkers. The comparison can be influenced by another factor.
For instance, coffee drinkers might smoke more cigarettes than non-drinkers. Cigarette smoking then acts as a confounding variable when studying the association between drinking coffee and heart disease.
Example of a confounding variable
As a study outcome, it is found that the increase in heart disease is due to cigarette smoking, not coffee. More in-depth research has even observed that drinking coffee may have substantial benefits for heart health and for preventing mental disorders.
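The coffee example can be reproduced with a small simulation. The sketch below is only illustrative: the rates (half the population smokes, smokers drink coffee 80% of the time versus 20% for non-smokers, disease risk 30% for smokers versus 10% for non-smokers) and the helper names `simulate` and `risk` are assumptions made up for this example. Coffee has no effect on heart disease by construction, yet the crude comparison suggests one.

```python
import random

def simulate(n=40000, seed=0):
    """Simulate people for whom smoking drives both coffee drinking and heart disease."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        smoker = rng.random() < 0.5
        # Assumed rates: smokers are more likely to drink coffee.
        coffee = rng.random() < (0.8 if smoker else 0.2)
        # Disease risk depends on smoking only; coffee has no effect by construction.
        disease = rng.random() < (0.3 if smoker else 0.1)
        people.append((smoker, coffee, disease))
    return people

def risk(people, coffee, smoker=None):
    """Share with heart disease among people with the given coffee (and optional smoking) status."""
    group = [d for s, c, d in people if c == coffee and (smoker is None or s == smoker)]
    return sum(group) / len(group)

people = simulate()
crude_diff = risk(people, True) - risk(people, False)
diff_smokers = risk(people, True, smoker=True) - risk(people, False, smoker=True)
diff_nonsmokers = risk(people, True, smoker=False) - risk(people, False, smoker=False)

print(f"crude risk difference (coffee vs. no coffee): {crude_diff:.3f}")
print(f"risk difference among smokers:                {diff_smokers:.3f}")
print(f"risk difference among non-smokers:            {diff_nonsmokers:.3f}")
```

With these assumed rates, the crude difference is clearly positive, while the differences within smokers and within non-smokers stay close to zero. The apparent effect of coffee comes entirely from smoking.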
Effect of Confounding Variables
In research, the presence of confounding variables makes it difficult to establish a clear connection between the experiment and the outcomes unless appropriate methods are used to adjust for the confounders.
Therefore, to reduce confounding, a researcher has to ensure that all the confounding variables in the research study have been identified.
Understanding the confounding variables can give more accurate results.
Confounding variables can cause major research problems in the form of increased variance and research bias. These effects can lead to outcomes being either overestimated or underestimated.
Increased variance refers to an increase in the number of possible causative and independent (explanatory) variables in the research. It is very common in research that has no control variables, where changes in the target variable could be triggered by other variables.
For example, suppose a research study reveals that lack of exercise results in weight gain. Since there is no control variable, one cannot fully trust the research outcome, because other factors might affect the target variable (weight gain in this case).
For instance, confounding variables in this experiment could be genetic factors or eating habits, so there are many causative factors that can lead to misinterpreting the result.
A confounding bias simply means that a statistical estimate may overestimate or underestimate a research parameter. For example, if a survey design has evident confounding bias, it can result in high survey dropout rates and survey response bias, which again affect the research outcomes.
For an experiment, confounding may be positive or negative in nature, and either kind distorts the internal validity of the experiment.
A positive confounding bias occurs when the observed association is biased away from the null, so that it overestimates the effect.
A negative confounding bias occurs when the observed association is biased towards the null, so that it underestimates the effect. It can even lead to a failure to reject a null hypothesis that is actually false.
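The direction of the bias can be checked with simple arithmetic. The sketch below assumes made-up risks in which the exposure doubles the outcome risk at every level of the confounder, so the true risk ratio is 2.0. The `RISK` table and the function `crude_risk_ratio` are hypothetical names for this illustration. Only the share of each group that carries the confounder is varied.

```python
# Assumed outcome risks: (confounder present, exposed) -> probability of the outcome.
# Within each confounder level the exposure doubles the risk, so the true risk ratio is 2.0.
RISK = {
    (True, True): 0.8, (True, False): 0.4,
    (False, True): 0.2, (False, False): 0.1,
}
TRUE_RR = 2.0

def crude_risk_ratio(share_confounder_exposed, share_confounder_unexposed):
    """Risk ratio that ignores the confounder, given how common it is in each group."""
    def group_risk(exposed, share):
        return share * RISK[(True, exposed)] + (1 - share) * RISK[(False, exposed)]
    return group_risk(True, share_confounder_exposed) / group_risk(False, share_confounder_unexposed)

positive = crude_risk_ratio(0.8, 0.2)  # confounder more common among the exposed
negative = crude_risk_ratio(0.4, 0.6)  # confounder less common among the exposed
balanced = crude_risk_ratio(0.5, 0.5)  # confounder equally common in both groups

print(f"true risk ratio:                    {TRUE_RR:.2f}")
print(f"crude ratio, positive confounding:  {positive:.2f}")
print(f"crude ratio, negative confounding:  {negative:.2f}")
print(f"crude ratio, balanced confounder:   {balanced:.2f}")
```

Under these assumptions the crude ratio is about 4.25 when the confounder is concentrated among the exposed (away from the null) and about 1.57 when it is concentrated among the unexposed (towards the null). It equals the true value of 2.0 only when the confounder is balanced across groups.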
Irrelevant Research Outcomes
In a research study, a confounding variable can change the outcome of an experiment. As an external third factor, it can influence both the independent and dependent variables, and thus affect the outcomes of correlational or experimental research.
During a research experiment, a confounding variable is a third factor that can influence the experiment by producing wrong research results.
For example, it can produce a spurious correlation between the explanatory and target variables.
How to Control/Prevent Confounding Variables in Statistics?
There is a strong need to limit the effect of confounding variables, or confounders, during the research process. As a core step, a researcher can control or avoid these variables by identifying and measuring the correlated third factor in the research framework.
Four common tactics are used to reduce the effect of confounding variables: randomization, restriction, matching, and stratification.
Randomization involves distributing confounders randomly across the research data. It is used especially in machine learning, where subjects are assigned randomly to groups, which helps prevent selection bias in research tasks.
During experimental research, randomization enables researchers to control these variables by shifting the analysis from each individual case to an aggregate of observations, where statistical tools are used to interpret the insights.
A random sample is one in which each element has an equal chance of being selected. Since a perfectly random sample of observations is difficult to collect, a close approximation to randomization is used in practice.
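To see why random assignment helps, the sketch below compares a randomized study with a self-selected one. All numbers are assumptions for illustration: a true treatment effect of 1.0, a binary confounder that adds 2.0 to the outcome, and self-selection in which people with the confounder choose the treatment 80% of the time. The function name `estimated_effect` is hypothetical.

```python
import random

def estimated_effect(randomized, n=40000, seed=1):
    """Difference in mean outcome between treated and control groups."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        confounder = rng.random() < 0.5
        if randomized:
            # Coin flip: assignment is independent of the confounder.
            treated = rng.random() < 0.5
        else:
            # Self-selection: the confounder makes treatment more likely.
            treated = rng.random() < (0.8 if confounder else 0.2)
        # Assumed outcome model: true effect 1.0, confounder adds 2.0, plus noise.
        outcome = 1.0 * treated + 2.0 * confounder + rng.gauss(0, 1)
        (treated_outcomes if treated else control_outcomes).append(outcome)
    mean_treated = sum(treated_outcomes) / len(treated_outcomes)
    mean_control = sum(control_outcomes) / len(control_outcomes)
    return mean_treated - mean_control

randomized_estimate = estimated_effect(randomized=True)
observational_estimate = estimated_effect(randomized=False)

print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {observational_estimate:.2f}")
```

Under these assumptions the randomized estimate lands near the true effect of 1.0, because the confounder is spread evenly over both groups. The self-selected estimate is inflated by the confounder's contribution.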
Restriction constrains the study by controlling the confounding variables; if it is not carried out carefully, it can itself result in confounding bias. Essentially, it restricts the research data by introducing control variables that confine or bound the confounding variables, for example by including only subjects that share the same value of a confounder.
Under matching, confounding variables are spread evenly across the research data through a controlled research process, such as before-and-after experiments. Observations are made in pairs for each value of the independent variable, and the members of each pair are identical with respect to a possible confounding variable.
A case-control study works as a matching method: it matches cases with controls that have similar characteristics. A case-control study can have two or more controls for each case, as this provides more statistical accuracy in the research process.
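A minimal sketch of matching follows, assuming a single discrete confounder (a hypothetical `age_band` with five levels), a true treatment effect of 1.0, and simple 1:1 exact matching without replacement. The data-generating numbers and the function name `matched_effect` are invented for illustration; a real study would often match on several characteristics and may use more than one control per case.

```python
import random
from collections import defaultdict
from statistics import mean

def matched_effect(n=20000, seed=2):
    """Return (naive estimate, matched estimate) of the treatment effect."""
    rng = random.Random(seed)
    treated, controls = [], defaultdict(list)
    for _ in range(n):
        age_band = rng.randrange(5)  # hypothetical confounder with levels 0..4
        # Assumed selection: higher age bands are treated more often.
        is_treated = rng.random() < 0.1 + 0.15 * age_band
        # Assumed outcome model: true effect 1.0, age band adds 0.5 per level, plus noise.
        outcome = 1.0 * is_treated + 0.5 * age_band + rng.gauss(0, 1)
        if is_treated:
            treated.append((age_band, outcome))
        else:
            controls[age_band].append(outcome)

    # Naive comparison: all treated versus all controls, ignoring the age band.
    all_controls = [y for ys in controls.values() for y in ys]
    naive = mean(y for _, y in treated) - mean(all_controls)

    # 1:1 exact matching: pair each treated unit with an unused control from the same age band.
    differences = []
    for age_band, outcome in treated:
        pool = controls[age_band]
        if pool:  # treated units without an available control are left unmatched
            differences.append(outcome - pool.pop())
    return naive, mean(differences)

naive, matched = matched_effect()
print(f"naive estimate:   {naive:.2f}")
print(f"matched estimate: {matched:.2f}")
```

Because each pair shares the same age band, the confounder's contribution cancels within every pair, and the matched estimate stays close to the assumed true effect while the naive one is inflated.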
Stratification involves testing the confounder’s effect by distributing these factors uniformly across every level of the data analysis. By segmenting the data sample into smaller groups and examining the association between the dependent and independent variables within each group, this method assesses how the effect changes across groups and controls for the confounding variables.
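As a sketch of a stratified analysis, the example below uses made-up counts for the coffee and heart disease example, split by smoking status. It compares the crude risk ratio with the stratum-specific ratios and with a pooled Mantel-Haenszel risk ratio, which is one common way (not the only one) to combine strata. The counts and function names are assumptions for illustration.

```python
# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total).
# Hypothetical counts: "exposed" means coffee drinker, "case" means heart disease.
strata = {
    "smokers": (240, 800, 60, 200),
    "non-smokers": (20, 200, 80, 800),
}

def risk_ratio(a, n1, c, n0):
    """Risk among the exposed divided by risk among the unexposed."""
    return (a / n1) / (c / n0)

def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single table."""
    a, n1, c, n0 = (sum(values) for values in zip(*strata.values()))
    return risk_ratio(a, n1, c, n0)

def mantel_haenszel_risk_ratio(strata):
    """Pooled risk ratio that weights each stratum by its size."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return numerator / denominator

crude = crude_risk_ratio(strata)
pooled = mantel_haenszel_risk_ratio(strata)
per_stratum = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"crude risk ratio:           {crude:.2f}")
for name, value in per_stratum.items():
    print(f"risk ratio among {name}: {value:.2f}")
print(f"Mantel-Haenszel risk ratio: {pooled:.2f}")
```

With these counts the crude risk ratio is about 1.86, but the ratio is 1.0 within each stratum and the pooled ratio is 1.0 as well: once smoking is held fixed, coffee shows no association with heart disease.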
Besides these, multivariate analysis can be used. It depends on the researcher’s ability to identify and measure all the relevant third factors while conducting the research.
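One simple form of multivariate adjustment is to include the measured confounder in a linear regression. The sketch below avoids external libraries by using the partialling-out approach (regress both exposure and outcome on the confounder, then regress the residuals on each other), which gives the same exposure coefficient as a regression with both predictors. The simulated data, the coefficients (true exposure effect 0.5), and the helper names are assumptions for illustration.

```python
import random
from statistics import mean

def slope_intercept(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Part of y that is not explained by a linear fit on x."""
    slope, intercept = slope_intercept(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

rng = random.Random(3)
n = 5000
confounder = [rng.gauss(0, 1) for _ in range(n)]
# Assumed model: the confounder drives the exposure and the outcome.
exposure = [0.8 * c + rng.gauss(0, 1) for c in confounder]
outcome = [0.5 * x + 1.5 * c + rng.gauss(0, 1) for x, c in zip(exposure, confounder)]

# Unadjusted: regress the outcome on the exposure alone.
unadjusted, _ = slope_intercept(exposure, outcome)
# Adjusted: remove the confounder from both variables, then regress the residuals.
adjusted, _ = slope_intercept(residuals(confounder, exposure), residuals(confounder, outcome))

print(f"unadjusted exposure coefficient: {unadjusted:.2f}")
print(f"adjusted exposure coefficient:   {adjusted:.2f}")
```

Under these assumptions the unadjusted coefficient is well above the true effect, while the adjusted coefficient is close to 0.5. The adjustment only works for confounders that were actually measured, which is why identifying all relevant third factors matters.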
Another method is counterbalancing, in which half of the group is evaluated under the first condition and the other half under the second condition.
Confounding variables are variables that obscure the effect of other variables.
These variables can correlate positively or negatively with both the dependent and independent variables.
Confounders are extraneous variables whose existence affects the variables being studied, so that the outcome does not reflect the actual relationship between the variables under study.
Confounding results in invalid correlations, increased variance, and bias.
Confounding can be prevented/controlled by appropriate methods such as stratification, randomization, matching, restriction and multivariate analysis.