What is AdaBoost Algorithm in Ensemble Learning?

  • Tanesh Balodi
  • Jul 17, 2021
  • Machine Learning
  • Python Programming

There is an old story in which a father gives his sons a bundle of four wooden sticks and asks each of them to break it. Every one of them fails. The father then hands each son a single stick from the bundle, and this time they break it easily. The lesson is that individuals may be weak on their own but become strong when they unite. This is what boosting is all about.

 

Boosting combines many weak learners (models that, on their own, classify the problem only slightly better than random guessing). Once they are combined, a majority vote decides which category the input falls in.

 

(Suggested blog: Machine learning algorithms)

 

Let’s consider an example. Suppose we need to classify whether a given image shows a horse or a donkey. We could look at various factors such as height, width, a long tail, and more. The problem is that none of these factors can tell perfectly whether the image shows a donkey or a horse. Therefore we consider all of these factors and take a majority vote; in short, we combine all the weak learners to form a strong predictor.


image1


Above are a few weak learners, each of which gives some output. If we now take a majority vote, we find that most of the weak learners say the input is probably a horse. This is the concept of boosting.
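
The majority vote described above can be sketched in a few lines of Python. The feature names and their individual guesses below are made up for illustration; they do not come from a real model.

```python
from collections import Counter

# Hypothetical guesses from weak learners, one per feature (illustrative values only).
weak_learner_votes = {
    "height": "horse",
    "width": "donkey",
    "long_tail": "horse",
}

def majority_vote(votes):
    """Return the label that the largest number of weak learners voted for."""
    counts = Counter(votes)
    label, _ = counts.most_common(1)[0]
    return label

prediction = majority_vote(weak_learner_votes.values())
print(prediction)
```

Two of the three hypothetical learners vote for the same label, so that label wins even though one learner disagrees.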

 

 

What is ensemble learning?

 

Ensemble learning is used to improve a machine learning model’s accuracy. To do so, it takes the decisions from several models and combines them to reach a better decision. Two common ways of combining them are max voting, which we discussed earlier, and averaging.

 

The average method is easy to implement. Suppose you have three prediction scores p1, p2, and p3 from different models, such as logistic regression, a decision tree, and K-nearest neighbours (KNN). To take the average, all we need to do is:

 

(p1 + p2 + p3) / 3
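
As a small sketch of averaging, assume each model outputs the probability that an input belongs to the positive class. The scores in p1, p2, and p3 below are invented numbers, not results from trained models, and the 0.5 threshold is an assumption for illustration.

```python
import numpy as np

# Hypothetical probabilities of the positive class for four inputs,
# one array per model (logistic regression, decision tree, KNN).
p1 = np.array([0.9, 0.3, 0.2, 0.7])
p2 = np.array([0.8, 0.6, 0.1, 0.4])
p3 = np.array([0.7, 0.3, 0.3, 0.7])

# Average the scores. The parentheses matter: without them,
# only p3 would be divided by 3.
average = (p1 + p2 + p3) / 3

# Turn the averaged score into a class label with a 0.5 threshold.
labels = (average >= 0.5).astype(int)
print(average)
print(labels)
```

For the second input, one model leans towards the positive class, but the average of the three scores stays below the threshold, so the ensemble predicts the negative class.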

 

Two of the more advanced techniques in ensemble learning are bagging and boosting. Let’s discuss both of them briefly:

 

  1. Bagging

 

In bagging (short for bootstrap aggregating), several sub-datasets are created from the original dataset, and each one serves as the training set for an individual model. Each sub-dataset is built by sampling rows from the original dataset with replacement, so it can be as large as the original even though some rows appear more than once and others are left out. This sampling process is known as bootstrapping.
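
Bootstrapping can be sketched with NumPy as drawing row indices with replacement. The ten-element array stands in for a dataset, and the helper name bootstrap_sample is hypothetical; both are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

data = np.arange(10)  # toy "dataset" of 10 rows

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: rows may repeat or be left out."""
    indices = rng.integers(0, len(rows), size=len(rows))
    return rows[indices]

# Three bootstrap samples, one per model in the ensemble.
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for s in samples:
    print(sorted(s.tolist()))
```

Each sample has the same size as the original, which is why sampling with replacement is used instead of cutting the data into disjoint pieces.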


Bagging in Ensemble Learning


The image above represents bagging. Here, three sub-datasets are drawn from the original dataset, and each one is fed to a model, most often a decision tree. Each model then makes its predictions, and at the end we combine all the predictions with max voting or averaging to get a strong prediction. The weak learners combine to form a strong learner, which helps to increase the accuracy of the model.
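
A minimal sketch of this pipeline, assuming scikit-learn is available, uses BaggingClassifier, which by default trains decision trees on bootstrap samples and combines their predictions. The synthetic dataset from make_classification is a stand-in, not data used in this article.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, used only for illustration.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three models, each trained on its own bootstrap sample of the training set.
# The default base model is a decision tree.
bagging = BaggingClassifier(n_estimators=3, random_state=0)
bagging.fit(X_train, y_train)

bagging_accuracy = bagging.score(X_test, y_test)
print(len(bagging.estimators_))  # number of trained models
print(bagging_accuracy)          # accuracy of the combined prediction
```

Three estimators mirror the three sub-datasets in the picture; in practice a larger number is common.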

 

(Also read: What is LightGBM Algorithm?)

 

 

  2. Boosting

 

Boosting takes a different route. You are now aware that in bagging each sub-dataset goes through its own model to predict an output; in boosting the models are trained one after another, and each new model tries to reduce the errors made by the previous ones. There are a few steps involved in boosting; let’s discuss them:

 

Step 1:

 

The base algorithm assigns weights to the data points of the dataset, initially the same weight for each, and a first model is trained on them so that its errors can be found.

 

Step 2:

 

Errors are calculated using the difference between predicted values and actual values.

 

Step 3:

 

The data points that were predicted incorrectly get their weights updated: all of these data points are assigned higher weights.

 

Step 4:

 

Another model is trained with the updated weights and this model tends to perform better than the previous one.

 

Step 5:

 

In this way every consecutive model learns from the mistakes of the previous one and gives a better result. At the end, the outputs of all the models are combined (AdaBoost uses a weighted vote) to form the final outcome.
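
The five steps can be sketched as a small from-scratch loop. This is a simplified version of discrete AdaBoost for labels -1 and +1, with scikit-learn decision stumps as the weak learners and a synthetic dataset; the variable names and the choice of 10 rounds are assumptions for illustration, not the exact procedure inside any library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data for illustration only.
X, y01 = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y01 == 1, 1, -1)  # discrete AdaBoost uses labels -1 / +1

n_rounds = 10
weights = np.full(len(y), 1 / len(y))  # Step 1: equal weight for every point
stumps, alphas = [], []

for _ in range(n_rounds):
    # Train a weak learner (depth-1 tree) on the weighted data.
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=weights)
    pred = stump.predict(X)

    # Step 2: weighted error = total weight of the misclassified points.
    miss = pred != y
    error = np.clip(weights[miss].sum(), 1e-10, 1 - 1e-10)

    # The model's say in the final vote grows as its error shrinks.
    alpha = 0.5 * np.log((1 - error) / error)

    # Step 3: raise the weights of misclassified points, lower the rest,
    # then renormalize so the weights sum to 1.
    weights = weights * np.exp(-alpha * y * pred)
    weights = weights / weights.sum()

    # Step 4: the next round trains on the updated weights.
    stumps.append(stump)
    alphas.append(alpha)

# Step 5: combine all weak learners; AdaBoost uses a weighted vote.
scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
final_pred = np.sign(scores)
train_accuracy = np.mean(final_pred == y)
print("training accuracy:", train_accuracy)
```

A stump with a low weighted error gets a large alpha and therefore more influence in the final vote, while the reweighting pushes each new stump towards the points that earlier stumps got wrong.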

 

Boosting Algorithm

 

The representation above shows the combination of all the weak learners with their updated data points, and at the end we can see the generalized outcome produced by combining them. It also shows that a single model may not perform well on the whole dataset, yet it can give better results when it concentrates on a portion of the dataset.

 

Some of the boosting algorithms are:

 

  1. AdaBoost

  2. XGBoost

  3. CatBoost

  4. Gradient Boosting

 

Let’s implement the AdaBoost algorithm. The principle remains the same as for boosting in ensemble learning.

 

 

AdaBoost Algorithm Python Implementation Using Sklearn

 

Our first step is to pre-process the dataset and split it into training and testing parts.

 

Step 1: Import the necessary libraries.

 

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

 

We have imported pandas to read the dataset, along with the NumPy and Matplotlib Python libraries.

 

ds = pd.read_csv('../datasets/titanic.csv')
print(ds.head())

 

This reads the dataset and prints its first few rows:


image2


This is what the dataset looks like.

 

In our next step, we are going to assign a numerical value to the sex of each passenger: ‘0’ for males and ‘1’ for females.

 

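# Keep only the columns we need; copy so that edits do not touch ds
df = ds[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Survived']].copy()

def quantify_sex(sex):
    # Map 'male' to 0 and 'female' to 1; leave any other value unchanged
    if sex.lower().startswith('m'):
        return 0
    elif sex.lower().startswith('f'):
        return 1
    else:
        return sex

df['Sex'] = df['Sex'].apply(quantify_sex)



In our next step, we assign the feature columns (everything except ‘Survived’) to a variable ‘X’ and the target column ‘Survived’ to ‘y’.

# Note: the classifier may reject missing values (for example in 'Age');
# if your copy of the file has any, fill or drop them before fitting.
X = df[[each for each in df.columns if each != "Survived"]]
y = df['Survived']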



Next, import train_test_split from sklearn to split the dataset into training and testing sets.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)



Now, import AdaBoostClassifier from sklearn and create the classifier.

 

from sklearn.ensemble import AdaBoostClassifier
ac = AdaBoostClassifier(random_state=90)

 

In the next step we train our model. As we can see, with the help of sklearn and the AdaBoost classifier, the process has become really easy.

 

ac.fit(X_train, y_train)



Now, it’s time to calculate the accuracy, or score, of our AdaBoost classifier.



ac.score(X_train, y_train)

 

image3

 

We can see that on the training dataset the accuracy is about 85%. Now we shall calculate the accuracy score on the testing dataset.

 

ac.score(X_test, y_test)

 

image4

 

On the testing dataset, we got an accuracy of 83 percent overall.
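
To look a little deeper than a single score, a fitted AdaBoostClassifier also exposes staged_score, which yields the accuracy after each boosting round, and estimator_weights_, which holds each weak learner's weight in the final vote. The sketch below uses a synthetic dataset from make_classification rather than the Titanic file, and n_estimators=20 is an arbitrary choice, so its numbers will differ from the ones above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; not the Titanic dataset used in this article.
X, y = make_classification(n_samples=500, n_features=6, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = AdaBoostClassifier(n_estimators=20, random_state=90)
model.fit(X_train, y_train)

# Test accuracy after each additional weak learner has been combined.
staged = list(model.staged_score(X_test, y_test))
for rounds, accuracy in enumerate(staged, start=1):
    if rounds % 5 == 0:
        print(f"rounds={rounds:2d}  test accuracy={accuracy:.3f}")

# Weight of the first few weak learners in the final vote.
print(model.estimator_weights_[:5])
```

If the staged accuracy on the test set stops improving while the training accuracy keeps rising, that is a hint that additional rounds are starting to overfit.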

 

 

Conclusion

 

The AdaBoost algorithm is an effective method to boost the performance of a machine learning model by combining weak learners to form a strong predictor.

 

(Must read: Machine Learning Tools)

 

However, if the weak learners are in fact too weak, it may lead to overfitting, and if we dig in some more, we will find that boosting is difficult to scale because the models must be trained one after another. Keeping these limitations in mind, AdaBoost is worth using when the goal is to increase the accuracy of the model.
