The aim of this blog is to give you some ideas on how to use activation functions and loss functions in different scenarios. I assume you already have a fair idea of what activation functions and loss functions are.
Choosing an activation function and a loss function depends directly on the output you want to predict. A predictive model can face different cases and produce different kinds of output. Before I introduce these cases, let's briefly review the activation function and the loss function.
The activation function decides how strongly a neuron fires for a given input and converts the neuron's linear input into a non-linear output. If you are not familiar with the different activation functions, I would recommend reading an in-depth explanation of the different types here.
A loss function helps you measure how well your model predicts, that is, how far its predictions are from the true values. It computes the error for every training example. You can read more about loss functions and how to reduce the loss here.
It is said that the goal decides how the performance of a business is validated. Similarly, the output you want decides which loss function and activation function to use.
Let’s see the different cases:
Consider predicting house prices from different features of the house. The final layer of the neural network, the output layer, consists of only one neuron that returns a numeric value. To evaluate the model, the predicted values are compared with the true numeric values.
Activation Function to be used in such cases,
Linear Activation - This activation function returns its input unchanged, so the output can be any numeric value, which is what this case requires.
Linear Activation Function
ReLU Activation - This activation function returns zero for negative inputs and the input itself otherwise, so its outputs are never negative.
ReLU Activation Function
Loss function to be used in such cases,
Mean Squared Error (MSE) - This loss function computes the average squared difference between the true values and the predicted values.
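To make the regression case concrete, here is a minimal sketch in plain Python. The raw outputs and prices are made-up numbers for illustration, and the function names are my own, not taken from any library.

```python
def linear(z):
    """Linear (identity) activation: returns the input unchanged."""
    return z


def relu(z):
    """ReLU activation: negative inputs become zero."""
    return max(0.0, z)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical raw outputs of the single output neuron for three houses
raw_outputs = [250.0, -10.0, 410.0]
true_prices = [240.0, 0.0, 400.0]

linear_preds = [linear(z) for z in raw_outputs]  # may contain negative values
relu_preds = [relu(z) for z in raw_outputs]      # negative values clipped to 0

mse_linear = mean_squared_error(true_prices, linear_preds)
mse_relu = mean_squared_error(true_prices, relu_preds)
print(mse_linear, mse_relu)
```

With ReLU the negative raw output is clipped to zero, which only makes sense when the target, like a price, can never be negative.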
Consider a case where the aim is to predict whether a loan applicant will default or not. In this type of case, the output layer consists of only one neuron that produces a value between 0 and 1, which can be read as a probability score.
To evaluate the prediction, it is again compared with the true label. The true value is 1 if the data point belongs to the class, otherwise it is 0.
Activation Function to be used in such cases,
Sigmoid Activation - This activation function gives an output between 0 and 1.
Sigmoid Activation Function
Loss function to be used in such cases,
Binary Cross Entropy - Binary cross-entropy measures the difference between two probability distributions. The model predicts the distribution (p, 1-p), and binary cross-entropy compares it with the true distribution.
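Here is a rough sketch of the binary case in plain Python. The raw scores and labels are invented for illustration, and the small `eps` clamp is my own addition to avoid taking the log of zero.

```python
import math


def sigmoid(z):
    """Squash a raw score into the open interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for large negative z
    return e / (1.0 + e)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 and 1
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical raw outputs of the single output neuron for three applicants
raw_scores = [2.0, -1.0, 0.0]
labels = [1, 0, 1]  # 1 = default, 0 = no default

probs = [sigmoid(z) for z in raw_scores]
loss = binary_cross_entropy(labels, probs)
print(probs, loss)
```

The loss is small when the predicted probability is close to the true label and grows quickly when the model is confidently wrong.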
Consider a case where you are predicting the name of a fruit from among 5 different fruits. In this case, the output layer consists of one neuron for every class, and each neuron returns a value between 0 and 1. Together the outputs form a probability distribution, so they add up to 1.
Each output is checked against its respective true value to evaluate the prediction. The true values are one-hot encoded, which means the value is 1 for the correct class and 0 for all the others.
Activation Function to be used in such cases,
Softmax Activation - This activation function gives outputs between 0 and 1 that are probability scores and that add up to 1.
Softmax Activation Function
Loss function to be used in such cases,
Cross-Entropy - It measures the difference between two probability distributions.
For example, with three classes the model predicts the distribution (p1, p2, p3), where p1+p2+p3=1. This is compared with the true distribution using cross-entropy.
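A minimal sketch of the multiclass case in plain Python follows. The fruit names and raw scores are hypothetical; only the number of classes (5) comes from the example above.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(z - m) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot true distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


fruits = ["apple", "banana", "mango", "orange", "grape"]  # hypothetical class names
raw_scores = [2.0, 1.0, 0.1, -1.0, 0.5]                   # one output neuron per class
true_one_hot = [1, 0, 0, 0, 0]                            # the true fruit is "apple"

probs = softmax(raw_scores)
loss = cross_entropy(true_one_hot, probs)
predicted = fruits[probs.index(max(probs))]
print(predicted, loss)
```

Because the true distribution is one-hot, the loss reduces to the negative log of the probability assigned to the correct class.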
Consider the case of predicting the different objects in an image that contains multiple objects. This is termed multi-label classification, because more than one label can be correct at the same time. In this type of case, the output layer consists of one neuron for every label, and each neuron produces a value between 0 and 1 that can be read as the probability score for that label.
To evaluate the prediction, each output is again compared with its true label. The true value is 1 if the data point belongs to that class, otherwise it is 0.
Activation Function to be used in such cases,
Sigmoid Activation - This activation function gives an output between 0 and 1. It is applied to each output neuron independently, so the outputs do not have to add up to 1.
Sigmoid Activation Function
Loss function to be used in such cases,
Binary Cross Entropy - Binary cross-entropy measures the difference between two probability distributions. For each label, the model predicts the distribution (p, 1-p), and binary cross-entropy compares it with the true distribution for that label.
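When an image can contain several objects at once, each label is usually scored on its own. The sketch below assumes one output score per label, a sigmoid on each score, and binary cross-entropy averaged over the labels. The object names, scores, and the 0.5 threshold are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Squash a raw score into the open interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_bce(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy computed per label, then averaged over the labels."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


objects = ["cat", "dog", "car"]  # hypothetical labels
raw_scores = [3.0, -2.0, 1.5]    # one raw output score per label
targets = [1, 0, 1]              # the image contains a cat and a car

probs = [sigmoid(z) for z in raw_scores]  # independent scores, need not sum to 1
detected = [name for name, p in zip(objects, probs) if p >= 0.5]
loss = multilabel_bce(targets, probs)
print(detected, loss)
```

Unlike softmax, the sigmoid outputs here are not forced to add up to 1, which is what lets the model report several objects in the same image.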
The table below summarizes which activation and loss function to use for each problem statement and desired output.
Activation and Loss Function
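The recommendations from the cases discussed in this blog can also be kept as a small lookup in code. This is only a convenience sketch; the dictionary keys and the `recommend` helper are names I made up for illustration.

```python
# Problem type -> (output activation, loss function), following the cases above
RECOMMENDATIONS = {
    "regression": ("linear or ReLU", "mean squared error"),
    "binary classification": ("sigmoid", "binary cross-entropy"),
    "multiclass classification": ("softmax", "cross-entropy"),
    "multi-label classification": ("sigmoid", "binary cross-entropy"),
}


def recommend(problem_type):
    """Return the (activation, loss) pair for a problem type, ignoring case."""
    key = problem_type.strip().lower()
    if key not in RECOMMENDATIONS:
        raise ValueError(f"Unknown problem type: {problem_type!r}")
    return RECOMMENDATIONS[key]


activation, loss_name = recommend("Binary Classification")
print(activation, loss_name)
```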
You can also check the documentation where different activation functions are explained with code here. There are also other loss functions that can be used to compute the loss; you can refer to them here.
It is very important to check which activation function and loss function to use in different problem scenarios in machine learning or deep learning models. People often get confused about the usage of these functions. I have tried to give you an idea of when to use which type of activation and loss function.
I have discussed different cases where the output is numerical, binary, single-label or multi-label, and the corresponding activation and loss functions to use.