The main idea behind machine learning is to give machines abilities like those of the human brain, and neural networks are a boon to that goal. Neural networks get their name because they are inspired by the way the neurons in the human brain work. So, how do the neurons in the human brain work? And how has this structure shaped neural networks and deep learning? Let’s discuss all of this.
Overview of Neural Network
Structure of Neural Network
Neural Network Implementation Using Keras Functional API
Application of Neural Network
Let’s see the basic structure of neurons working in the human brain.
Working of neuron
Dendrites: receive input signals and pass them to the nucleus of the neuron.
Nucleus: processes the information.
Axon: carries the result onward, acting as the output layer.
Basic neural network structure
The inputs x1, x2, x3 carry information and are paired with the weights w1, w2, w3. Together they form the input layer, and their values are stored in matrix form and passed on to the hidden layers. Further along in the structure, ∑ denotes the weighted sum of the inputs, which is handed to the activation function. The activation function acts as a decision-maker and allows only certain information to fire forward through the network towards the output layer.
Input -> matrix multiplication -> activation -> output
Here, the activation function decides which features or information to fire forward towards the output in order to minimize the error. Data scientists and machine learning engineers commonly choose the sigmoid function or softmax.
Here is the sigmoid function:
$$f(x) = \frac{1}{1 + e^{-x}}$$
Other widely used activation functions include tanh and ReLU.
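The activation functions named in this post can each be written in a line or two of NumPy. The sketch below is only an illustration: the function names and the sample scores are assumptions made for this example, not part of any library API.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: maps any real number into the range (-1, 1)."""
    return np.tanh(x)

def relu(x):
    """Rectified linear unit: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, x)

def softmax(x):
    """Turns a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)  # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical scores, chosen only for illustration.
scores = np.array([-2.0, 0.0, 2.0])
print("sigmoid:", sigmoid(scores))
print("tanh:   ", tanh(scores))
print("relu:   ", relu(scores))
print("softmax:", softmax(scores))
```

Sigmoid, tanh, and ReLU act on each value independently, while softmax looks at the whole vector, which is why it is usually kept for the output layer of a classifier.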
Now we shall visualize
Neural network architecture
The figure above shows the structure of a neural network. First comes the input layer, which receives the dataset (labelled or unlabelled). Next come the hidden layers, which extract informative features from the data. We can use as many hidden layers as we want, but we must choose the number wisely: too many hidden layers can lead to overfitting, which hurts the accuracy of the model on unseen data.
Lastly, the output layer produces the results. To improve accuracy, we train the network on the data again and again until it has learned the features it needs. Throughout the network, information is stored in matrix form, together with the weights and biases associated with it.
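The flow from the input layer through a hidden layer to the output layer can be sketched with plain NumPy matrices. This is a minimal illustration under stated assumptions: the layer sizes (3 inputs, 4 hidden units, 2 outputs), the random weights, and the helper name `dense` are all made up for the example, and no training takes place.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b, activation=None):
    """One fully connected layer: weighted sum plus bias, then an optional activation."""
    z = x @ W + b
    return activation(z) if activation is not None else z

# Hypothetical sizes: 3 input features, 4 hidden units, 2 outputs.
X = rng.normal(size=(5, 3))                     # a batch of 5 samples
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)   # hidden -> output

hidden = dense(X, W1, b1, activation=np.tanh)   # features extracted by the hidden layer
scores = dense(hidden, W2, b2)                  # raw output scores, one row per sample

print(hidden.shape, scores.shape)
```

Each layer is just a weight matrix and a bias vector, so adding more hidden layers means chaining more `dense` calls.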
Computing the loss and reducing it is one of the most important tasks in training a neural network. We reduce the loss function with an intuitive algorithm known as gradient descent, which measures the error and repeatedly adjusts the parameters to make it smaller. In mathematical terms, gradient descent is guaranteed to reach the global minimum when the function is convex; for non-convex functions it may settle in a local minimum.
1. Take a random θ.
2. Update θ in the direction of decreasing gradient (slope):
$$\theta = \theta - \eta \, \frac{\partial f(\theta)}{\partial \theta}$$
Here η is the learning rate. We repeat step 2 until we reach a local minimum.
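The update rule above can be tried on a simple convex function. The sketch below assumes f(θ) = (θ - 3)², whose minimum is at θ = 3. The function, the starting point, the learning rate, and the number of steps are all illustrative choices, and the start is fixed rather than random so that the run is repeatable.

```python
def f(theta):
    """A simple convex function with its minimum at theta = 3."""
    return (theta - 3.0) ** 2

def grad_f(theta):
    """Derivative of f with respect to theta."""
    return 2.0 * (theta - 3.0)

def gradient_descent(theta0, eta=0.1, steps=100):
    """Repeat the update theta = theta - eta * df/dtheta a fixed number of times."""
    theta = theta0
    for _ in range(steps):
        theta = theta - eta * grad_f(theta)
    return theta

theta_min = gradient_descent(theta0=-5.0)
print(theta_min, f(theta_min))
```

With a learning rate of 0.1, each step shrinks the distance to the minimum by a constant factor, so θ ends up very close to 3.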
Our model is like a child learning from mistakes: it makes errors and needs someone to correct it each time. This correction is handled by an algorithm known as backpropagation.
Backpropagation works together with gradient descent. It moves backward through the network, working out how much each weight contributed to the error and adjusting the weights accordingly, and this retraining continues until the model gives the best results with the least possible error. The algorithm was popularized by David Rumelhart and his co-authors Geoffrey Hinton and Ronald Williams in a famous 1986 paper, although the underlying idea had been introduced as far back as 1970.
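To make the backward pass concrete, here is a minimal NumPy sketch of backpropagation for a network with one hidden layer. The assumptions are all illustrative: sigmoid activations, a squared-error loss, no bias terms, the XOR truth table as toy data, and a learning rate of 0.1. It is not how Keras implements training internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, W2):
    """Forward pass: input -> hidden layer -> output."""
    h = sigmoid(X @ W1)
    y_hat = sigmoid(h @ W2)
    return h, y_hat

def loss(X, y, W1, W2):
    """Half the sum of squared errors over all samples."""
    _, y_hat = forward(X, W1, W2)
    return 0.5 * np.sum((y_hat - y) ** 2)

def backward(X, y, W1, W2):
    """Backward pass: apply the chain rule layer by layer, from output to input."""
    h, y_hat = forward(X, W1, W2)
    delta_out = (y_hat - y) * y_hat * (1.0 - y_hat)      # error signal at the output
    grad_W2 = h.T @ delta_out
    delta_hidden = (delta_out @ W2.T) * h * (1.0 - h)    # error pushed back to the hidden layer
    grad_W1 = X.T @ delta_hidden
    return grad_W1, grad_W2

# Toy data (XOR truth table) and random starting weights, for illustration only.
rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
W1 = rng.normal(size=(2, 3))
W2 = rng.normal(size=(3, 1))

initial_loss = loss(X, y, W1, W2)
eta = 0.1
for _ in range(1000):
    grad_W1, grad_W2 = backward(X, y, W1, W2)
    W1 -= eta * grad_W1   # gradient descent step on each weight matrix
    W2 -= eta * grad_W2
final_loss = loss(X, y, W1, W2)

print(initial_loss, final_loss)
```

The backward pass reuses the activations computed in the forward pass, which is what makes backpropagation cheap compared with perturbing every weight separately.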
The architecture of the Neural Network
In the visualization above, two images are given as input. The model processes them and learns their features, after which it can classify both images on the basis of those features, as we can see in the output layer.
Keras is an initiative to reduce the complexity of implementing deep learning and machine learning algorithms. It offers two main APIs for building deep learning models such as neural networks: the Functional API and the Sequential API.
A neural network is often pictured as a linear chain of layers running from input to output. The Keras Functional API is more general: it treats the model as a directed acyclic graph (DAG) of layers, so the output of one layer can be connected to any later layer, just as edges connect the nodes of a DAG.
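The idea of building a model as a graph of layers can be illustrated without Keras at all. The toy below is not the Keras API: `Node`, `layer`, and the two example operations are hypothetical names invented for this sketch. It only shows how calling a layer on the output of another layer records an edge in a DAG.

```python
class Node:
    """A value in the graph: remembers which function and parent nodes produce it."""

    def __init__(self, fn=None, parents=(), name="input"):
        self.fn = fn
        self.parents = parents
        self.name = name

    def evaluate(self, feed):
        # Input nodes read their value from the feed dictionary.
        if self.fn is None:
            return feed[self.name]
        # Other nodes first evaluate their parents, then apply their function.
        return self.fn(*(p.evaluate(feed) for p in self.parents))

def layer(fn, name):
    """Return a callable that, when applied to nodes, adds a new node to the graph."""
    return lambda *parents: Node(fn, parents, name)

a = Node(name="a")
b = Node(name="b")
double = layer(lambda v: 2 * v, "double")
add = layer(lambda u, v: u + v, "add")

out = add(double(a), b)                  # two inputs merge into a single output
result = out.evaluate({"a": 3, "b": 4})
print(result)
```

A strictly linear chain could not express the merge of `a` and `b`, which is the kind of structure a graph-based API makes possible.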
The architecture of Keras Functional API
Firstly, we have to import all the required libraries.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Create Inputs (For Demo)
inputs = keras.Input(shape=(784,))
img_inputs = keras.Input(shape=(32, 32, 3))
inputs.shape
Output: TensorShape([None, 784])
Here we pass inputs to a layer named dense. The variable x holds the output of each layer and is fed into the next one.
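# First hidden layer: 64 units with ReLU, applied to the input tensor
dense = layers.Dense(64, activation="relu")
x = dense(inputs)
# Second hidden layer, called directly on the previous output
x = layers.Dense(64, activation="relu")(x)
# Output layer: 10 units with no activation (raw scores)
outputs = layers.Dense(10)(x)
# Tie the inputs and outputs together into a model
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
model.summary()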
The model summary of neural network
Just like the structure we discussed, we got the same summary of the model.
Another powerful Keras API for layered models is the Sequential API, which suits models built as a plain stack of layers, as many neural networks are. It is less flexible than the Functional API, however: a Sequential model cannot share layers or take multiple inputs and produce multiple outputs, whereas a Functional model can. Which one to use depends on the structure you want, but most simple deep learning models can be built with either. To back up that claim, we will now implement a neural network using the Keras Sequential API.
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from sklearn.model_selection import train_test_split
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Activation
from keras.utils import to_categorical  # replaces the older np_utils helper module
Here we import every library we need, including train_test_split from scikit-learn and Keras layers such as Conv2D, MaxPool2D, and Activation.
dataset = read_csv('../datasets/fashion-mnist-test.csv')
dataset.head()
The visualization of dataset
We read the dataset with the pandas library and preview its first rows. The dataset contains 10,000 rows and 785 columns: one label column followed by 784 pixel values.
X, y = dataset.iloc[:, 1:], dataset.iloc[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
y_train.shape, y_test.shape
Output: ((8000, 10), (2000, 10))
In this step we defined X and y, and then split the data into training and test sets (80% and 20%).
Here to_categorical converts a vector of class integers into a binary class matrix (one-hot encoding) for use with categorical cross-entropy.
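What the one-hot conversion does to the labels can be reproduced in a few lines of NumPy. The helper below is a hypothetical stand-in written for illustration, not the Keras function itself, and the three sample labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical labels for three samples out of 10 possible classes.
y = [0, 2, 9]
y_encoded = one_hot(y, num_classes=10)
print(y_encoded.shape)
print(y_encoded[1])
```

Taking `argmax` along each row recovers the original integer labels, which is how predictions are usually decoded afterwards.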
# A DataFrame has no reshape method, so convert to a NumPy array first
X_train = X_train.to_numpy().reshape((-1, 28, 28, 1))
X_test = X_test.to_numpy().reshape((-1, 28, 28, 1))
X_train.shape, X_test.shape, y_train.shape, y_test.shape
Output: ((8000, 28, 28, 1), (2000, 28, 28, 1), (8000, 10), (2000, 10))
We reshape X_train and X_test into 28 x 28 x 1 images, the layout that layers such as Conv2D expect, and we can observe the change in the shape of our data.
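The reshape only rearranges how the same 784 values per sample are indexed. The small sketch below uses made-up data (consecutive integers instead of real pixel values) so that the mapping is easy to follow.

```python
import numpy as np

# Two hypothetical "images", each stored as a flat row of 784 values.
flat = np.arange(2 * 784).reshape(2, 784)

# -1 lets NumPy infer the number of samples; each row becomes a 28 x 28 x 1 image.
images = flat.reshape((-1, 28, 28, 1))

# Flattening again gives back exactly the original rows.
restored = images.reshape((-1, 784))

print(images.shape)
print(images[0, 1, 0, 0])   # first value of the second pixel row of image 0
```

Because NumPy fills arrays row by row, flat index 28 of a sample lands at row 1, column 0 of the image.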
model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))  # unroll each 28x28x1 image into 784 values
model.add(Dense(256))
model.add(Activation('sigmoid'))
model.add(Dense(64))
model.add(Activation('tanh'))
model.add(Dense(10))
model.add(Activation('softmax'))
We initialize the model and add the layers one by one. The Flatten layer turns each reshaped 28 x 28 x 1 image back into the 784 values that Dense layers expect. The Dense layers with 256 and 64 units are the hidden layers, and the final Dense layer with 10 units is the output layer. Each layer uses a different activation function: sigmoid, tanh, and softmax, with softmax turning the 10 outputs into class probabilities.
The model summary of layers in a neural network
We can see the output shape at every layer with the number of parameters.
There are no non-trainable parameters, which means every parameter is updated during training.
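The parameter counts reported in a model summary can be checked by hand. A fully connected layer has one weight per input-unit pair plus one bias per unit. The sketch below applies that rule to the layer sizes used in this model (784 inputs, then 256, 64, and 10 units); `dense_params` is a hypothetical helper name.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 256, 64, 10]
per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)   # [200960, 16448, 650]
print(total)       # 218058
```

Activation layers add no parameters of their own, so the total comes entirely from the three Dense layers.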
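# Configure the loss function, optimizer, and metric used for training
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
Compiling the model does not train it yet. It sets categorical cross-entropy as the loss, stochastic gradient descent (SGD) as the optimizer, and accuracy as the metric to report. The weights were already initialized when the layers were added to the model.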
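The categorical cross-entropy loss named in the compile step compares one-hot labels with predicted probabilities. Below is a minimal NumPy sketch of the formula, averaged over samples. The function name, the clipping constant, and the sample predictions are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return per_sample.mean()

# Two hypothetical samples with three classes each.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

Only the probability assigned to the true class contributes to each sample's loss, so confident correct predictions push the loss towards zero.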
hist = model.fit(
    X_train, y_train,
    shuffle=True,
    epochs=40,
    batch_size=128,
    validation_data=(X_test, y_test),
)
We fit the model to the training data for 40 epochs with a batch size of 128.
An epoch is one complete pass over the training data, and the batch size is the number of samples processed together before the weights are updated.
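How epochs and batch size translate into weight updates can be worked out directly. The sketch below uses the numbers from this walkthrough (8,000 training samples, batch size 128, 40 epochs). The generator `iterate_minibatches` is a hypothetical helper that mimics shuffled batching and is not what Keras runs internally.

```python
import math
import numpy as np

n_samples, batch_size, epochs = 8000, 128, 40

# One weight update per batch; the last batch of an epoch may be smaller.
updates_per_epoch = math.ceil(n_samples / batch_size)
total_updates = updates_per_epoch * epochs

def iterate_minibatches(n, batch_size, rng):
    """Yield shuffled index arrays that together cover every sample once."""
    order = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(iterate_minibatches(n_samples, batch_size, rng))

print(updates_per_epoch, total_updates)
print(len(batches), len(batches[-1]))
```

Since 8,000 is not a multiple of 128, each epoch ends with a half-sized batch of 64 samples.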
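# Loss curves for the training and test sets
plt.figure(0)
plt.title("Loss")
plt.plot(hist.history['loss'], 'r', label='Training')
plt.plot(hist.history['val_loss'], 'b', label='Testing')
plt.legend()
plt.show()
# Accuracy curves; the history keys follow the metric name passed to compile()
# (older Keras releases used 'acc' and 'val_acc' instead)
plt.figure(1)
plt.title("Accuracy")
plt.plot(hist.history['accuracy'], 'r', label='Training')
plt.plot(hist.history['val_accuracy'], 'b', label='Testing')
plt.legend()
plt.show()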
The Comparison of Loss between Training and Test set
The Comparison of Accuracy between Training and Test set
Plotting the training history, we can see a slight difference between the loss on the training and test data, and likewise between the accuracy on the training and test data.
To Solve a Regression Problem - To predict an accurate continuous value, we can use a simple neural network.
For Clustering - If the given dataset is unlabelled, a neural network can form clusters to distinguish the classes.
Pattern Recognition - Feedback neural networks help in tasks such as pattern recognition.
Dimension Reduction - To understand our data and extract the most useful features from it, we need to reduce its dimension, which can be done with artificial neural networks.
Machine Translation - Most of us have used keyboards that translate from one language to another. This is machine translation, which can be achieved using neural networks.
Neural networks are the foundation of deep learning and help in performing a wide variety of tasks. We have learned the basics of how they work and worked through the code with the help of the Keras Functional and Sequential APIs. For more blogs, keep exploring and keep reading Analytics Steps.