What are Autoencoders? How to Implement Convolutional Autoencoder Using Keras

  • Tanesh Balodi
  • Jul 30, 2021
  • Machine Learning
  • Python Programming

What if I told you there is a deep learning algorithm that learns to regenerate an image by itself and that, when tuned properly, produces a result almost indistinguishable from the original? One such algorithm is the autoencoder.


So an autoencoder tries to regenerate its original input. But how does it do that? It uses a neural network; let's see how.



Working of Autoencoder


An autoencoder has two main components:


  1. Encoder

  2. Decoder

Figure: Architecture of an Autoencoder

As the figure above shows, the input is fed to the encoder, which extracts information and compresses the image before passing it to the decoder, which produces the output. That is the general architecture of an autoencoder, but what exactly are the encoder and the decoder?


The encoder and decoder are both neural networks. The input is fed to a neural network that extracts useful features from it, but an autoencoder does not need every piece of information the network can offer; it needs precisely the features that help it regenerate the input. So let's look at how a neural network gets the job done.
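To make the encoder/decoder idea concrete, here is a minimal NumPy sketch of the data flow. It is not the model built later in this article: the sizes (784 inputs, a 32-value code), the random untrained weights, and the names `encode` and `decode` are assumptions chosen only to show how the shapes shrink and then grow back.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a flattened 28x28 image and a 32-value code.
input_dim, code_dim = 784, 32

# Untrained random weights; training would adjust them to reduce reconstruction error.
W_enc = rng.normal(scale=0.01, size=(input_dim, code_dim))
W_dec = rng.normal(scale=0.01, size=(code_dim, input_dim))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x):
    """Compress a batch of inputs into a smaller code."""
    return sigmoid(x @ W_enc)

def decode(code):
    """Expand the code back to the input size."""
    return sigmoid(code @ W_dec)

x = rng.random((5, input_dim))   # five fake "images" with values in [0, 1)
code = encode(x)
reconstruction = decode(code)

print(x.shape, code.shape, reconstruction.shape)
```

The code in the middle is the compressed representation; the reconstruction has the same shape as the input, which is what lets us compare the two during training.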

Figure: Architecture of a Simple Neural Network

Above is the architecture of a neural network with one input layer, two hidden layers, and finally an output layer. The input layer receives the raw data; the hidden layers are extremely important because they are responsible for extracting useful information and features from the input.


What we want from the hidden layers is a latent representation of the input from which the input can be regenerated. The main task is to decide how many hidden layers to use, since we want neither too little nor too much information from them.


Although autoencoders are mainly used for regeneration, they are also an effective dimensionality reduction technique. Because they are built from neural networks with non-linear activations, they can learn non-linear surfaces, which principal component analysis, a linear method, cannot. This can make them the better choice for dimensionality reduction on data with non-linear structure.
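For contrast, the sketch below shows principal component analysis written as a linear "encoder" and "decoder" in NumPy. The synthetic data, the helper names `pca_reconstruct` and `mse`, and the chosen values of `k` are assumptions for illustration. An autoencoder replaces these two matrix multiplications with non-linear networks.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 20))            # synthetic data: 200 samples, 20 features
mean = X.mean(axis=0)
Xc = X - mean                        # PCA works on centered data

# The principal directions are the right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

def pca_reconstruct(k):
    """Project onto the first k components, then map back to 20 features."""
    components = Vt[:k]              # shape (k, 20)
    code = Xc @ components.T         # linear "encoder": 20 -> k
    return code @ components + mean  # linear "decoder": k -> 20

def mse(a, b):
    return float(np.mean((a - b) ** 2))

errors = {k: mse(X, pca_reconstruct(k)) for k in (2, 10, 20)}
print(errors)
```

The reconstruction error falls as `k` grows and is essentially zero when all 20 components are kept, because nothing is discarded.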



Implementing Autoencoder using Keras


Our first step is to import libraries such as numpy, pandas, and matplotlib for numerical operations, reading datasets, and data visualization respectively.


We also need several classes from Keras to build the model: Conv2D (adds a convolutional layer), MaxPool2D (takes the maximum value within each pooling window), ZeroPadding2D (adds zero padding), UpSampling2D, Input, Dense, Activation, Flatten, and Reshape.

import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.models import Model
from keras.layers import (Conv2D, MaxPool2D, ZeroPadding2D, UpSampling2D,
                          Input, Dense, Activation, Flatten, Reshape)
from pandas import read_csv
from sklearn.model_selection import train_test_split

Next we read the MNIST dataset into a variable named dataset, scale the pixel values to the range 0 to 1, and split the data into a training set and a testing set. As the last pre-processing step, we reshape both sets into 28x28 single-channel images.

dataset = read_csv('../datasets/mnist_data/train.csv')
dataset = dataset.values
X, y = dataset[:, 1:]/255, dataset[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
X_train, X_test = X_train.reshape((-1,28,28,1)), X_test.reshape((-1,28,28,1))
X_train.shape, X_test.shape

((33600, 28, 28, 1), (8400, 28, 28, 1))
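If the CSV file is not at hand, the same pre-processing can be tried on synthetic data. The sketch below is a stand-in, not the article's pipeline: it assumes a 1000-row array with the label in column 0 and 784 pixel columns, and it replaces `train_test_split` with a manual NumPy shuffle so it runs without scikit-learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the CSV contents: column 0 is the label, the other 784 are pixels.
n_rows = 1000
labels = rng.integers(0, 10, size=(n_rows, 1))
pixels = rng.integers(0, 256, size=(n_rows, 784))
dataset = np.hstack([labels, pixels])

# Scale pixels from 0..255 down to 0..1 and separate the labels.
X, y = dataset[:, 1:] / 255, dataset[:, 0]

# Manual 80/20 split on shuffled row indices.
idx = rng.permutation(n_rows)
n_test = n_rows // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]
X_train, X_test = X[train_idx], X[test_idx]

# Each 784-value row becomes a 28x28 image with one channel.
X_train = X_train.reshape((-1, 28, 28, 1))
X_test = X_test.reshape((-1, 28, 28, 1))

print(X_train.shape, X_test.shape)
```

The trailing 1 in the shape is the channel dimension that the convolutional layers expect for grayscale images.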

Next we build the architecture discussed above with Keras, using the layers we imported. The code below is a direct representation of that architecture. We use relu activations for the convolutional layers and sigmoid activations for the dense layers.



e = 256  # size of the embedding (the compressed representation)

# Encoder: 28x28x1 image -> 256-value embedding
inp = Input(shape=(28,28,1))
conv1 = Conv2D(32, (3,3), activation='relu')(inp)
conv2 = Conv2D(16, (3,3), activation='relu')(conv1)
mp1   = MaxPool2D((2,2))(conv2)
conv3 = Conv2D(8, (3,3), activation='relu')(mp1)
flat = Flatten()(conv3)
emb = Dense(e, activation='sigmoid')(flat)

# Decoder: 256-value embedding -> 28x28x1 image
fc1 = Dense(800, activation='sigmoid')(emb)
res = Reshape((10,10,8))(fc1)
zp1 = ZeroPadding2D((1,1))(res)
conv4 = Conv2D(16, (3,3), padding='same', activation='relu')(zp1)
up1 = UpSampling2D((2,2))(conv4)
zp2 = ZeroPadding2D((1,1))(up1)
conv5 = Conv2D(32, (3,3), padding='same', activation='relu')(zp2)
zp3 = ZeroPadding2D((1,1))(conv5)
conv6 = Conv2D(1, (3,3), padding='same', activation='relu')(zp3)

cae = Model(inputs=inp, outputs=conv6)
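It can help to check by hand that the decoder really returns to 28x28. The plain-Python sketch below traces the spatial size through the layers of the model, assuming the default Keras behaviour that a 3x3 convolution without padding trims two pixels and that padding='same' leaves the size unchanged. The helper function names are made up for this illustration.

```python
def conv_valid(size, kernel=3):
    """A convolution without padding trims kernel - 1 pixels."""
    return size - kernel + 1

def max_pool(size, factor=2):
    return size // factor

def zero_pad(size, amount=1):
    """Padding adds `amount` pixels on each side."""
    return size + 2 * amount

def upsample(size, factor=2):
    return size * factor

trace = []
s = 28

# Encoder layers that change the spatial size.
for name, op in [("conv1", conv_valid), ("conv2", conv_valid),
                 ("mp1", max_pool), ("conv3", conv_valid)]:
    s = op(s)
    trace.append((name, s))

flat_units = s * s * 8   # conv3 has 8 filters, so Flatten yields s*s*8 values

# Decoder: conv4, conv5 and conv6 use padding='same', so only the
# zero-padding and upsampling layers change the size.
for name, op in [("zp1", zero_pad), ("up1", upsample),
                 ("zp2", zero_pad), ("zp3", zero_pad)]:
    s = op(s)
    trace.append((name, s))

print(trace)
print(flat_units)
```

The flattened size of 800 is why the first decoder layer is Dense(800) followed by Reshape((10,10,8)).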



Now we compile the model with the Adam optimizer and mean squared error as the loss.

cae.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
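The 'mse' loss compares every pixel of the reconstruction with the corresponding pixel of the input. As a rough illustration of what that number means, here is the same calculation on a tiny hand-made array; the `mse` helper and the sample values are assumptions for this sketch, not part of Keras.

```python
import numpy as np

def mse(original, reconstruction):
    """Mean of the squared per-pixel differences."""
    return float(np.mean((original - reconstruction) ** 2))

original = np.array([[0.0, 0.5],
                     [1.0, 0.25]])
reconstruction = np.array([[0.0, 0.0],
                           [1.0, 0.75]])

print(mse(original, original))        # a perfect reconstruction scores 0.0
print(mse(original, reconstruction))  # two pixels are off by 0.5 each: 0.125
```

A lower value means the regenerated image is closer to the original, which is why this loss suits an autoencoder.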




encoder = Model(inputs=inp, outputs=emb)



decoder_input = Input(shape=(e,))
# cae.layers[7] is fc1, the first layer after the embedding
dec_layer = cae.layers[7](decoder_input)
for i in range(8, len(cae.layers)):
    dec_layer = cae.layers[i](dec_layer)
decoder = Model(inputs=decoder_input, outputs=dec_layer)
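The index 7 used for the decoder comes from the position of the layers in the model. The sketch below reproduces that bookkeeping in plain Python, assuming `cae.layers` lists the layers in the order they were created (which should hold for a simple chain like this one, but is worth confirming with `cae.summary()`). The list of names mirrors the variable names used in the model code.

```python
# Layers in creation order; the input layer counts as index 0.
layer_names = ["inp", "conv1", "conv2", "mp1", "conv3", "flat", "emb",
               "fc1", "res", "zp1", "conv4", "up1", "zp2", "conv5", "zp3", "conv6"]

# The decoder starts right after the embedding layer.
first_decoder_index = layer_names.index("emb") + 1

encoder_part = layer_names[:first_decoder_index]
decoder_part = layer_names[first_decoder_index:]

print(first_decoder_index)   # 7
print(decoder_part)
```

Reusing the layers of `cae` rather than creating new ones means the standalone decoder shares the trained weights.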


With the encoder and decoder defined, it is time to fit the model. In the code below we use a batch size of 512, which means 512 inputs are processed at a time. This step may take several minutes depending on your GPU.


hist = cae.fit(X_train, X_train,
               epochs=10,  # assumed value; the epoch count is missing from the original listing
               batch_size=512,
               validation_data=(X_test, X_test))
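To get a feel for what a batch size of 512 means here, the following arithmetic sketch works out how many weight updates one epoch takes on the 33600 training images reported earlier. It assumes Keras's default behaviour of keeping the final, smaller batch.

```python
import math

n_train = 33600      # training images, from the shapes printed earlier
batch_size = 512

# Number of batches (weight updates) per epoch, counting the partial last batch.
steps_per_epoch = math.ceil(n_train / batch_size)

# Size of that final batch.
last_batch = n_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch)   # 66 320
```

A larger batch size means fewer updates per epoch but more memory per step.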




Next we plot the training and testing loss, followed by the training and testing accuracy, with matplotlib.




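plt.plot(hist.history['loss'], 'r', label='Training')
plt.plot(hist.history['val_loss'], 'b', label='Testing')
plt.legend()
plt.show()

plt.plot(hist.history['accuracy'], 'r', label='Training')  # older Keras versions use 'acc'
plt.plot(hist.history['val_accuracy'], 'b', label='Testing')  # older Keras versions use 'val_acc'
plt.legend()
plt.show()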




test = X_train[:20]
preds = cae.predict(test)
print(test.shape, preds.shape)

(20, 28, 28, 1) (20, 28, 28, 1)

Next we plot each original image and its regenerated version side by side with matplotlib.

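for i in range(test.shape[0]):
    plt.subplot(1, 2, 1)  # original on the left
    plt.imshow(test[i].reshape((28,28)), cmap='gray')
    plt.subplot(1, 2, 2)  # regenerated image on the right
    plt.imshow(preds[i].reshape((28,28)), cmap='gray')
    plt.show()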


We have plotted the original and regenerated images side by side for the digits 0 to 9. With more training, the model should be able to produce images nearly as sharp as the input.




With its regeneration quality and strong dimensionality reduction capabilities, the autoencoder opens up many possibilities in deep learning.




Autoencoders come in many types, such as sparse, denoising, and convolutional autoencoders. What we implemented here was a convolutional autoencoder.