


Deep learning is pattern recognition via so-called neural networks. Neural networks are a set of algorithms, loosely modeled after the human brain, that interpret sensory data: a form of machine perception. Deep learning is the name for a certain type of stacked neural network composed of several node layers, where each layer’s output is simultaneously the subsequent layer’s input, starting from an initial input layer.

Deep-learning networks are distinguished from the more commonplace single-hidden-layer neural networks by their depth; that is, the number of node layers through which data passes in a multistep process of pattern recognition. A network with three or more layers (including the input and output layers) qualifies as deep learning; anything shallower is simply machine learning.
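To make “depth” concrete, here is a minimal sketch in Keras (the framework used later in this post); the layer sizes are arbitrary, chosen only for illustration:

# A minimal sketch of depth: two hidden layers plus an output layer
from keras.models import Sequential
from keras.layers import Dense

deep_net = Sequential()
deep_net.add(Dense(128, input_shape=(784,), activation='relu'))  # hidden layer 1
deep_net.add(Dense(64, activation='relu'))                       # hidden layer 2
deep_net.add(Dense(10, activation='softmax'))                    # output layer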

Deep learning is motivated by intuition, theoretical arguments from circuit theory, empirical results, and current knowledge of neuroscience.

  • The main concept in deep learning algorithms is automating the extraction of representations (abstractions) from the data.
  • A key concept underlying deep learning methods is distributed representations of the data, in which a large number of possible configurations of the abstract features of the input data is feasible, allowing a compact representation of each sample and leading to richer generalization.
  • Deep learning algorithms lead to abstract representations because more abstract representations are constructed from less abstract ones. An important advantage of more abstract representations is that they can be invariant to local changes in the input data.
  • Deep learning algorithms are deep architectures of consecutive layers.
  • Stacking nonlinear transformation layers is the basic idea in deep learning algorithms: each layer applies a nonlinear transformation that tries to extract the underlying explanatory factors in the data.
  • The final representation of the data constructed by a deep learning algorithm (the output of the final layer) provides useful information that can be used as features for building classifiers, or for data indexing and other applications that are more efficient with abstract representations than with high-dimensional sensory data.

Let’s understand this in layman’s terms.

Imagine you’re building a shopping recommendation engine, and you discover that if an item is trending and a user has browsed the category of that item in the last day, they are very likely to buy the trending item.

These two variables are so predictive together that you can combine them into a single new variable, or feature (call it “interested_in_trending_category”, for example).

Finding connections between variables and packaging them into a single new variable is called feature engineering.
Deep learning is automated feature engineering.
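As a rough sketch of the manual version (the signals here are hypothetical, made up only to illustrate the idea):

import numpy as np

# Hypothetical per-user signals (1 = true, 0 = false)
item_is_trending = np.array([1, 0, 1, 1])
browsed_category_last_day = np.array([1, 1, 0, 1])

# Manual feature engineering: combine the two signals into one new feature
interested_in_trending_category = item_is_trending & browsed_category_last_day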

Autoencoders are neural network models that encode and decode data, where the output should look like the input. Autoencoders have wide applications in tasks such as dimensionality reduction, generative modeling, and feature extraction. Their greatest impact has been in computer vision and natural language processing. In the previous post we looked at what autoencoders are, their various types, their strengths, challenges, and applications. In this post we are going to develop a simple autoencoder with Keras to reconstruct digits from the MNIST data set.

Simple Autoencoder with Keras

Autoencoders can be implemented with different tools such as TensorFlow, Keras, Theano, and PyTorch, among other great tools. In this post we are going to use the Keras framework with the TensorFlow back-end. An autoencoder is simply an artificial neural network with two parts: an encoder and a decoder. The encoder compresses the input data while the decoder decompresses the encoded data back to the original format. The objective is to train the model to reproduce an output that looks like the input.

A simple autoencoder is a neural network made up of three layers: the input layer, one hidden layer, and an output layer. However, autoencoders can be stacked to form a deep autoencoder that can learn better representations.
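As a sketch of such a deeper variant (not used in the rest of this post; the layer sizes are illustrative):

from keras.models import Sequential
from keras.layers import Dense

# Stacked (deep) autoencoder: the encoder and decoder each get two layers
deep_autoencoder = Sequential()
deep_autoencoder.add(Dense(128, input_shape=(784,), activation='relu'))  # encoder
deep_autoencoder.add(Dense(32, activation='relu'))                       # bottleneck
deep_autoencoder.add(Dense(128, activation='relu'))                      # decoder
deep_autoencoder.add(Dense(784, activation='sigmoid'))                   # reconstruction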

Let’s implement our simple three-layer neural network autoencoder and train it on the MNIST data set.

Import required libraries

# Load libraries
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense
from keras.models import Model, Sequential
from keras.datasets import mnist

Load MNIST data

# Load data (labels are discarded; an autoencoder only needs the images)
(X_train, _), (X_test, _) = mnist.load_data()

Scaling our data

# Scale pixel values to the range [0, 1]
X_train = X_train.astype('float32') / float(X_train.max())
X_test = X_test.astype('float32') / float(X_test.max())

Let’s inspect our data set

# Inspect our data: 60K training images and 10K test images, each 28x28
print("Training set : ", X_train.shape)
print("Testing set : ", X_test.shape)
Output
Training set : (60000, 28, 28)
Testing set : (10000, 28, 28)

Reshaping our image data into vectors of length 784 (28 × 28 = 784)

# Flatten each 28x28 image into a 784-dimensional vector
X_train = X_train.reshape((len(X_train), np.prod(X_train.shape[1:])))
X_test = X_test.reshape((len(X_test), np.prod(X_test.shape[1:])))
print("Training set : ", X_train.shape)  # The shape has changed
print("Testing set : ", X_test.shape)
Output
Training set : (60000, 784)
Testing set : (10000, 784)

Creating our autoencoder model

input_dim = X_train.shape[1]  # 784 input pixels
encoding_dim = 32             # size of the compressed representation
compression_factor = float(input_dim / encoding_dim)  # 784/32 = 24.5x compression

# Encoder and decoder as a two-layer Sequential model
autoencoder = Sequential()
autoencoder.add(Dense(encoding_dim, input_shape=(input_dim,), activation='relu'))
autoencoder.add(Dense(input_dim, activation='sigmoid'))

# Separate encoder model that reuses (and shares weights with) the encoding layer
input_img = Input(shape=(input_dim,))
encoder_layer = autoencoder.layers[0]
encoder = Model(input_img, encoder_layer(input_img))

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(X_train, X_train, epochs=50, batch_size=256, shuffle=True,
                validation_data=(X_test, X_test))
Output
Train on 60000 samples, validate on 10000 samples
Epoch 1/50
60000/60000 [==============================] - 7s 121us/step - loss: 0.2753 - val_loss: 0.1905
Epoch 2/50
60000/60000 [==============================] - 5s 76us/step - loss: 0.1707 - val_loss: 0.1535
Epoch 3/50
60000/60000 [==============================] - 4s 66us/step - loss: 0.1450 - val_loss: 0.1346
Epoch 4/50
60000/60000 [==============================] - 4s 66us/step - loss: 0.1295 - val_loss: 0.1219
Epoch 5/50
60000/60000 [==============================] - 4s 64us/step - loss: 0.1192 - val_loss: 0.1139
Epoch 6/50
60000/60000 [==============================] - 4s 64us/step - loss: 0.1126 - val_loss: 0.1087
Epoch 7/50
60000/60000 [==============================] - 4s 72us/step - loss: 0.1081 - val_loss: 0.1050
Epoch 8/50
...
60000/60000 [==============================] - 5s 86us/step - loss: 0.0940 - val_loss: 0.0928
Epoch 46/50
60000/60000 [==============================] - 5s 82us/step - loss: 0.0939 - val_loss: 0.0929
Epoch 47/50
60000/60000 [==============================] - 5s 85us/step - loss: 0.0939 - val_loss: 0.0928
Epoch 48/50
60000/60000 [==============================] - 5s 78us/step - loss: 0.0939 - val_loss: 0.0927
Epoch 49/50
60000/60000 [==============================] - 5s 81us/step - loss: 0.0939 - val_loss: 0.0929
Epoch 50/50
60000/60000 [==============================] - 5s 80us/step - loss: 0.0939 - val_loss: 0.0928

Making predictions on test data

# Pick random test images and run them through the encoder and the full autoencoder
num_images = 10
np.random.seed(42)
random_test_images = np.random.randint(X_test.shape[0], size=num_images)
encoded_img = encoder.predict(X_test)      # 32-dimensional codes
decoded_img = autoencoder.predict(X_test)  # 784-dimensional reconstructions

Visualizing model predictions

# Display original, encoded, and reconstructed images in three rows
plt.figure(figsize=(18, 4))
for i, image_idx in enumerate(random_test_images):
    # Plot the input image
    ax = plt.subplot(3, num_images, i + 1)
    plt.imshow(X_test[image_idx].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # Plot the encoded image (the 32 values reshaped to 8x4)
    ax = plt.subplot(3, num_images, num_images + i + 1)
    plt.imshow(encoded_img[image_idx].reshape(8, 4))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # Plot the reconstructed image
    ax = plt.subplot(3, num_images, 2 * num_images + i + 1)
    plt.imshow(decoded_img[image_idx].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

Output

(Figure: ten test digits shown in three rows: the originals, their 8x4 encoded representations, and the reconstructions.)

Complete Code

# Load libraries
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense
from keras.models import Model, Sequential
from keras.datasets import mnist

# Load data (labels are discarded; an autoencoder only needs the images)
(X_train, _), (X_test, _) = mnist.load_data()

# Scale pixel values to the range [0, 1]
X_train = X_train.astype('float32') / float(X_train.max())
X_test = X_test.astype('float32') / float(X_test.max())

# Inspect our data: 60K training images and 10K test images, each 28x28
print("Training set : ", X_train.shape)
print("Testing set : ", X_test.shape)

# Flatten each 28x28 image into a 784-dimensional vector
X_train = X_train.reshape((len(X_train), np.prod(X_train.shape[1:])))
X_test = X_test.reshape((len(X_test), np.prod(X_test.shape[1:])))
print("Training set : ", X_train.shape)  # The shape has changed
print("Testing set : ", X_test.shape)

# Create the autoencoder model
input_dim = X_train.shape[1]
encoding_dim = 32
compression_factor = float(input_dim / encoding_dim)

autoencoder = Sequential()
autoencoder.add(Dense(encoding_dim, input_shape=(input_dim,), activation='relu'))
autoencoder.add(Dense(input_dim, activation='sigmoid'))

# Separate encoder model that reuses the encoding layer
input_img = Input(shape=(input_dim,))
encoder_layer = autoencoder.layers[0]
encoder = Model(input_img, encoder_layer(input_img))

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(X_train, X_train, epochs=50, batch_size=256, shuffle=True,
                validation_data=(X_test, X_test))

# Test images and prediction
num_images = 10
np.random.seed(42)
random_test_images = np.random.randint(X_test.shape[0], size=num_images)
encoded_img = encoder.predict(X_test)
decoded_img = autoencoder.predict(X_test)

# Display original, encoded, and reconstructed images
plt.figure(figsize=(18, 4))
for i, image_idx in enumerate(random_test_images):
    # Plot the input image
    ax = plt.subplot(3, num_images, i + 1)
    plt.imshow(X_test[image_idx].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # Plot the encoded image
    ax = plt.subplot(3, num_images, num_images + i + 1)
    plt.imshow(encoded_img[image_idx].reshape(8, 4))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # Plot the reconstructed image
    ax = plt.subplot(3, num_images, 2 * num_images + i + 1)
    plt.imshow(decoded_img[image_idx].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

Conclusion

Autoencoders have pushed the limits of deep learning further with their great power to learn important features in data. Most complex deep learning models incorporate autoencoders in one way or another. There are different types of autoencoders, such as denoising, sparse, variational, and convolutional autoencoders, which perform different tasks. In this post we have only scratched the surface of a wider class of neural network algorithms.
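As one illustration of how small the change can be, a denoising autoencoder can be approximated by reusing the model defined above with corrupted inputs and clean targets (a sketch; the noise level here is an arbitrary choice):

# Corrupt the inputs with Gaussian noise; the reconstruction targets stay clean
noise_factor = 0.3
X_train_noisy = np.clip(
    X_train + noise_factor * np.random.normal(size=X_train.shape), 0.0, 1.0)
autoencoder.fit(X_train_noisy, X_train, epochs=50, batch_size=256, shuffle=True)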