Building a simple neural network in 1, 2, 3 – for fashion!
In this blog post, we follow up on our previous entry, where we demonstrated the steps to set up a functional AI environment on our PCs. As mentioned in Setting a Deep Learning Working Environment, we will make use of the Keras Deep Learning Library.
I am a big fan of Keras. It is a high-level, powerful, intuitive and easy-to-use Python library for building neural networks and implementing deep learning architectures. We will run it on top of TensorFlow (which was also installed if you followed the instructions in the previous post!). One focus of the Keras development team is to enable fast experimentation, so that data scientists can move quickly from ideas to results. In short, Keras is a good choice if you are looking for a library that:
- Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Runs seamlessly on CPU and GPU.
If you want more information about Keras, you can visit its homepage.
In the remainder of this post, we will create the “Hello World” of deep learning: our very first neural network model in Python using Keras. We will classify the Fashion-MNIST dataset using a Convolutional Neural Network architecture. Fashion-MNIST is a dataset from Zalando that consists of 70,000 labeled images (split into 60,000 for training and 10,000 for testing). The images show Zalando articles belonging to 10 classes:
- T-shirt/top
- Trouser
- Pullover
- Dress
- Coat
- Sandal
- Shirt
- Sneaker
- Bag
- Ankle boot
If you are interested in knowing more about this dataset, you can find more details on its GitHub page.
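The labels in the dataset are stored as integers from 0 to 9 rather than as names. A small lookup list, sketched below, makes them readable; the variable name `class_names` is our own choice for illustration, while the order follows the label table in the Fashion-MNIST documentation.

```python
# Index i in this list is the human-readable name of integer label i.
class_names = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]

# Print the label-to-name mapping
for label, name in enumerate(class_names):
    print(label, name)
```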
Our objective is to define and train, in a few lines of code, a model that classifies the dataset with an acceptable accuracy (acceptable for a “Hello World” level tutorial!) without spending much time optimizing it. We will not review the mathematical details of how a deep neural network (or a convolutional neural network) works. This is not to say that these details are unimportant. On the contrary, they are so important that we will review them in a future post.
Using the Keras library will simplify the code we use in the tutorial. Even though the code will be fairly simple, we are going to provide a good level of detail so that you can modify it and maybe use it to create models for your own dataset in the future. We will divide the tutorial into the following sections:
- Load data
- Define model
- Compile model
- Fit model
- Evaluate model
- Visualize predictions
If you followed the instructions in Setting a Deep Learning Working Environment, you do not need to worry about any further configuration or dependencies. We will start getting our hands dirty right away!
Load the data
Let’s start by importing our dataset; afterwards we will do some pre-processing to prepare the data for the training, validation and test stages. Keras loads the train and test sets of the Fashion-MNIST dataset with just one line of code. Before we do this, we will import a couple of important libraries as well:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
# input image dimensions
img_rows, img_cols = 28, 28
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
What does the data we just loaded look like? Let’s first see the shape of the data set:
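One way to print the shapes is sketched below. So that the snippet runs on its own without downloading anything, it uses zero-filled stand-in arrays with the Fashion-MNIST training shapes; the helper `shape_summary` and the `demo_*` names are hypothetical. In the notebook, pass the real `x_train`, `y_train`, `x_test` and `y_test` instead.

```python
import numpy as np

def shape_summary(**arrays):
    """Print and return one 'name shape: (...)' line per named array."""
    lines = [f"{name} shape: {arr.shape}" for name, arr in arrays.items()]
    for line in lines:
        print(line)
    return lines

# Stand-ins with the shapes of the Fashion-MNIST training split (illustration only);
# in the notebook, use the arrays returned by load_data().
demo_images = np.zeros((60000, 28, 28), dtype=np.uint8)
demo_labels = np.zeros((60000,), dtype=np.uint8)

lines = shape_summary(x_train=demo_images, y_train=demo_labels)
```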
As you can see, we have 60,000 training images of 28x28 pixels, together with their 60,000 corresponding labels.
A nice thing about working with JupyterLab is its visualization capabilities. Let’s see what the images in the fashion dataset look like:
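# Show 20 training images in a grid of 5 rows and 4 columns
fig = plt.figure(figsize=(8, 8))
columns = 4
rows = 5
for i in range(1, columns*rows + 1):
    img = x_train[i]
    fig.add_subplot(rows, columns, i)
    plt.imshow(img, cmap='gray')
    plt.axis('off')
plt.show()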
You should be able to see something like this:
Before we define our model, let us prepare the data a little more: we reshape the images to add a channel dimension, normalize the pixel values to the range [0, 1], and one-hot encode the labels:
num_classes = 10
# Add the channel axis in the position the Keras backend expects
if tf.keras.backend.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
# Scale the pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)
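To see what the scaling and the label conversion do, here is a tiny NumPy-only illustration on made-up values. Indexing `np.eye` is used as a stand-in that mimics the one-hot matrix `to_categorical` produces; the example pixels and labels are invented for the demonstration.

```python
import numpy as np

# Three example pixels covering the full 8-bit range
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = pixels.astype('float32') / 255  # values now lie in [0, 1]
print(scaled)

# Three example integer labels: class 0, class 9 and class 3
labels = np.array([0, 9, 3])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]
print(one_hot)
```

Each row of `one_hot` contains a single 1 in the column of its class, which is the format a softmax output layer is compared against.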
We are ready to move on and start defining our model.
Define the model
Let us define the model and start training it. We have a couple of options to define a model with Keras:
- Sequential model API
- Functional API
We will make use of the Sequential model API in this tutorial. As you will see, we only need to define the shape of the input data in the first layer. In the last layer, we will have a dense layer with softmax activation so we can classify the 10 categories we have in our dataset.
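num_classes = 10
model = tf.keras.Sequential()
# Only the first layer needs to know the shape of the input data
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
# Assumed minimal completion: flatten, then a softmax layer with one unit per class
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))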
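For comparison, a similar network could be written with the Functional API, where each layer is called on the output of the previous one. This is only a sketch: the `Flatten` and `Dense` layers and the `channels_last` input shape are assumptions made to keep the example small and runnable.

```python
import tensorflow as tf

num_classes = 10
input_shape = (28, 28, 1)  # assumes the channels_last layout

# Each layer is called on the tensor produced by the previous one
inputs = tf.keras.Input(shape=input_shape)
x = tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu')(inputs)
x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)

# The model is defined by its input and output tensors
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)
functional_model.summary()
```

The Functional API becomes useful once a model needs several inputs, several outputs or shared layers; for a plain stack of layers like ours, the Sequential API is the shorter option.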
With this, we are ready to compile the model in the next step.
Compile the model
This is the step where we will use model.compile() to configure the learning process (before the actual training). Here we can define the loss function, the optimizer we wish to use and the metrics for the evaluation of the model.
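A possible compile call is sketched below. The loss, optimizer and metric are assumptions rather than the settings used for the results in this post: categorical cross-entropy fits one-hot labels with a softmax output, Adam is a common default optimizer, and accuracy is the metric discussed later. A small stand-in model is built first so the snippet runs on its own; in the notebook, call `compile()` on the model defined above.

```python
import numpy as np
import tensorflow as tf

num_classes = 10
input_shape = (28, 28, 1)  # assumes the channels_last layout

# Small stand-in model so that this snippet is self-contained
model = tf.keras.Sequential([
    tf.keras.Input(shape=input_shape),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])

model.compile(
    loss='categorical_crossentropy',  # matches one-hot labels and a softmax output
    optimizer='adam',                 # assumption: any Keras optimizer can be used here
    metrics=['accuracy'],             # reported during fit() and by evaluate()
)
```

With one metric configured, `model.evaluate()` returns a list of two values: the loss followed by the accuracy.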
Train the model
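Now, we train the model with a batch size of 128 for 12 epochs.
batch_size = 128
epochs = 12
# Train on the training set and report loss and accuracy on the test set after each epoch
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, y_test))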
And just like that, we get an accuracy of over 90%. You can test it with the following:
score = model.evaluate(x_test, y_test, verbose=0)
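# score holds the loss first, followed by the compiled metrics (here: accuracy)
print('Test loss:', score[0])
print('Test accuracy:', score[1])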
Visualize the predictions
Finally, we will visualize some of the predictions that our trained model makes. For this we can use model.predict(data, batch_size=1). In the last visualization of this post, we will see two colors for the image labels. A red label means that the prediction does not match the ground truth (the true label).
As you may have noticed, we presented this tutorial as a recipe and did not dig into the details of how the algorithms work. We will leave that for our next blog post.
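A minimal sketch of such a visualization follows. The helper `plot_predictions` and the choice of green for correct predictions are our own assumptions; only the red color for mismatches comes from the description above. The function expects the predicted probabilities (the output of `model.predict`) and the one-hot encoded true labels.

```python
import numpy as np
import matplotlib.pyplot as plt

CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

def plot_predictions(images, probabilities, true_one_hot, rows=3, columns=5):
    """Plot images titled 'predicted (true)': red if they differ, green otherwise."""
    predicted = np.argmax(probabilities, axis=1)  # most likely class per image
    actual = np.argmax(true_one_hot, axis=1)      # undo the one-hot encoding
    fig = plt.figure(figsize=(2 * columns, 2 * rows))
    for i in range(min(rows * columns, len(images))):
        ax = fig.add_subplot(rows, columns, i + 1)
        # squeeze() drops the single channel axis so imshow gets a 2D image
        ax.imshow(np.squeeze(images[i]), cmap='gray')
        ax.axis('off')
        color = 'green' if predicted[i] == actual[i] else 'red'
        ax.set_title(f"{CLASS_NAMES[predicted[i]]} ({CLASS_NAMES[actual[i]]})",
                     color=color)
    return fig
```

In the notebook, it could be called as `plot_predictions(x_test[:15], model.predict(x_test[:15], batch_size=1), y_test[:15])`, followed by `plt.show()`.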