Introduction.

In this article, I will guide you through the processes I went through to create an image classifier using the Flask framework, the Keras library and Google’s popular machine learning library, Tensorflow. It was quite an interesting project that managed to gain over nineteen stars on Github, as at the time of writing this article. I was really pleased with the project, thus I humbly decided to share the entire workflow.

My System Specifications.

I love to classify my system specifications as minimalistic to engage in training neural networks and machine learning models. I use a hewlett-packard machine that runs the Linux Operating System, specifically Ubuntu, which is a fantastic debian-based distribution that I cannot just abandon for other alternatives. Below are my specifications:

Intel Core i7 (10th Generation).
Nvidia GeForce, 2GB GPU.
8GB RAM.
250GB SSD plus 1TB of HHD.

The Importance of GPUS.

The Graphic Processing Unit, abbreviated as GPU, is a unique electronic circuit that offers parallel processing and is used to accelerate the rendering of computer graphics. GPUs are implemented within the deep learning space to to accelerate training and inference for resource and computationally intensive models. Their importance cannot be overemphasized as they contain specialized cores unlike a CPU that distribute the workload or computation across each unit to deliver a much faster performance. So I really recommend using a system with GPU to train models. I had to destroy my wallet to get a GPU machine, but it was worth it.

I suggest you purchase some cloud based solutions that would give you access into a cluster or instance of compute that provides adequate GPU that would suffice for a specific task. There are several cloud providers available for this purpose.

Getting Started.

First and foremost, we need to satisfy some requirements otherwise the installation of certain dependencies that are pertinet to building the platform. Hence, we must review the context of the project and understand the reason for the selected tools for the project.

Tensorflow: Tensorflow is Google’s open source machine learning library used internally as well for their Machine Learning driven solutions.
Keras: This is a python based interface that acts as a layer on top of TensorFlow, to ease the abstraction and improve the implementation.
Flask: This is a fantastic microframework used for building web applications written in Python. It has always been my preferred approach to developing scalable RESTful APIs in Python, over Django.
Pip: This is the official package manager for Python which provides the above libraries hosted on Python’s open source library platform.

The Model.

One significant thing to note is that we are not explicitly writing a machine learning model from scratch. That would be quite exhausting. Why reinvent the wheel when you have already trained models available to integrate? Hell no.

So for our prestigious model, which I cherry-picked from several options and sources. It is called the “VGG16.”

It sounds like a rocket model crafted for another adventure into the solar system. Or a codename for a russian spy. As ludicrous as it may seem, the name of the model played an agurably important role in grabbing my attention.

The VGG16 Model.

I truly believe in the importance of reviewing academic papers that cite or basically illustrate the entire context of the model available. Therefore, I was able to obtain some pretty cool sources. I suggest this paper. It was very intriguing

VGG16 refers to the VGG model, also called VGGNet. It is a convolution neural network (CNN) model supporting 16 layers. So basically, the VGG16 model is a convolution neural network designed primarily for the sole purpose of localization and classification. It supports up to sixteen perceptron layers. Neural networks are made up of components called layers.

Covolutional Neural Network

Image Credit: The Click Reader

For instance the covolutional neural network consists of three layers, primarily the pooling layer, activation layer and covolutional layer, with these three layers containing a speficic amount of weighted input that can be received. I do not want to go over neural networks here, I have a seperate article for that in theory. So I may appear quite evasive of the concepts I may subconciously instigate possibly as a means of clarifications to solidify my propositions and explanations.

So let us do more fun stuff. Try and capture the below diagramatical represenation of the VGG16 Model.

The VGG15 Model Architecture.

Structure

The VGG16 is composed of \(13\) convolutional layers, \(5\) max-pooling layers, and \(3\) fully connected layers. The number of layers with tunable parameters sums up to a total of \(16\). Giving it the number preceding the name.

The number of filters in the first block as shown is \(64\). The number is doubled in the subsequent blocks until it gets to \(512\). The model ends with two connected hidden layers, each containing \(4096\) neurons. The output layer contains a total of \(1000\) neurons corresponding to the number of categories in the ImageNet dataset.

We would implement the following algorithm in the project. It is worthy to note that the source code was obtained from the Keras Team. The reference to the same paper I cited is coincidentally referenced in the code.

"""VGG16 model for Keras.

# Reference

- [Very Deep Convolutional Networks for Large-Scale Image Recognition](
    https://arxiv.org/abs/1409.1556) (ICLR 2015)

"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

from . import get_submodules_from_kwargs
from . import imagenet_utils
from .imagenet_utils import decode_predictions
from .imagenet_utils import _obtain_input_shape

preprocess_input = imagenet_utils.preprocess_input

WEIGHTS_PATH = ('https://github.com/fchollet/deep-learning-models/'
                'releases/download/v0.1/'
                'vgg16_weights_tf_dim_ordering_tf_kernels.h5')
WEIGHTS_PATH_NO_TOP = ('https://github.com/fchollet/deep-learning-models/'
                       'releases/download/v0.1/'
                       'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')


def VGG16(include_top=True,
          weights='imagenet',
          input_tensor=None,
          input_shape=None,
          pooling=None,
          classes=1000,
          **kwargs):
    """Instantiates the VGG16 architecture.

    Optionally loads weights pre-trained on ImageNet.
    Note that the data format convention used by the model is
    the one specified in your Keras config at `~/.keras/keras.json`.

    # Arguments
        include_top: whether to include the 3 fully-connected
            layers at the top of the network.
        weights: one of `None` (random initialization),
              'imagenet' (pre-training on ImageNet),
              or the path to the weights file to be loaded.
        input_tensor: optional Keras tensor
            (i.e. output of `layers.Input()`)
            to use as image input for the model.
        input_shape: optional shape tuple, only to be specified
            if `include_top` is False (otherwise the input shape
            has to be `(224, 224, 3)`
            (with `channels_last` data format)
            or `(3, 224, 224)` (with `channels_first` data format).
            It should have exactly 3 input channels,
            and width and height should be no smaller than 32.
            E.g. `(200, 200, 3)` would be one valid value.
        pooling: Optional pooling mode for feature extraction
            when `include_top` is `False`.
            - `None` means that the output of the model will be
                the 4D tensor output of the
                last convolutional block.
            - `avg` means that global average pooling
                will be applied to the output of the
                last convolutional block, and thus
                the output of the model will be a 2D tensor.
            - `max` means that global max pooling will
                be applied.
        classes: optional number of classes to classify images
            into, only to be specified if `include_top` is True, and
            if no `weights` argument is specified.

    # Returns
        A Keras model instance.

    # Raises
        ValueError: in case of invalid argument for `weights`,
            or invalid input shape.
    """
    backend, layers, models, keras_utils = get_submodules_from_kwargs(kwargs)

    if not (weights in {'imagenet', None} or os.path.exists(weights)):
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization), `imagenet` '
                         '(pre-training on ImageNet), '
                         'or the path to the weights file to be loaded.')

    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as `"imagenet"` with `include_top`'
                         ' as true, `classes` should be 1000')
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=32,
                                      data_format=backend.image_data_format(),
                                      require_flatten=include_top,
                                      weights=weights)

    if input_tensor is None:
        img_input = layers.Input(shape=input_shape)
    else:
        if not backend.is_keras_tensor(input_tensor):
            img_input = layers.Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
    # Block 1
    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv1')(img_input)
    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv1')(x)
    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv1')(x)
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv2')(x)
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv1')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv2')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv1')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv2')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = layers.Flatten(name='flatten')(x)
        x = layers.Dense(4096, activation='relu', name='fc1')(x)
        x = layers.Dense(4096, activation='relu', name='fc2')(x)
        x = layers.Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = layers.GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = layers.GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = keras_utils.get_source_inputs(input_tensor)
    else:
        inputs = img_input
    # Create model.
    model = models.Model(inputs, x, name='vgg16')

    # Load weights.
    if weights == 'imagenet':
        if include_top:
            weights_path = keras_utils.get_file(
                'vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                WEIGHTS_PATH,
                cache_subdir='models',
                file_hash='64373286793e3c8b2b4e3219cbf3544b')
        else:
            weights_path = keras_utils.get_file(
                'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                WEIGHTS_PATH_NO_TOP,
                cache_subdir='models',
                file_hash='6d6bbae143d832006294945121d1f1fc')
        model.load_weights(weights_path)
        if backend.backend() == 'theano':
            keras_utils.convert_all_kernels_in_model(model)
    elif weights is not None:
        model.load_weights(weights)

    return model

Building the project

Now we have an overview, we can commence the development of the image classifier, finally. The first thing I suggest you do is create a folder for the project.

sudo mkdir image-classifier
cd image-classifier

Create a virtual environment. This is relatively simple to do.

# Install on Ubuntu
sudo apt-get install python3-venv
python3 -m venv .venv
source .venv/bin/activate

Please ensure you fulfilled the requirements above. Ensure you have the Python programming language installed. Now we would neet to set up our dependencies.

sudo apt install pip3
python -m pip3 install --upgrade pip
pip3 install keras, flask, tensorflow

Installing Dependencies

Create a file named app.py in the directory you created. This would be the main file that Flask will startup. Then append the following line of code to the app.py file:

from flask import Flask
app = Flask(__name__)
@app.route("/")
def home():
    return "Hello, Flask!"

Then run the following command: python -m flask run to boot up the application.

Creating the HTML template.

The application would need a front-end to run in the browser. Therefore to achieve this, we would need to create a template in HTML. You should be able to do this by first creating a folder called "template".

sudo mkdir template
touch app.html
sudo vi app.html

Then append the following code for the HTML template. We are using some Bootstrap for simple styling served through a content delivery network.

<!-- The Template: Gerald Maduabuchi -->
<!DOCTYPE html>
<html lang="en">
 <head>
  <title>Python Image Classification | Flask</title>
  <meta charset="utf-8" />
  <link rel="stylesheet" href="app.css" />
  <link
   rel="stylesheet"
   href="https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap.min.css"
   integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u"
   crossorigin="anonymous"
  />
 </head>
 <body>
  <nav class="navbar navbar-default navbar-fixed-top nav">
   <a class="navbar-brand" href="app.html" aria-label="Go to home section"
    >Python Image Classification</a
   >
  </nav>
 </body>
 <form
  action="/"
  method="POST"
  enctype="multipart/form-data"
  class="form-style"
 >
  <div class="form-group">
   <label for="file image">Enter your Image</label>
   <input
    type="file"
    class="form-control"
    name="fileImage"
    placeholder="enter image"
   /><br />
   <input type="submit" value="Predict" class="btn btn-success" />
  </div>
 </form>
</html>

We created a form with a POST method to handle the submission of the image as an input from the browser to Flask, which would then leverage on the power of the VGG16 model to classify the image and return a prediction score. We would later implement the latter as we proceed.

Styling the Template.

For some extra styling, we would create an css/app.css then write the following styling otherwise rulesets to make our site look less ugly. I am still in the process of improving my front-end development skills, so bear with me the atrocious interface you may behold. It may be as generic as you think.

.form-style {
 padding-top: 25rem;
 width: 50rem;
 margin-left: 50rem;
}
input[type='file'] {
 color: red;
 padding-bottom: 3rem;
}

Model Implementation in Flask.

In your app.py we are going to write the basic logic, otherwise implementations of the VGG16 model in our application. It is quite straightforward, and does not in anyway represent the level of complexity we covered in the introductory section of the article.

# Importing flask module

from flask import Flask, render_template, request

First and foremost, you need to import the flask module which you installed by means of the pip tooling. Including the render_template to render the HTML template we wrote, as well as the request to retrieve requests from the client of our application.

Below, we carefully import the modules required for processing and the VGG16 model into the flask application. I will carefully elaborate on each import and its own contribution to the functionality.

The img_to_array module is for converting a loaded image in PIL format into a NumPy array.
The load_img, as the name implies is to load an image into the model.
The preprocess_input is meant to adequate the image to a format that the VGG16 requires.
The VGG16 module is the library or implementation itself that we would call into the application.
The decode_predictions is used to provide the prediction accuracy score, otherwise estimate.

# Importing Keras Modules

from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import decode_predictions

Now we need to write out new methods to start the application. We would also assign the VGG16 instansiation, to the variable model to represent the initialization across our codebase.

# Start App
app = Flask(__name__)

#The Model.
model = VGG16()

Next we would assign a sample route again, and write a function to render the app.html template accordingly. This could be done effectively by writing the below code:

# Get Route
@app.route('/', methods=['GET'])
def hello_world():
    #Render Template
    return render_template('app.html')

Now, this is where the entire logic is written. We would need to receive inputs from the post into the model.

# Receive input and post to model
@app.route('/', methods=['POST'])

# Predict function
def predict():
    fileImage = request.files['fileImage']
    image_path = "./images/" + fileImage.filename
    fileImage.save(image_path)

Here we request the image path, within the predict function we have written.

Loading the image.

Here we load the image into our application by calling the load_img method imported, passing the image_path as a parameter and the target size to conform to the requirements of the model.

#Load Image
    image = load_img(image_path, target_size=(224, 224))
    image = img_to_array(image)
    image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))

Classification and Prediction.

Now we execute the prediction, and classification of the input. Then we return the prediction score to the app.html template.

    #Process Image
    image = preprocess_input(image)

    #Make Predictions
    yhat = model.predict(image)
    label = decode_predictions(yhat)
    label = label[0][0]

    #Make Classifications
    classification = '%s (%.2f%%)' % (label[1], label[2]*100)

    return render_template('app.html', prediction=classification)

Integrating to the Template.

Since we are using flask, we would need to call the prediction method by the following convention into our html template to fully integrate it.

<!-- Predict Function -->
{% if prediction %}
<small>The image you uploaded is a {{prediction}}</small>
{% endif %}

Excellent. I forgot. It is required to write in the initialization method for the flask application and assign a port number like so:

# Intitalise Port
if __name__ == '__main__':
    app.run(port=3000, debug=False)

Run python -m flask run to start the application.

Conclusion.

Did you get to the end solider? If you did, congratulations. You have successfully created a flask application that executes image recognition. Till the next time we meet. Have a nice day. Share the article and also, I am open to any suggestions. Also, feel free to explore the repository if it does not seem to work for you the way you want it to, follow the instructions and then you should be all set. Regards.

‹ Gerald's Journal

Building An Image Classifier

Jun 16, 2022