• The TensorFlow tutorials are written as Jupyter notebooks and run directly in Google Colab—a hosted notebook environment that requires no setup. Click the Run in Google Colab button.


  • Building a simple Keras + deep learning REST API: In this tutorial, we will present a simple method to take a Keras model and deploy it as a REST API.


  • The examples covered in this post will serve as a template/starting point for building your own deep learning APIs — you will be able to extend the code and customize it based on how scalable and robust your API endpoint needs to be.


  • Specifically, we will learn: How to (and how not to) load a Keras model into memory so it can be efficiently used for inference


  • How to use the Flask web framework to create an endpoint for our API


  • How to make predictions using our model, JSON-ify them, and return the results to the client


  • How to call our Keras REST API using both cURL and Python


  • By the end of this tutorial you'll have a good understanding of the components (in their simplest form) that go into creating a Keras REST API.


  • Feel free to use the code presented in this guide as a starting point for your own deep learning REST API.


  • Configuring your development environment: We'll assume that Keras is already configured and installed on your machine. If not, install Keras by following the official install instructions.


  • From there, we'll need to install Flask, a Python web framework (along with its associated dependencies), so we can build our API endpoint. We'll also need the requests library so we can consume our API.


  • The relevant pip install commands are listed below:


  • 
    !pip install flask gevent requests pillow
    
         
    
  • CREATE A Procfile: Create a file named Procfile and paste the line below into it (the cell that follows writes the file for you):


  • web: gunicorn app:app


  • 
    procfile = 'web: gunicorn app:app'
    procfiles= open("/content/Procfile","w")
    procfiles.write(procfile)
    procfiles.close()
    
         
    
  • INSTALL FLASK AND NGROK


  • 
    !pip install flask-ngrok
    from flask_ngrok import run_with_ngrok
    from flask import Flask
    
         
    
  • Build front end: Create the templates directory and write index.html into it. The notebook stores the page markup in a variable a; below, a is filled with a minimal placeholder form (an assumption for illustration) that POSTs a comment field to /predict and displays the returned label.


  • 
    !mkdir '/content/templates'

    # Placeholder page (assumed): posts a 'comment' field to /predict and shows 'label'.
    a = """<html><body>
      <form action="/predict" method="POST">
        <textarea name="comment"></textarea>
        <input type="submit" value="Predict">
      </form>
      <p>{{ label }}</p></body></html>"""
    Html_file = open("/content/templates/index.html", "w")
    Html_file.write(a)
    Html_file.close()
    
         
    
  • Building your Keras REST API: Our Keras REST API is self-contained in a single file named app.py. We kept the implementation in a single file as a matter of simplicity; it can easily be modularized as well.


  • Inside app.py you'll find three functions, namely: 1. load_model: Used to load our trained Keras model (and its tokenizer) and prepare them for inference.


  • 2. prepare_image / preprocess: This function preprocesses an input datapoint prior to passing it through our network for prediction. For image data that means scaling/normalization; since our model works on text, the app below names it preprocess and applies the same tokenization and padding used at training time.


  • 3. predict: The actual endpoint of our API that will classify the incoming data from the request and return the results to the client.
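

  • To make that concrete, here is a minimal structural sketch of how those three pieces fit together in a Flask app. It uses a dummy stand-in for the model (an assumption for illustration only); the real load_model, preprocess, and predict are built in the remainder of this guide, and the comment form field matches the front end created above.


  • 
    # Structural sketch only: a dummy "model" stands in for the trained Keras
    # network, just to show how load_model / preprocess / predict fit together.
    from flask import Flask, request

    app = Flask(__name__)
    model = None  # populated once by load_model()


    def load_model():
        # Real version: load the trained Keras model (and tokenizer) once, at startup.
        global model
        model = lambda texts: ["Positive" for _ in texts]  # dummy stand-in


    def preprocess(texts):
        # Real version: tokenize and pad raw text into model-ready tensors.
        return texts


    @app.route('/predict', methods=['POST'])
    def predict():
        # Classify the incoming request data and return the result to the client.
        data = preprocess([request.form['comment']])
        return model(data)[0]


    if __name__ == '__main__':
        load_model()
        app.run()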


  • Imports and helper functions: vectorization parameters, the sequence_vectorize helper, and the IMDB dataset loader.


  • 
    from tensorflow.python.keras import models
    from tensorflow.python.keras import initializers
    from tensorflow.python.keras import regularizers
    import os
    from tensorflow.python.keras.layers import Dense
    from tensorflow.python.keras.layers import Dropout
    from tensorflow.python.keras.layers import Embedding
    from tensorflow.python.keras.layers import SeparableConv1D
    from tensorflow.python.keras.layers import MaxPooling1D
    from tensorflow.python.keras.layers import GlobalAveragePooling1D
    import tensorflow as tf
    import numpy as np
    import random
    import pandas as pd
    from tensorflow.python.keras.preprocessing import sequence
    from tensorflow.python.keras.preprocessing import text
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectKBest
    from sklearn.feature_selection import f_classif
    import pickle
    # Vectorization parameters
    
    # Range (inclusive) of n-gram sizes for tokenizing text.
    NGRAM_RANGE = (1, 2)
    
    # Limit on the number of features. We use the top 20K features.
    TOP_K = 20000
    
    # Whether text should be split into word or character n-grams.
    # One of 'word', 'char'.
    TOKEN_MODE = 'word'
    
    # Minimum document/corpus frequency below which a token will be discarded.
    MIN_DOCUMENT_FREQUENCY = 2
    
    # Limit on the length of text sequences. Sequences longer than this
    # will be truncated.
    MAX_SEQUENCE_LENGTH = 500
    
    
    
    def sequence_vectorize(train_texts, val_texts):
        """Vectorizes texts as sequence vectors.
        1 text = 1 sequence vector with fixed length.
        # Arguments
            train_texts: list, training text strings.
            val_texts: list, validation text strings.
        # Returns
            x_train, x_val, word_index: vectorized training and validation
                texts and word index dictionary.
        """
        # Create vocabulary with training texts.
        tokenizer = text.Tokenizer(num_words=TOP_K)
        tokenizer.fit_on_texts(train_texts)
    
        # Vectorize training and validation texts.
        x_train = tokenizer.texts_to_sequences(train_texts)
        x_val = tokenizer.texts_to_sequences(val_texts)
    
        # Get max sequence length.
        max_length = len(max(x_train, key=len))
        if max_length > MAX_SEQUENCE_LENGTH:
            max_length = MAX_SEQUENCE_LENGTH
    
        # Fix sequence length to max value. Sequences shorter than the length are
        # padded in the beginning and sequences longer are truncated
        # at the beginning.
        x_train = sequence.pad_sequences(x_train, maxlen=max_length)
        x_val = sequence.pad_sequences(x_val, maxlen=max_length)
    
    
    
        # saving
        with open('tokenizer.pickle', 'wb') as handle:
          pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
    
        return x_train, x_val, tokenizer.word_index
    
    
    
    def load_imdb_sentiment_analysis_dataset(data_path, seed=123):
        """Loads the Imdb movie reviews sentiment analysis dataset.
        # Arguments
            data_path: string, path to the data directory.
            seed: int, seed for randomizer.
        # Returns
            A tuple of training and validation data.
            Number of training samples: 25000
            Number of test samples: 25000
            Number of categories: 2 (0 - negative, 1 - positive)
        # References
            Mass et al., http://www.aclweb.org/anthology/P11-1015
            Download and uncompress archive from:
            http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
        """
        imdb_data_path = os.path.join(data_path, 'aclImdb')
    
        # Load the training data
        train_texts = []
        train_labels = []
        for category in ['pos', 'neg']:
            train_path = os.path.join(imdb_data_path, 'train', category)
            for fname in sorted(os.listdir(train_path)):
                if fname.endswith('.txt'):
                    with open(os.path.join(train_path, fname)) as f:
                        train_texts.append(f.read())
                    train_labels.append(0 if category == 'neg' else 1)
    
        # Load the validation data.
        test_texts = []
        test_labels = []
        for category in ['pos', 'neg']:
            test_path = os.path.join(imdb_data_path, 'test', category)
            for fname in sorted(os.listdir(test_path)):
                if fname.endswith('.txt'):
                    with open(os.path.join(test_path, fname)) as f:
                        test_texts.append(f.read())
                    test_labels.append(0 if category == 'neg' else 1)
    
        # Shuffle the training data and labels.
        random.seed(seed)
        random.shuffle(train_texts)
        random.seed(seed)
        random.shuffle(train_labels)
    
        return ((train_texts, np.array(train_labels)),
                (test_texts, np.array(test_labels)))
        
    
         
    
  • Load dataset


  • 
    !wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O imdb.tar.gz
    import tarfile
    with tarfile.open('imdb.tar.gz', 'r:gz') as tar:
        tar.extractall()
    
    (x_train, y_train),(x_test, y_test) =  load_imdb_sentiment_analysis_dataset("")
    
    from sklearn.model_selection import train_test_split
    seed = 7
    test_size = 0.25
    X_train, X_val, y_train, y_val = train_test_split(x_train, y_train, test_size=test_size, random_state=seed)
    
         
    
  • Vectorize texts.


  • 
    x_train, x_val, word_index = sequence_vectorize(X_train, X_val)
    
         
    
  • Number of features will be the embedding input dimension. Add 1 for the reserved index 0.


  • 
    num_features = min(len(word_index) + 1, TOP_K)
         
    
  • Download embedding matrix: Download and extract GloVe embeddings


  • 
    !wget http://nlp.stanford.edu/data/glove.6B.zip
    !unzip glove.6B.zip
    
    def _get_last_layer_units_and_activation(num_classes):
        """Gets the # units and activation function for the last network layer.
        # Arguments
            num_classes: int, number of classes.
        # Returns
            units, activation values.
        """
        if num_classes == 2:
            activation = 'sigmoid'
            units = 1
        else:
            activation = 'softmax'
            units = num_classes
        return units, activation
    
    
    
    def get_num_classes(labels):
        """Gets the total number of classes.
        # Arguments
            labels: list, label values.
                There should be at least one sample for each value in the
                range (0, num_classes - 1).
        # Returns
            int, total number of classes.
        # Raises
            ValueError: if any label value in the range(0, num_classes - 1)
                is missing or if number of classes is <= 1.
        """
        num_classes = max(labels) + 1
        missing_classes = [i for i in range(num_classes) if i not in labels]
        if len(missing_classes):
            raise ValueError('Missing samples with label value(s) '
                             '{missing_classes}. Please make sure you have '
                             'at least one sample for every label value '
                             'in the range(0, {max_class})'.format(
                                missing_classes=missing_classes,
                                max_class=num_classes - 1))
    
        if num_classes <= 1:
            raise ValueError('Invalid number of labels: {num_classes}. '
                             'Please make sure there are at least two classes '
                             'of samples.'.format(num_classes=num_classes))
        return num_classes
    
    
    
    def _get_embedding_matrix(word_index, embedding_data_dir, embedding_dim):
        """Gets embedding matrix from the embedding index data.
        # Arguments
            word_index: dict, word to index map that was generated from the data.
            embedding_data_dir: string, path to the pre-training embeddings.
            embedding_dim: int, dimension of the embedding vectors.
        # Returns
            dict, word vectors for words in word_index from pre-trained embedding.
        # References:
            https://nlp.stanford.edu/projects/glove/
            Download and uncompress archive from:
            http://nlp.stanford.edu/data/glove.6B.zip
        """
    
        # Read the pre-trained embedding file and get word to word vector mappings.
        embedding_matrix_all = {}
    
        # We are using 200d GloVe embeddings.
        fname = os.path.join(embedding_data_dir, 'glove.6B.200d.txt')
        with open(fname) as f:
            for line in f:  # Every line contains word followed by the vector value
                values = line.split()
                word = values[0]
                coefs = np.asarray(values[1:], dtype='float32')
                embedding_matrix_all[word] = coefs
    
        # Prepare embedding matrix with just the words in our word_index dictionary
        num_words = min(len(word_index) + 1, TOP_K)
        embedding_matrix = np.zeros((num_words, embedding_dim))
    
        for word, i in word_index.items():
            if i >= TOP_K:
                continue
            embedding_vector = embedding_matrix_all.get(word)
            if embedding_vector is not None:
                # words not found in embedding index will be all-zeros.
                embedding_matrix[i] = embedding_vector
        return embedding_matrix
    
    embedding_dim=200
    embedding_matrix = _get_embedding_matrix(word_index, "", embedding_dim)
    
         
    
  • Use pretrained embedding or not


  • 
    use_pretrained_embedding=True
    is_embedding_trainable = False
    
         
    
  • Build model


  • 
    import tensorflow as tf
    import numpy as np

    # Training hyperparameters for the sepCNN model.
    learning_rate = 1e-3
    epochs = 1000
    batch_size = 128
    blocks = 2
    filters = 64
    dropout_rate = 0.2
    kernel_size = 3
    pool_size = 3
     
    num_classes = get_num_classes(y_train)
    # Create model instance.
    
    
    op_units, op_activation = _get_last_layer_units_and_activation(num_classes)
    model = models.Sequential()
    
    # Add embedding layer. If pre-trained embedding is used add weights to the
    # embeddings layer and set trainable to input is_embedding_trainable flag.
    if use_pretrained_embedding:
            model.add(Embedding(input_dim=num_features,
                                output_dim=embedding_dim,
                                input_length=x_train.shape[1],
                                weights=[embedding_matrix],
                                trainable=is_embedding_trainable))
    else:
            model.add(Embedding(input_dim=num_features,
                                output_dim=200,
                                input_length=x_train.shape[1]))
    
    for _ in range(blocks-1):
            model.add(Dropout(rate=dropout_rate))
            model.add(SeparableConv1D(filters=filters,
                                      kernel_size=kernel_size,
                                      activation='relu',
                                      bias_initializer='random_uniform',
                                      depthwise_initializer='random_uniform',
                                      padding='same'))
            model.add(SeparableConv1D(filters=filters,
                                      kernel_size=kernel_size,
                                      activation='relu',
                                      bias_initializer='random_uniform',
                                      depthwise_initializer='random_uniform',
                                      padding='same'))
            model.add(MaxPooling1D(pool_size=pool_size))
    
    model.add(SeparableConv1D(filters=filters * 2,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
    model.add(SeparableConv1D(filters=filters * 2,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(rate=dropout_rate))
    model.add(Dense(op_units, activation=op_activation)) 
    
         
    
  • Compile model with learning parameters: Create a callback for early stopping on validation loss. If the loss does not decrease for two consecutive epochs, stop training.


  • 
    if num_classes == 2:
            loss = 'binary_crossentropy'
    else:
            loss = 'sparse_categorical_crossentropy'
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss=loss, metrics=['acc'])
    
    callbacks = [tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)]
    
         
    
  • Train and validate model.


  • 
    history = model.fit(
                x_train,
                y_train,
                epochs=epochs,
                callbacks=callbacks,
                validation_data=(x_val, y_val),
                verbose=2,  # Logs once per epoch.
                batch_size=batch_size)
    
         
    
  • Save model


  • 
    model.save("my_model")
         
    
  • app.py


  • 
    import pickle
    import numpy as np
    from flask import Flask, request, jsonify, render_template
    from flask_ngrok import run_with_ngrok
    from tensorflow.python.keras import models
    from tensorflow.python.keras.preprocessing import sequence

    # Must match the value the training sequences were padded to.
    MAX_SEQUENCE_LENGTH = 500
    
    
    app = Flask(__name__)
    run_with_ngrok(app)   #starts ngrok when the app is run
    
    
    
    def load_model():
        # Load the tokenizer and trained model once, at startup, rather than on
        # every request; reloading them per request would be needlessly slow.
        global tokenizers
        with open('/content/tokenizer.pickle', 'rb') as handle:
            tokenizers = pickle.load(handle)
        global reconstructed_model
        reconstructed_model = models.load_model("my_model")
    
    @app.route('/')
    def home():
        return render_template('index.html')
    
    
    def preprocess(train_texts):
        # Reuse the tokenizer fitted at training time; do NOT refit it here,
        # or the word-to-index mapping would no longer match the model.
        x = tokenizers.texts_to_sequences(train_texts)
        x = sequence.pad_sequences(x, maxlen=MAX_SEQUENCE_LENGTH)
        return x
    
    
    
    @app.route('/predict', methods=['POST'])
    def predict():
        data = request.form['comment']
        features = [data]
        final_features = preprocess(features)

        # The model ends in a single sigmoid unit, so predict() returns one
        # probability per sample; threshold it at 0.5.
        prediction = reconstructed_model.predict(final_features)[0][0]

        if prediction <= 0.5:
            output = "Negative"
        else:
            output = "Positive"

        return render_template('index.html', label='Sentiment is {}'.format(output))
    
    
    
    if __name__ == "__main__":
        load_model()
        app.run()
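

  • Calling the Keras REST API with cURL and Python: once the app is running, you can exercise the /predict endpoint from any HTTP client. The sketch below uses assumed values: it targets http://localhost:5000 (when the app is launched through flask-ngrok, substitute the public *.ngrok.io URL printed in the console) and sends a made-up review as the comment form field.


  • 
    # Equivalent cURL call (assumed local address):
    #   curl --data-urlencode "comment=What a wonderful, heartwarming movie!" \
    #        http://localhost:5000/predict
    #
    # The same request from Python using the requests library:
    import requests

    KERAS_REST_API_URL = "http://localhost:5000/predict"  # assumed address
    payload = {"comment": "What a wonderful, heartwarming movie!"}

    r = requests.post(KERAS_REST_API_URL, data=payload)
    print(r.status_code)  # 200 on success
    print(r.text)         # the rendered index.html containing the sentiment label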