• The TensorFlow tutorials are written as Jupyter notebooks and run directly in Google Colab—a hosted notebook environment that requires no setup. Click the Run in Google Colab button.


  • Building a simple Keras + deep learning REST API: In this tutorial, we will present a simple method to take a Keras model and deploy it as a REST API.


  • The examples covered in this post will serve as a template/starting point for building your own deep learning APIs — you will be able to extend the code and customize it based on how scalable and robust your API endpoint needs to be.


  • Specifically, we will learn: How to (and how not to) load a Keras model into memory so it can be efficiently used for inference


  • How to use the Flask web framework to create an endpoint for our API


  • How to make predictions using our model, JSON-ify them, and return the results to the client


  • How to call our Keras REST API using both cURL and Python


  • By the end of this tutorial you'll have a good understanding of the components (in their simplest form) that go into creating a Keras REST API.


  • Feel free to use the code presented in this guide as a starting point for your own deep learning REST API.


  • Configuring your development environment: We'll assume that Keras is already configured and installed on your machine. If not, install Keras by following the official install instructions.


  • From there, we'll need to install Flask, a Python web framework (along with its associated dependencies), so we can build our API endpoint. We'll also need the requests library so we can consume our API.


  • The relevant pip install commands are listed below:


  • 
    !pip install flask gevent requests pillow
    
         
    
  • CREATE A Procfile: Create a file named Procfile and paste the line below into it (the cell that follows writes the file for you):


  • web: gunicorn app:app


  • 
    procfile = 'web: gunicorn app:app'
    procfiles= open("/content/Procfile","w")
    procfiles.write(procfile)
    procfiles.close()
    
         
    
  • INSTALL FLASK AND NGROK


  • 
    !pip install flask-ngrok
    from flask_ngrok import run_with_ngrok
    from flask import Flask
    
         
    
  • Build front end: Create the templates directory and write index.html into it. The notebook stores the page markup in a variable a; below, a is filled with a minimal placeholder form (an assumption for illustration) that POSTs a comment field to /predict and displays the returned label.


  • 
    !mkdir '/content/templates'

    # Placeholder page (assumed): posts a 'comment' field to /predict and shows 'label'.
    a = """<html><body>
      <form action="/predict" method="POST">
        <textarea name="comment"></textarea>
        <input type="submit" value="Predict">
      </form>
      <p>{{ label }}</p></body></html>"""
    Html_file = open("/content/templates/index.html", "w")
    Html_file.write(a)
    Html_file.close()
    
         
    
  • Building your Keras REST API: Our Keras REST API is self-contained in a single file named app.py. We kept the implementation in a single file as a matter of simplicity; it can easily be modularized as well.


  • Inside app.py you'll find three functions, namely: 1. load_model: Used to load our trained Keras model (and its tokenizer) and prepare them for inference.


  • 2. prepare_image / preprocess: This function preprocesses an input datapoint prior to passing it through our network for prediction. For image data that means scaling/normalization; since our model works on text, the app below names it preprocess and applies the same tokenization and padding used at training time.


  • 3. predict: The actual endpoint of our API that will classify the incoming data from the request and return the results to the client.
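

  • To make that concrete, here is a minimal structural sketch of how those three pieces fit together in a Flask app. It uses a dummy stand-in for the model (an assumption for illustration only); the real load_model, preprocess, and predict are built in the remainder of this guide, and the comment form field matches the front end created above.


  • 
    # Structural sketch only: a dummy "model" stands in for the trained Keras
    # network, just to show how load_model / preprocess / predict fit together.
    from flask import Flask, request

    app = Flask(__name__)
    model = None  # populated once by load_model()


    def load_model():
        # Real version: load the trained Keras model (and tokenizer) once, at startup.
        global model
        model = lambda texts: ["Positive" for _ in texts]  # dummy stand-in


    def preprocess(texts):
        # Real version: tokenize and pad raw text into model-ready tensors.
        return texts


    @app.route('/predict', methods=['POST'])
    def predict():
        # Classify the incoming request data and return the result to the client.
        data = preprocess([request.form['comment']])
        return model(data)[0]


    if __name__ == '__main__':
        load_model()
        app.run()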


  • Imports and helper functions: vectorization parameters, the sequence_vectorize helper, and the IMDB dataset loader.


  • 
    from tensorflow.python.keras import models
    from tensorflow.python.keras import initializers
    from tensorflow.python.keras import regularizers
    import os
    from tensorflow.python.keras.layers import Dense
    from tensorflow.python.keras.layers import Dropout
    from tensorflow.python.keras.layers import Embedding
    from tensorflow.python.keras.layers import SeparableConv1D
    from tensorflow.python.keras.layers import MaxPooling1D
    from tensorflow.python.keras.layers import GlobalAveragePooling1D
    import tensorflow as tf
    import numpy as np
    import random
    import pandas as pd
    from tensorflow.python.keras.preprocessing import sequence
    from tensorflow.python.keras.preprocessing import text
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectKBest
    from sklearn.feature_selection import f_classif
    import pickle
    # Vectorization parameters
    
    # Range (inclusive) of n-gram sizes for tokenizing text.
    NGRAM_RANGE = (1, 2)
    
    # Limit on the number of features. We use the top 20K features.
    TOP_K = 20000
    
    # Whether text should be split into word or character n-grams.
    # One of 'word', 'char'.
    TOKEN_MODE = 'word'
    
    # Minimum document/corpus frequency below which a token will be discarded.
    MIN_DOCUMENT_FREQUENCY = 2
    
    # Limit on the length of text sequences. Sequences longer than this
    # will be truncated.
    MAX_SEQUENCE_LENGTH = 500
    
    
    
    def sequence_vectorize(train_texts, val_texts):
        """Vectorizes texts as sequence vectors.
        1 text = 1 sequence vector with fixed length.
        # Arguments
            train_texts: list, training text strings.
            val_texts: list, validation text strings.
        # Returns
            x_train, x_val, word_index: vectorized training and validation
                texts and word index dictionary.
        """
        # Create vocabulary with training texts.
        tokenizer = text.Tokenizer(num_words=TOP_K)
        tokenizer.fit_on_texts(train_texts)
    
        # Vectorize training and validation texts.
        x_train = tokenizer.texts_to_sequences(train_texts)
        x_val = tokenizer.texts_to_sequences(val_texts)
    
        # Get max sequence length.
        max_length = len(max(x_train, key=len))
        if max_length > MAX_SEQUENCE_LENGTH:
            max_length = MAX_SEQUENCE_LENGTH
    
        # Fix sequence length to max value. Sequences shorter than the length are
        # padded in the beginning and sequences longer are truncated
        # at the beginning.
        x_train = sequence.pad_sequences(x_train, maxlen=max_length)
        x_val = sequence.pad_sequences(x_val, maxlen=max_length)
    
    
    
        # saving
        with open('tokenizer.pickle', 'wb') as handle:
          pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
    
        return x_train, x_val, tokenizer.word_index
    
    
    
    def load_imdb_sentiment_analysis_dataset(data_path, seed=123):
        """Loads the Imdb movie reviews sentiment analysis dataset.
        # Arguments
            data_path: string, path to the data directory.
            seed: int, seed for randomizer.
        # Returns
            A tuple of training and validation data.
            Number of training samples: 25000
            Number of test samples: 25000
            Number of categories: 2 (0 - negative, 1 - positive)
        # References
            Mass et al., http://www.aclweb.org/anthology/P11-1015
            Download and uncompress archive from:
            http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
        """
        imdb_data_path = os.path.join(data_path, 'aclImdb')
    
        # Load the training data
        train_texts = []
        train_labels = []
        for category in ['pos', 'neg']:
            train_path = os.path.join(imdb_data_path, 'train', category)
            for fname in sorted(os.listdir(train_path)):
                if fname.endswith('.txt'):
                    with open(os.path.join(train_path, fname)) as f:
                        train_texts.append(f.read())
                    train_labels.append(0 if category == 'neg' else 1)
    
        # Load the validation data.
        test_texts = []
        test_labels = []
        for category in ['pos', 'neg']:
            test_path = os.path.join(imdb_data_path, 'test', category)
            for fname in sorted(os.listdir(test_path)):
                if fname.endswith('.txt'):
                    with open(os.path.join(test_path, fname)) as f:
                        test_texts.append(f.read())
                    test_labels.append(0 if category == 'neg' else 1)
    
        # Shuffle the training data and labels.
        random.seed(seed)
        random.shuffle(train_texts)
        random.seed(seed)
        random.shuffle(train_labels)
    
        return ((train_texts, np.array(train_labels)),
                (test_texts, np.array(test_labels)))
        
    
         
    
  • Load dataset


  • 
    !wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O imdb.tar.gz
    import tarfile
    with tarfile.open('imdb.tar.gz', 'r:gz') as tar:
        tar.extractall()
    
    (x_train, y_train),(x_test, y_test) =  load_imdb_sentiment_analysis_dataset("")
    
    from sklearn.model_selection import train_test_split
    seed = 7
    test_size = 0.25
    X_train, X_val, y_train, y_val = train_test_split(x_train, y_train, test_size=test_size, random_state=seed)
    
         
    
  • Vectorize texts.


  • 
    x_train, x_val, word_index = sequence_vectorize(X_train, X_val)
    
         
    
  • Number of features will be the embedding input dimension. Add 1 for the reserved index 0.


  • 
    num_features = min(len(word_index) + 1, TOP_K)
         
    
  • Download embedding matrix: Download and extract GloVe embeddings


  • 
    !wget http://nlp.stanford.edu/data/glove.6B.zip
    !unzip glove.6B.zip
    
    def _get_last_layer_units_and_activation(num_classes):
        """Gets the # units and activation function for the last network layer.
        # Arguments
            num_classes: int, number of classes.
        # Returns
            units, activation values.
        """
        if num_classes == 2:
            activation = 'sigmoid'
            units = 1
        else:
            activation = 'softmax'
            units = num_classes
        return units, activation
    
    
    
    def get_num_classes(labels):
        """Gets the total number of classes.
        # Arguments
            labels: list, label values.
                There should be at least one sample for each value in the
                range (0, num_classes - 1).
        # Returns
            int, total number of classes.
        # Raises
            ValueError: if any label value in the range(0, num_classes - 1)
                is missing or if number of classes is <= 1.
        """
        num_classes = max(labels) + 1
        missing_classes = [i for i in range(num_classes) if i not in labels]
        if len(missing_classes):
            raise ValueError('Missing samples with label value(s) '
                             '{missing_classes}. Please make sure you have '
                             'at least one sample for every label value '
                             'in the range(0, {max_class})'.format(
                                missing_classes=missing_classes,
                                max_class=num_classes - 1))
    
        if num_classes <= 1:
            raise ValueError('Invalid number of labels: {num_classes}. '
                             'Please make sure there are at least two classes '
                             'of samples.'.format(num_classes=num_classes))
        return num_classes
    
    
    
    def _get_embedding_matrix(word_index, embedding_data_dir, embedding_dim):
        """Gets embedding matrix from the embedding index data.
        # Arguments
            word_index: dict, word to index map that was generated from the data.
            embedding_data_dir: string, path to the pre-training embeddings.
            embedding_dim: int, dimension of the embedding vectors.
        # Returns
            dict, word vectors for words in word_index from pre-trained embedding.
        # References:
            https://nlp.stanford.edu/projects/glove/
            Download and uncompress archive from:
            http://nlp.stanford.edu/data/glove.6B.zip
        """
    
        # Read the pre-trained embedding file and get word to word vector mappings.
        embedding_matrix_all = {}
    
        # We are using 200d GloVe embeddings.
        fname = os.path.join(embedding_data_dir, 'glove.6B.200d.txt')
        with open(fname) as f:
            for line in f:  # Every line contains word followed by the vector value
                values = line.split()
                word = values[0]
                coefs = np.asarray(values[1:], dtype='float32')
                embedding_matrix_all[word] = coefs
    
        # Prepare embedding matrix with just the words in our word_index dictionary
        num_words = min(len(word_index) + 1, TOP_K)
        embedding_matrix = np.zeros((num_words, embedding_dim))
    
        for word, i in word_index.items():
            if i >= TOP_K:
                continue
            embedding_vector = embedding_matrix_all.get(word)
            if embedding_vector is not None:
                # words not found in embedding index will be all-zeros.
                embedding_matrix[i] = embedding_vector
        return embedding_matrix
    
    embedding_dim=200
    embedding_matrix = _get_embedding_matrix(word_index, "", embedding_dim)
    
         
    
  • Use pretrained embedding or not


  • 
    use_pretrained_embedding=True
    is_embedding_trainable = False
    
         
    
  • Build model


  • 
    import tensorflow as tf
    import numpy as np

    # Training hyperparameters for the sepCNN model.
    learning_rate = 1e-3
    epochs = 1000
    batch_size = 128
    blocks = 2
    filters = 64
    dropout_rate = 0.2
    kernel_size = 3
    pool_size = 3
     
    num_classes = get_num_classes(y_train)
    # Create model instance.
    
    
    op_units, op_activation = _get_last_layer_units_and_activation(num_classes)
    model = models.Sequential()
    
    # Add embedding layer. If pre-trained embedding is used add weights to the
    # embeddings layer and set trainable to input is_embedding_trainable flag.
    if use_pretrained_embedding:
            model.add(Embedding(input_dim=num_features,
                                output_dim=embedding_dim,
                                input_length=x_train.shape[1],
                                weights=[embedding_matrix],
                                trainable=is_embedding_trainable))
    else:
            model.add(Embedding(input_dim=num_features,
                                output_dim=200,
                                input_length=x_train.shape[1]))
    
    for _ in range(blocks-1):
            model.add(Dropout(rate=dropout_rate))
            model.add(SeparableConv1D(filters=filters,
                                      kernel_size=kernel_size,
                                      activation='relu',
                                      bias_initializer='random_uniform',
                                      depthwise_initializer='random_uniform',
                                      padding='same'))
            model.add(SeparableConv1D(filters=filters,
                                      kernel_size=kernel_size,
                                      activation='relu',
                                      bias_initializer='random_uniform',
                                      depthwise_initializer='random_uniform',
                                      padding='same'))
            model.add(MaxPooling1D(pool_size=pool_size))
    
    model.add(SeparableConv1D(filters=filters * 2,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
    model.add(SeparableConv1D(filters=filters * 2,
                                  kernel_size=kernel_size,
                                  activation='relu',
                                  bias_initializer='random_uniform',
                                  depthwise_initializer='random_uniform',
                                  padding='same'))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(rate=dropout_rate))
    model.add(Dense(op_units, activation=op_activation)) 
    
         
    
  • Compile model with learning parameters: Create a callback for early stopping on validation loss. If the loss does not decrease for two consecutive epochs, stop training.


  • 
    if num_classes == 2:
            loss = 'binary_crossentropy'
    else:
            loss = 'sparse_categorical_crossentropy'
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss=loss, metrics=['acc'])
    
    callbacks = [tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)]
    
         
    
  • Train and validate model.


  • 
    history = model.fit(
                x_train,
                y_train,
                epochs=epochs,
                callbacks=callbacks,
                validation_data=(x_val, y_val),
                verbose=2,  # Logs once per epoch.
                batch_size=batch_size)
    
         
    
  • Save model


  • 
    model.save("my_model")
         
    
  • app.py


  • 
    import pickle
    import numpy as np
    from flask import Flask, request, jsonify, render_template
    from flask_ngrok import run_with_ngrok
    from tensorflow.python.keras import models
    from tensorflow.python.keras.preprocessing import sequence

    # Must match the value the training sequences were padded to.
    MAX_SEQUENCE_LENGTH = 500
    
    
    app = Flask(__name__)
    run_with_ngrok(app)   #starts ngrok when the app is run
    
    
    
    def load_model():
        # Load the tokenizer and trained model once, at startup, rather than on
        # every request; reloading them per request would be needlessly slow.
        global tokenizers
        with open('/content/tokenizer.pickle', 'rb') as handle:
            tokenizers = pickle.load(handle)
        global reconstructed_model
        reconstructed_model = models.load_model("my_model")
    
    @app.route('/')
    def home():
        return render_template('index.html')
    
    
    def preprocess(train_texts):
        # Reuse the tokenizer fitted at training time; do NOT refit it here,
        # or the word-to-index mapping would no longer match the model.
        x = tokenizers.texts_to_sequences(train_texts)
        x = sequence.pad_sequences(x, maxlen=MAX_SEQUENCE_LENGTH)
        return x
    
    
    
    @app.route('/predict', methods=['POST'])
    def predict():
        data = request.form['comment']
        features = [data]
        final_features = preprocess(features)

        # The model ends in a single sigmoid unit, so predict() returns one
        # probability per sample; threshold it at 0.5.
        prediction = reconstructed_model.predict(final_features)[0][0]

        if prediction <= 0.5:
            output = "Negative"
        else:
            output = "Positive"

        return render_template('index.html', label='Sentiment is {}'.format(output))
    
    
    
    if __name__ == "__main__":
        load_model()
        app.run()
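

  • Calling the Keras REST API with cURL and Python: once the app is running, you can exercise the /predict endpoint from any HTTP client. The sketch below uses assumed values: it targets http://localhost:5000 (when the app is launched through flask-ngrok, substitute the public *.ngrok.io URL printed in the console) and sends a made-up review as the comment form field.


  • 
    # Equivalent cURL call (assumed local address):
    #   curl --data-urlencode "comment=What a wonderful, heartwarming movie!" \
    #        http://localhost:5000/predict
    #
    # The same request from Python using the requests library:
    import requests

    KERAS_REST_API_URL = "http://localhost:5000/predict"  # assumed address
    payload = {"comment": "What a wonderful, heartwarming movie!"}

    r = requests.post(KERAS_REST_API_URL, data=payload)
    print(r.status_code)  # 200 on success
    print(r.text)         # the rendered index.html containing the sentiment label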