• The TensorFlow tutorials are written as Jupyter notebooks and run directly in Google Colab—a hosted notebook environment that requires no setup. Click the Run in Google Colab button.


  • ## Overview

    The Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called *hyperparameter tuning* or *hypertuning*.


  • Hyperparameters are the variables that govern the training process and the topology of an ML model. These variables remain constant over the training process and directly impact the performance of your ML program. Hyperparameters are of two types:


  • 1. **Model hyperparameters** which influence model selection such as the number and width of hidden layers


  • 2. **Algorithm hyperparameters** which influence the speed and quality of the learning algorithm such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k Nearest Neighbors (KNN) classifier


  • In this tutorial, you will use the Keras Tuner to perform hypertuning for an image classification application.


  • ## Setup

  • 
    import tensorflow as tf
    from tensorflow import keras
    
    import IPython

  • Install and import the Keras Tuner.

  • 
    !pip install -U keras-tuner
    import kerastuner as kt
     
    
  • ## Download and prepare the dataset

    In this tutorial, you will use the Keras Tuner to find the best hyperparameters for a machine learning model that classifies images of clothing from the [Fashion MNIST dataset].


  • Load the data.


  • 
    (img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()
    
    # Normalize pixel values between 0 and 1
    img_train = img_train.astype('float32') / 255.0
    img_test = img_test.astype('float32') / 255.0
     
    
  • ## Define the model

    When you build a model for hypertuning, you also define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a *hypermodel*.


  • You can define a hypermodel through two approaches:

    * By using a model builder function
    * By subclassing the `HyperModel` class of the Keras Tuner API


  • You can also use two pre-defined `HyperModel` classes - [HyperXception] and [HyperResNet] for computer vision applications.


  • In this tutorial, you use a model builder function to define the image classification model. The model builder function returns a compiled model and uses hyperparameters you define inline to hypertune the model.


  • 
    def model_builder(hp):
      model = keras.Sequential()
      model.add(keras.layers.Flatten(input_shape=(28, 28)))
      
      # Tune the number of units in the first Dense layer
      # Choose an optimal value between 32-512
      hp_units = hp.Int('units', min_value = 32, max_value = 512, step = 32)
      model.add(keras.layers.Dense(units = hp_units, activation = 'relu'))
      model.add(keras.layers.Dense(10))
    
      # Tune the learning rate for the optimizer 
      # Choose an optimal value from 0.01, 0.001, or 0.0001
      hp_learning_rate = hp.Choice('learning_rate', values = [1e-2, 1e-3, 1e-4]) 
      
      model.compile(optimizer = keras.optimizers.Adam(learning_rate = hp_learning_rate),
                    loss = keras.losses.SparseCategoricalCrossentropy(from_logits = True), 
                    metrics = ['accuracy'])
      
      return model
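
  • To get a feel for the size of the search space that `model_builder` defines, the candidate values can be enumerated in plain Python. This is only an illustrative sketch: it mirrors the `hp.Int` and `hp.Choice` ranges above, assumes `max_value` is inclusive, and does not call the Keras Tuner itself. The names `unit_choices`, `learning_rates`, and `search_space` are made up for this example.

```python
from itertools import product

# Candidate values mirroring hp.Int('units', min_value=32, max_value=512, step=32).
# Assumption: max_value is inclusive, so 512 is a valid choice.
unit_choices = list(range(32, 512 + 1, 32))

# Candidate values mirroring hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4]).
learning_rates = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair a tuner could in principle try.
search_space = list(product(unit_choices, learning_rates))

print(len(unit_choices), len(learning_rates), len(search_space))  # 16 3 48
```

  • Even this small hypermodel has 48 distinct configurations, which is why a tuner that stops weak candidates early can save a lot of training time.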
     
    
  • ## Instantiate the tuner and perform hypertuning

    Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available: `RandomSearch`, `Hyperband`, `BayesianOptimization`, and `Sklearn`. In this tutorial, you use the [Hyperband] tuner.


  • To instantiate the Hyperband tuner, you must specify the hypermodel, the `objective` to optimize and the maximum number of epochs to train (`max_epochs`).


  • 
    tuner = kt.Hyperband(model_builder,
                         objective = 'val_accuracy', 
                         max_epochs = 10,
                         factor = 3,
                         directory = 'my_dir',
                         project_name = 'intro_to_kt')                       
     
    
  • The Hyperband tuning algorithm uses adaptive resource allocation and early-stopping to quickly converge on a high-performing model. This is done using a sports championship style bracket.


  • The algorithm trains a large number of models for a few epochs and carries forward only the top-performing fraction of them (about one in every `factor` models) to the next round.
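
  • The elimination idea can be sketched with a toy bracket in plain Python. This is not the Keras Tuner implementation: the random "quality" scores stand in for validation accuracy, the function name `toy_bracket` is hypothetical, and the sketch assumes that the top 1/`factor` of models survive each round, matching the `factor = 3` passed to the tuner above.

```python
import random


def toy_bracket(num_models, factor, rounds, seed=0):
    """Toy elimination bracket: keep the top 1/factor of models each round."""
    rng = random.Random(seed)
    # Each hypothetical model gets a fixed random score standing in for
    # the validation accuracy it would reach after a few epochs.
    survivors = [(f"model_{i}", rng.random()) for i in range(num_models)]
    history = [len(survivors)]
    for _ in range(rounds):
        survivors.sort(key=lambda item: item[1], reverse=True)
        keep = max(1, len(survivors) // factor)
        survivors = survivors[:keep]
        history.append(len(survivors))
    return survivors, history


survivors, history = toy_bracket(num_models=27, factor=3, rounds=3)
print(history)  # [27, 9, 3, 1]
```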


  • Hyperband determines the number of models to train in a bracket by computing 1 + log_`factor`(`max_epochs`), where the logarithm is taken in base `factor`, and rounding it up to the nearest integer.
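
  • As a worked example of that formula, the snippet below evaluates it for the `max_epochs = 10` and `factor = 3` used in this tutorial. The helper name `bracket_value` is made up for illustration; it only reproduces the arithmetic described above and does not call the Keras Tuner.

```python
import math


def bracket_value(max_epochs, factor):
    """Compute 1 + log_factor(max_epochs), rounded up to the nearest integer."""
    return math.ceil(1 + math.log(max_epochs, factor))


# log_3(10) is about 2.096, so 1 + 2.096 = 3.096, which rounds up to 4.
print(bracket_value(max_epochs=10, factor=3))  # 4
```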


  • Before running the hyperparameter search, define a callback to clear the training outputs at the end of every training step.


  • 
    class ClearTrainingOutput(tf.keras.callbacks.Callback):
      def on_train_end(*args, **kwargs):
        IPython.display.clear_output(wait = True)
     
    
  • Run the hyperparameter search. The arguments for the search method are the same as those used for `tf.keras.Model.fit`, in addition to the callback above.


  • 
    tuner.search(img_train, label_train, epochs = 10, validation_data = (img_test, label_test), callbacks = [ClearTrainingOutput()])
    
    # Get the optimal hyperparameters
    best_hps = tuner.get_best_hyperparameters(num_trials = 1)[0]
    
    print(f"""
    The hyperparameter search is complete. The optimal number of units in the first densely-connected
    layer is {best_hps.get('units')} and the optimal learning rate for the optimizer
    is {best_hps.get('learning_rate')}.
    """)
     
    
  • To finish this tutorial, retrain the model with the optimal hyperparameters from the search.


  • 
    # Build the model with the optimal hyperparameters and train it on the data
    model = tuner.hypermodel.build(best_hps)
    model.fit(img_train, label_train, epochs = 10, validation_data = (img_test, label_test))
     
    
  • The `my_dir/intro_to_kt` directory contains detailed logs and checkpoints for every trial (model configuration) run during the hyperparameter search. If you re-run the hyperparameter search, the Keras Tuner uses the existing state from these logs to resume the search. To disable this behavior, pass an additional `overwrite = True` argument while instantiating the tuner.


  • ## Summary

    In this tutorial, you learned how to use the Keras Tuner to tune hyperparameters for a model. To learn more about the Keras Tuner, check out these additional resources:

  • * [Keras Tuner on the TensorFlow blog]
    * [Keras Tuner website]

  • Also check out the [HParams Dashboard] in TensorBoard to interactively tune your model hyperparameters.


  • Perform hyperparameter tuning for a single-layer dense neural network using random search.


  • First, we define a model-building function. It takes an argument hp from which you can sample hyperparameters, such as hp.Int('units', min_value=32, max_value=512, step=32) (an integer from a certain range).


  • This function returns a compiled model.


  • 
    from tensorflow import keras
    from tensorflow.keras import layers
    from kerastuner.tuners import RandomSearch
    
    
    def build_model(hp):
        model = keras.Sequential()
        model.add(layers.Dense(units=hp.Int('units',
                                            min_value=32,
                                            max_value=512,
                                            step=32),
                               activation='relu'))
        model.add(layers.Dense(10, activation='softmax'))
        model.compile(
            optimizer=keras.optimizers.Adam(
                hp.Choice('learning_rate',
                          values=[1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
        return model    
     
    
  • Next, instantiate a tuner. You should specify the model-building function, the name of the objective to optimize (whether to minimize or maximize is automatically inferred for built-in metrics), the total number of trials (max_trials) to test, and the number of models that should be built and fit for each trial (executions_per_trial).


  • Available tuners include `RandomSearch` and `Hyperband`.


  • Note: the purpose of having multiple executions per trial is to reduce results variance and therefore be able to more accurately assess the performance of a model. If you want to get results faster, you could set executions_per_trial=1 (single round of training for each model configuration).
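
  • The variance argument can be illustrated with a small simulation in plain Python. This is a sketch under stated assumptions, not Keras Tuner code: `noisy_score` is a hypothetical stand-in for one training run, the noise level is invented, and the executions of a trial are assumed to be averaged into a single score.

```python
import random
import statistics


def noisy_score(rng, true_score=0.85, noise=0.02):
    """Stand-in for one training run: the true score plus run-to-run noise."""
    return true_score + rng.gauss(0, noise)


def trial_score(rng, executions_per_trial):
    """Average several runs of the same configuration into one trial score."""
    return statistics.mean(noisy_score(rng) for _ in range(executions_per_trial))


rng = random.Random(42)
single = [trial_score(rng, 1) for _ in range(2000)]
averaged = [trial_score(rng, 3) for _ in range(2000)]

spread_single = statistics.stdev(single)
spread_averaged = statistics.stdev(averaged)
print(f"1 execution: {spread_single:.4f}, 3 executions: {spread_averaged:.4f}")

# The cost of the extra stability: max_trials=5 with executions_per_trial=3
# means 15 model fits in total.
total_fits = 5 * 3
```

  • Averaging three executions narrows the spread of the trial score, at the price of three times as many model fits.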


  • 
    
    tuner = RandomSearch(
        build_model,
        objective='val_accuracy',
        max_trials=5,
        executions_per_trial=3,
        directory='my_dir',
        project_name='helloworld')
     
    
  • You can print a summary of the search space:


  • 
    tuner.search_space_summary()
     
    
  • Then, start the search for the best hyperparameter configuration. The call to search has the same signature as model.fit().


  • 
    tuner.search(x, y,
                 epochs=5,
                 validation_data=(val_x, val_y))
     
    
  • Here's what happens in search: models are built iteratively by calling the model-building function, which populates the hyperparameter space (search space) tracked by the hp object.


  • The tuner progressively explores the space, recording metrics for each configuration.
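
  • The loop described above can be pictured with a toy random search in plain Python. It is a simplified sketch, not how the Keras Tuner is implemented: `fake_val_accuracy` is an invented scoring function that stands in for building and fitting a model, and all names in the block are hypothetical.

```python
import random


def sample_config(rng):
    """Draw one configuration from the same ranges as the build_model example."""
    return {
        "units": rng.choice(range(32, 512 + 1, 32)),
        "learning_rate": rng.choice([1e-2, 1e-3, 1e-4]),
    }


def fake_val_accuracy(config):
    """Invented score: rewards more units and a learning rate of 1e-3."""
    lr_penalty = 0.0 if config["learning_rate"] == 1e-3 else 0.05
    return 0.80 + 0.1 * config["units"] / 512 - lr_penalty


def toy_random_search(max_trials, seed=0):
    """Sample, score, and record max_trials configurations; return best first."""
    rng = random.Random(seed)
    trials = []
    for trial_id in range(max_trials):
        config = sample_config(rng)
        trials.append({
            "trial_id": trial_id,
            "config": config,
            "val_accuracy": fake_val_accuracy(config),
        })
    return sorted(trials, key=lambda t: t["val_accuracy"], reverse=True)


results = toy_random_search(max_trials=5)
best = results[0]
print(best["config"], round(best["val_accuracy"], 3))
```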


  • When search is over, you can retrieve the best model(s):


  • 
    models = tuner.get_best_models(num_models=2)
     
    
  • Or print a summary of the results:


  • 
    tuner.results_summary()
     
    
  • You will also find detailed logs, checkpoints, etc., in the folder `my_dir/helloworld`, i.e. `directory/project_name`.


  • The search space may contain conditional hyperparameters


  • Below, we have a for loop creating a tunable number of layers, which themselves involve a tunable units parameter.


  • This can be pushed to any level of parameter interdependency, including recursion.


  • Note that all parameter names should be unique (here, in the loop over i, we name the inner parameters 'units_' + str(i)).


  • 
    def build_model(hp):
        model = keras.Sequential()
        for i in range(hp.Int('num_layers', 2, 20)):
            model.add(layers.Dense(units=hp.Int('units_' + str(i),
                                                min_value=32,
                                                max_value=512,
                                                step=32),
                                   activation='relu'))
        model.add(layers.Dense(10, activation='softmax'))
        model.compile(
            optimizer=keras.optimizers.Adam(
                hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
        return model
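
  • To see which parameter names the loop registers, the sketch below replays it against a tiny recording object. `RecordingHP` is a hypothetical stand-in for the `hp` object, written only for this illustration: it returns `min_value` unless an override is supplied and does not reproduce the Keras Tuner's behavior beyond that.

```python
class RecordingHP:
    """Hypothetical stand-in for hp: records names and returns simple values."""

    def __init__(self, overrides=None):
        self.overrides = overrides or {}
        self.registered = []

    def Int(self, name, min_value, max_value, step=1):
        # Record each name once, in the order it is first requested.
        if name not in self.registered:
            self.registered.append(name)
        return self.overrides.get(name, min_value)


def register_layers(hp):
    """Mirror the loop in build_model without building any Keras layers."""
    widths = []
    for i in range(hp.Int('num_layers', 2, 20)):
        widths.append(hp.Int('units_' + str(i),
                             min_value=32, max_value=512, step=32))
    return widths


hp = RecordingHP(overrides={'num_layers': 3, 'units_1': 64})
print(register_layers(hp))  # [32, 64, 32]
print(hp.registered)        # ['num_layers', 'units_0', 'units_1', 'units_2']
```

  • Because each layer asks for its own name, the widths can differ per layer. If every layer asked for the same name `'units'`, only one entry would be recorded and all layers would share a single width.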
     
    
  • You can use a HyperModel subclass instead of a model-building function


  • This makes it easy to share and reuse hypermodels.


  • A HyperModel subclass only needs to implement a build(self, hp) method.


  • 
    from kerastuner import HyperModel
    
    
    class MyHyperModel(HyperModel):
    
        def __init__(self, num_classes):
            # Let the base class set up its own attributes first
            super().__init__()
            self.num_classes = num_classes
    
        def build(self, hp):
            model = keras.Sequential()
            model.add(layers.Dense(units=hp.Int('units',
                                                min_value=32,
                                                max_value=512,
                                                step=32),
                                   activation='relu'))
            model.add(layers.Dense(self.num_classes, activation='softmax'))
            model.compile(
                optimizer=keras.optimizers.Adam(
                    hp.Choice('learning_rate',
                              values=[1e-2, 1e-3, 1e-4])),
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
            return model
    
    
    hypermodel = MyHyperModel(num_classes=10)
    
    tuner = RandomSearch(
        hypermodel,
        objective='val_accuracy',
        max_trials=10,
        directory='my_dir',
        project_name='helloworld')
    
    tuner.search(x, y,
                 epochs=5,
                 validation_data=(val_x, val_y))
     
    
  • Keras Tuner includes pre-made tunable applications: HyperResNet and HyperXception


  • These are ready-to-use hypermodels for computer vision.


  • They come pre-compiled with loss="categorical_crossentropy" and metrics=["accuracy"].


  • 
    from kerastuner.applications import HyperResNet
    from kerastuner.tuners import Hyperband
    
    hypermodel = HyperResNet(input_shape=(128, 128, 3), num_classes=10)
    
    tuner = Hyperband(
        hypermodel,
        objective='val_accuracy',
        max_epochs=40,  # maximum number of epochs to train any one model
        directory='my_dir',
        project_name='helloworld')
    
    tuner.search(x, y,
                 epochs=20,
                 validation_data=(val_x, val_y))
     
    
  • You can easily restrict the search space to just a few parameters


  • If you have an existing hypermodel, and you want to search over only a few parameters (such as the learning rate), you can do so by passing a hyperparameters argument to the tuner constructor, as well as tune_new_entries=False to specify that parameters that you didn't list in hyperparameters should not be tuned. For these parameters, the default value gets used.


  • 
    from kerastuner import HyperParameters
    from kerastuner.applications import HyperXception
    
    hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
    
    hp = HyperParameters()
    # This will override the `learning_rate` parameter with your
    # own selection of choices
    hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    
    tuner = Hyperband(
        hypermodel,
        hyperparameters=hp,
        # `tune_new_entries=False` prevents unlisted parameters from being tuned
        tune_new_entries=False,
        objective='val_accuracy',
        max_epochs=40,  # maximum number of epochs to train any one model
        directory='my_dir',
        project_name='helloworld')
    
    tuner.search(x, y,
                 epochs=20,
                 validation_data=(val_x, val_y))
     
    
  • Whenever you register a hyperparameter inside a model-building function or the build method of a hypermodel, you can specify a default value:


  • 
    hp.Int('units',
           min_value=32,
           max_value=512,
           step=32,
           default=128)
     
    
  • If you don't, the hyperparameter falls back to a built-in default (for `Int`, it is equal to `min_value`).
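
  • The interplay between defaults and restricting the search can be sketched in plain Python. This is a toy model of the behavior described above, not Keras Tuner code: `resolve_values` and its arguments are hypothetical, and it only covers `Int`-style parameters whose fallback is `min_value`.

```python
def resolve_values(declared, tuned_names, sampled):
    """Toy resolution: tuned entries use sampled values, the rest use defaults.

    declared: name -> dict with 'min_value' and an optional 'default'
    tuned_names: names the search is allowed to vary
    sampled: name -> value proposed by a (hypothetical) search step
    """
    values = {}
    for name, spec in declared.items():
        if name in tuned_names:
            values[name] = sampled[name]
        else:
            # For an Int, the fallback is min_value unless a default was given.
            values[name] = spec.get('default', spec['min_value'])
    return values


declared = {
    'units': {'min_value': 32, 'max_value': 512, 'default': 128},
    'num_layers': {'min_value': 2, 'max_value': 20},
}

# Only 'units' is tuned; 'num_layers' falls back to its min_value.
print(resolve_values(declared, tuned_names={'units'}, sampled={'units': 256}))
# Nothing is tuned; 'units' uses its explicit default of 128.
print(resolve_values(declared, tuned_names=set(), sampled={}))
```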


  • Fixing values in a hypermodel

  • What if you want to do the reverse: tune all available parameters in a hypermodel except one (the learning rate)?


  • Pass a hyperparameters argument with a Fixed entry (or any number of Fixed entries), and specify tune_new_entries=True.


  • 
    hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
    
    hp = HyperParameters()
    hp.Fixed('learning_rate', value=1e-4)
    
    tuner = Hyperband(
        hypermodel,
        hyperparameters=hp,
        tune_new_entries=True,
        objective='val_accuracy',
        max_epochs=40,  # maximum number of epochs to train any one model
        directory='my_dir',
        project_name='helloworld')
    
    tuner.search(x, y,
                 epochs=20,
                 validation_data=(val_x, val_y))
     
    
  • Overriding compilation arguments

  • If you have a hypermodel for which you want to change the existing optimizer, loss, or metrics, you can do so by passing these arguments to the tuner constructor:


  • 
    hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
    
    tuner = Hyperband(
        hypermodel,
        optimizer=keras.optimizers.Adam(1e-3),
        loss='mse',
        metrics=[keras.metrics.Precision(name='precision'),
                 keras.metrics.Recall(name='recall')],
        objective='val_precision',
        max_epochs=40,  # maximum number of epochs to train any one model
        directory='my_dir',
        project_name='helloworld')
    
    tuner.search(x, y,
                 epochs=20,
                 validation_data=(val_x, val_y))