• The TensorFlow tutorials are written as Jupyter notebooks and run directly in Google Colab, a hosted notebook environment that requires no setup.


  • Load NumPy data: This tutorial provides an example of loading data from NumPy arrays into a `tf.data.Dataset`.


  • This example loads the MNIST dataset from a `.npz` file. However, the source of the NumPy arrays is not important.
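

  • To illustrate that point, here is a minimal sketch using hypothetical arrays generated in memory (`tf.data.Dataset.from_tensor_slices` is introduced below); any array source works the same way:


  • 
    
    import numpy as np
    import tensorflow as tf
    
    # Hypothetical in-memory arrays -- no file involved at all.
    examples = np.random.rand(100, 28, 28).astype(np.float32)
    labels = np.random.randint(0, 10, size=(100,))
    
    dataset = tf.data.Dataset.from_tensor_slices((examples, labels))
     
    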


  • Setup


  • 
    
    import numpy as np
    import tensorflow as tf
     
    
  • Load from `.npz` file


  • 
    
    DATA_URL = 'https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz'
    
    # Download the archive (cached locally by Keras) and read out the four arrays.
    path = tf.keras.utils.get_file('mnist.npz', DATA_URL)
    with np.load(path) as data:
      train_examples = data['x_train']
      train_labels = data['y_train']
      test_examples = data['x_test']
      test_labels = data['y_test']
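     
    
  • Before building a pipeline, it can help to confirm what was loaded. A quick sanity check (the shapes below follow MNIST's standard 60,000/10,000 train/test split):


  • 
    
    # Each example is a 28x28 uint8 image; each label is a digit class 0-9.
    print(train_examples.shape, train_examples.dtype)  # (60000, 28, 28) uint8
    print(train_labels.shape, train_labels.dtype)      # (60000,) uint8
    print(test_examples.shape, test_labels.shape)      # (10000, 28, 28) (10000,)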
     
    
  • Load NumPy arrays with `tf.data.Dataset`


  • Assuming you have an array of examples and a corresponding array of labels, pass the two arrays as a tuple into `tf.data.Dataset.from_tensor_slices` to create a `tf.data.Dataset`.


  • 
    
    # Slice the paired arrays along their first axis into (example, label) elements.
    train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
    test_dataset = tf.data.Dataset.from_tensor_slices((test_examples, test_labels))
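     
    
  • Each element of the resulting dataset is an (example, label) pair. A minimal sketch to peek at one, using the standard `Dataset.take` method:


  • 
    
    # Pull a single element; at this point the dataset is still unbatched.
    for image, label in train_dataset.take(1):
      print(image.shape, label.numpy())  # (28, 28) and a digit in 0-9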
     
    
  • Use the datasets: shuffle and batch them


  • 
    
    BATCH_SIZE = 64
    SHUFFLE_BUFFER_SIZE = 100
    
    # Shuffle the training data, then group elements into batches of 64.
    # The test set is batched but never shuffled.
    train_dataset = train_dataset.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
    test_dataset = test_dataset.batch(BATCH_SIZE)
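     
    
  • Evaluation doesn't depend on example order, which is why only the training set is shuffled. Note that a buffer of 100 only partially shuffles the 60,000 training examples; a buffer at least as large as the dataset gives a uniform shuffle. The batched structure can be confirmed via `element_spec`:


  • 
    
    # After batching, elements carry a leading batch dimension (None = variable).
    print(train_dataset.element_spec)
    # (TensorSpec(shape=(None, 28, 28), dtype=tf.uint8, name=None),
    #  TensorSpec(shape=(None,), dtype=tf.uint8, name=None))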
     
    
  • Build and train a model


  • 
    
    # A simple feed-forward classifier: flatten each 28x28 image, apply one
    # hidden layer, and emit raw logits for the 10 digit classes.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10)
    ])
    
    # from_logits=True tells the loss to expect unnormalized logits.
    model.compile(optimizer=tf.keras.optimizers.RMSprop(),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['sparse_categorical_accuracy'])
    
    model.fit(train_dataset, epochs=10)
    
    model.evaluate(test_dataset)
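     
    
  • Once trained, the model can run inference on a batch. A short sketch (the model outputs logits, so `tf.argmax` over the last axis recovers the predicted digit):


  • 
    
    # Compare predicted digits with the true labels for one test batch.
    for images, labels in test_dataset.take(1):
      logits = model(images, training=False)
      preds = tf.argmax(logits, axis=-1)
      print('predicted:', preds[:10].numpy())
      print('actual:   ', labels[:10].numpy())
     
    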