{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Notebook 12: Identifying Phases in the 2D Ising Model with TensorFlow\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learning Goal\n", "The goal of this notebook is to familiarize the reader with the nuts and bolts of using the TensorFlow package to build Deep Neural Networks.\n", "\n", "## Overview\n", "\n", "In this notebook, we show how one can use deep neural nets to classify the states of the 2D Ising model according to their phase. This should be compared with the use of logistic regression, Random Forests and XGBoost on the same dataset in the previous Notebooks 6 and 9.\n", "\n", "The Hamiltonian for the classical Ising model is given by\n", "\n", "$$ H = -J\\sum_{\\langle ij\\rangle}S_{i}S_j,\\qquad \\qquad S_j\\in\\{\\pm 1\\} $$\n", "\n", "where the sum over $\\langle ij\\rangle$ runs over all pairs of nearest-neighbor sites of a 2D square lattice, and $J$ is some arbitrary interaction energy scale. We adopt periodic boundary conditions. Onsager proved that this model undergoes a phase transition in the thermodynamic limit from an ordered ferromagnet with all spins aligned to a disordered phase at the critical temperature $T_c/J=2/\\log(1+\\sqrt{2})\\approx 2.27$. For any finite system size, this critical point is smeared out into a critical region around $T_c$." 
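, "\n", "\n", "As a quick numerical check of the critical temperature quoted above (a worked evaluation of Onsager's formula, with $\\log$ denoting the natural logarithm):\n", "\n", "$$ \\frac{T_c}{J}=\\frac{2}{\\log(1+\\sqrt{2})}\\approx\\frac{2}{0.8814}\\approx 2.269 $$"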
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# -*- coding: utf-8 -*-\n", "from __future__ import absolute_import, division, print_function\n", "import numpy as np\n", "seed=12\n", "np.random.seed(seed)\n", "import sys, os, argparse\n", "import tensorflow as tf\n", "from tensorflow.python.framework import dtypes\n", "# suppress tflow compilation warnings\n", "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", "\n", "tf.set_random_seed(seed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Structure of the Procedure\n", "\n", "Constructing a Deep Neural Network to solve ML problems is a multi-stage process. Quite generally, one can identify the key steps as follows:\n", "\n", "* ***step 1:*** Load and process the data\n", "* ***step 2:*** Define the model and its architecture\n", "* ***step 3:*** Choose the optimizer and the cost function\n", "* ***step 4:*** Train the model \n", "* ***step 5:*** Evaluate the model performance on the *unseen* test data\n", "* ***step 6:*** Modify the hyperparameters to optimise performance for the specific data set\n", "\n", "Below, we sometimes combine some of these steps together for convenience.\n", "\n", "Notice that we take a rather different approach, compared to the simpler MNIST Keras notebook. We first define a set of classes and functions and run the actual computation only at the very end." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: Load and Process the Data\n", "\n", "We begin by writing a `DataSet` class and two functions, `load_data` and `prepare_data`, to process the 2D Ising data. \n", "\n", "The `DataSet` class performs checks on the data shape and casts the data into the correct data type for the calculation. It contains a method called `next_batch` which returns a mini-batch of a pre-defined size and reshuffles the data whenever an epoch has been completed. This structure is particularly useful for the training procedure in TensorFlow." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "class DataSet(object):\n", "\n", "    def __init__(self, data_X, data_Y, dtype=dtypes.float32):\n", "        \"\"\"Checks data and casts it into correct data type. \"\"\"\n", "\n", "        dtype = dtypes.as_dtype(dtype).base_dtype\n", "        if dtype not in (dtypes.uint8, dtypes.float32):\n", "            raise TypeError('Invalid dtype %r, expected uint8 or float32' % dtype)\n", "\n", "        assert data_X.shape[0] == data_Y.shape[0], ('data_X.shape: %s data_Y.shape: %s' % (data_X.shape, data_Y.shape))\n", "        self.num_examples = data_X.shape[0]\n", "\n", "        if dtype == dtypes.float32:\n", "            data_X = data_X.astype(np.float32)\n", "        self.data_X = data_X\n", "        self.data_Y = data_Y \n", "\n", "        self.epochs_completed = 0\n", "        self.index_in_epoch = 0\n", "\n", "    def next_batch(self, batch_size, seed=None):\n", "        \"\"\"Return the next `batch_size` examples from this data set.\"\"\"\n", "\n", "        # re-seed the RNG only if a seed was passed explicitly (seed=0 is a valid seed)\n", "        if seed is not None:\n", "            np.random.seed(seed)\n", "\n", "        start = self.index_in_epoch\n", "        self.index_in_epoch += batch_size\n", "        if self.index_in_epoch > self.num_examples:\n", "            # Finished epoch\n", "            self.epochs_completed += 1\n", "            # Shuffle the data\n", "            perm = np.arange(self.num_examples)\n", "            np.random.shuffle(perm)\n", "            self.data_X = self.data_X[perm]\n", "            self.data_Y = self.data_Y[perm]\n", "            # Start next epoch\n", "            start = 0\n", "            self.index_in_epoch = batch_size\n", "            assert batch_size <= self.num_examples\n", "        end = self.index_in_epoch\n", "\n", "        return self.data_X[start:end], self.data_Y[start:end]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data themselves are processed in the function `prepare_data`, which takes the Ising dataset returned by `load_data` and splits it into three subsets: ordered, critical and disordered, depending on the temperature which sets the distribution they are drawn from. Once again, we use the ordered and disordered data to create a training and a test data set for the problem. 
Classifying the states in the critical region is expected to be harder and we only use this data to test the performance of our model at the end." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import pickle, os\n", "os.environ['KMP_DUPLICATE_LIB_OK']='True'\n", "from urllib.request import urlopen \n", "\n", "def load_data():\n", "\n", "    # path to data directory (for testing)\n", "    #path_to_data=os.path.expanduser('~')+'/Dropbox/MachineLearningReview/Datasets/isingMC/'\n", "\n", "    url_main = 'https://physics.bu.edu/~pankajm/ML-Review-Datasets/isingMC/'\n", "\n", "    ######### LOAD DATA\n", "    # The data consists of 16*10000 samples taken in T=np.arange(0.25,4.0001,0.25):\n", "    data_file_name = \"Ising2DFM_reSample_L40_T=All.pkl\" \n", "    # The labels are obtained from the following file:\n", "    label_file_name = \"Ising2DFM_reSample_L40_T=All_labels.pkl\"\n", "\n", "    #DATA\n", "    data = pickle.load(urlopen(url_main + data_file_name)) # pickle reads the file and returns the Python object (1D array, compressed bits)\n", "    data = np.unpackbits(data).reshape(-1, 1600) # Decompress array and reshape for convenience\n", "    data=data.astype('int')\n", "    data[np.where(data==0)]=-1 # map 0 state to -1 (Ising variable can take values +/-1)\n", "\n", "    #LABELS (convention is 1 for ordered states and 0 for disordered states)\n", "    labels = pickle.load(urlopen(url_main + label_file_name)) # pickle reads the file and returns the Python object (here just a 1D array with the binary labels)\n", "    \n", "    print(\"Finished loading data\")\n", "    return data, labels\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "import pickle\n", "from sklearn.model_selection import train_test_split\n", "from keras.utils import to_categorical\n", "\n", "def prepare_data(data, labels, dtype=dtypes.float32, 
test_size=0.2, validation_size=5000):\n", "    \n", "    L=40 # linear system size\n", "\n", "    # divide data into ordered, critical and disordered\n", "    X_ordered=data[:70000,:]\n", "    Y_ordered=labels[:70000]\n", "\n", "    X_critical=data[70000:100000,:]\n", "    Y_critical=labels[70000:100000]\n", "\n", "    X_disordered=data[100000:,:]\n", "    Y_disordered=labels[100000:]\n", "\n", "    # define training and test data sets\n", "    X=np.concatenate((X_ordered,X_disordered)) #np.concatenate((X_ordered,X_critical,X_disordered))\n", "    Y=np.concatenate((Y_ordered,Y_disordered)) #np.concatenate((Y_ordered,Y_critical,Y_disordered))\n", "\n", "    # pick random data points from ordered and disordered states to create the training and test sets\n", "    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size, train_size=1.0-test_size)\n", "\n", "    # make data categorical (i.e [0,1] or [1,0])\n", "    Y_train=to_categorical(Y_train)\n", "    Y_test=to_categorical(Y_test)\n", "    Y_critical=to_categorical(Y_critical)\n", "\n", "\n", "    if not 0 <= validation_size <= len(X_train):\n", "        raise ValueError('Validation size should be between 0 and {}. Received: {}.'.format(len(X_train), validation_size))\n", "\n", "    # split off the first `validation_size` training samples as a validation set\n", "    X_validation = X_train[:validation_size]\n", "    Y_validation = Y_train[:validation_size]\n", "    X_train = X_train[validation_size:]\n", "    Y_train = Y_train[validation_size:]\n", "\n", "    # create data sets\n", "    dataset = {\n", "        'train':DataSet(X_train, Y_train),\n", "        'test':DataSet(X_test, Y_test),\n", "        'critical':DataSet(X_critical, Y_critical),\n", "        'validation':DataSet(X_validation, Y_validation)\n", "    }\n", "\n", "    return dataset\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The functions `load_data` and `prepare_data` are wrapped in another function, `prepare_Ising_DNN`, which fixes the test fraction and the size of the validation data set. Note that `load_data` contains the URL from which the Ising data are downloaded." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def prepare_Ising_DNN():\n", "    data, labels = load_data()\n", "    return prepare_data(data, labels, test_size=0.2, validation_size=5000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Steps 2+3: Define the Neural Net and its Architecture, Choose the Optimizer and the Cost Function\n", "\n", "We can now move on to construct our deep neural net using TensorFlow. To do this, we create a class called `model`. This class contains several methods which break down the construction of the DNN. Unique to TensorFlow is creating placeholders for the inputs of the model, such as the feed-in data `self.X` and `self.Y` or the dropout keep probability `self.dropout_keepprob` (which has to be set to unity explicitly during testing). Another peculiarity is using `tf.name_scope` inside a `with` block to give names to the most important operators. While we do not discuss this here, TensorFlow also allows one to visualise the computational graph for the model (see package documentation on [https://www.tensorflow.org/](https://www.tensorflow.org/)).\n", "\n", "To classify whether a given spin configuration is in the ordered or disordered phase, we construct a minimalistic model for a DNN with a single hidden layer containing $N_\\mathrm{neurons}$ neurons (which is kept variable so we can try out the performance of different sizes for the hidden layer). \n", "\n", "First, we define two private functions: `_weight_variable` and `_bias_variable`, which we use to set up the precise DNN architecture in the function `create_DNN`. The network architecture thus includes the input layer, a ReLU-activated hidden layer, and the softmax output layer. 
Notice that the softmax layer is _not_ part of the `create_DNN` function.\n", "\n", "Instead, the softmax layer is part of the function `create_loss` which, as the name suggests, defines the cross entropy loss function, predefined in TensorFlow's `nn` module. We minimize the cost function using the SGD optimizer (`GradientDescentOptimizer`) from the `train` module in the function `create_optimiser`. The latter accepts a dictionary `opt_kwargs` with optimizer arguments to be set externally when defining the DNN.\n", "\n", "Last, the function `create_accuracy` evaluates the model performance.\n", "\n", "All these functions are called in the `__init__` of our `model` class which sets up the DNN. It accepts the number of hidden neurons $N_\\mathrm{neurons}$ and a dictionary with the optimizer arguments as input, as we shall study the performance of the DNN as a function of these parameters." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "class model(object):\n", "    def __init__(self, N_neurons, opt_kwargs):\n", "        \"\"\"Builds the TFlow graph for the DNN.\n", "\n", "        N_neurons: number of neurons in the hidden layer\n", "        opt_kwargs: optimizer's arguments\n", "\n", "        \"\"\" \n", "\n", "        # define global step for checkpointing\n", "        self.global_step=tf.Variable(0, dtype=tf.int32, trainable=False, name='global_step')\n", "\n", "        self.L=40 # system linear size\n", "        self.n_feats=self.L**2 # 40x40 square lattice\n", "        self.n_categories=2 # 2 Ising phases: ordered and disordered\n", "\n", "\n", "        # create placeholders for input X and label Y\n", "        self.create_placeholders()\n", "        # create weights (truncated normal, std 0.1) and biases (constant 0.1) and construct DNN to predict Y from X\n", "        self.deep_layer_neurons=N_neurons\n", "        self.create_DNN()\n", "        # define loss function\n", "        self.create_loss()\n", "        # use gradient descent to minimize loss\n", "        self.create_optimiser(opt_kwargs)\n", "        # create accuracy\n", "        self.create_accuracy()\n", "\n", "\n", "    def 
create_placeholders(self):\n", "        with tf.name_scope('data'):\n", "            # input layer\n", "            self.X=tf.placeholder(tf.float32, shape=(None, self.n_feats), name=\"X_data\")\n", "            # target\n", "            self.Y=tf.placeholder(tf.float32, shape=(None, self.n_categories), name=\"Y_data\")\n", "            # dropout keep probability (fed at run time; note that create_DNN below does not apply it to any layer)\n", "            self.dropout_keepprob=tf.placeholder(tf.float32, name=\"keep_probability\")\n", "\n", "\n", "    def _weight_variable(self, shape, name='', dtype=tf.float32):\n", "        \"\"\"weight_variable generates a weight variable of a given shape.\"\"\"\n", "        # weights are drawn from a truncated normal distribution with std 0.1 and mean 0.\n", "        initial = tf.truncated_normal(shape, stddev=0.1)\n", "        return tf.Variable(initial, dtype=dtype, name=name)\n", "\n", "\n", "    def _bias_variable(self, shape, name='', dtype=tf.float32):\n", "        \"\"\"bias_variable generates a bias variable of a given shape.\"\"\"\n", "        # biases are initialized to the constant value 0.1\n", "        initial = tf.constant(0.1, shape=shape) \n", "        return tf.Variable(initial, dtype=dtype, name=name)\n", "\n", "    \n", "    def create_DNN(self):\n", "        with tf.name_scope('DNN'):\n", "\n", "            # Fully connected hidden layer with ReLU activation\n", "            W_fc1 = self._weight_variable([self.n_feats, self.deep_layer_neurons],name='fc1',dtype=tf.float32)\n", "            b_fc1 = self._bias_variable([self.deep_layer_neurons],name='fc1',dtype=tf.float32)\n", "\n", "            a_fc1 = tf.nn.relu(tf.matmul(self.X, W_fc1) + b_fc1)\n", "\n", "            # Output layer producing the logits (the softmax is applied in the loss function)\n", "            W_fc2 = self._weight_variable([self.deep_layer_neurons, self.n_categories],name='fc2',dtype=tf.float32)\n", "            b_fc2 = self._bias_variable([self.n_categories],name='fc2',dtype=tf.float32)\n", "        \n", "            self.Y_predicted = tf.matmul(a_fc1, W_fc2) + b_fc2\n", "\n", "    \n", "    def create_loss(self):\n", "        with tf.name_scope('loss'):\n", "            self.loss = tf.reduce_mean(\n", "                tf.nn.softmax_cross_entropy_with_logits_v2(labels=self.Y, logits=self.Y_predicted)\n", "            )\n", "            # no need to use tf.stop_gradient() on labels because labels are placeholders and contain no params\n", "            # to be optimized. 
Backprop will be applied only to the logits. \n", "\n", "    def create_optimiser(self,opt_kwargs):\n", "        with tf.name_scope('optimiser'):\n", "            self.optimizer = tf.train.GradientDescentOptimizer(**opt_kwargs).minimize(self.loss,global_step=self.global_step) \n", "            #self.optimizer = tf.train.AdamOptimizer(**opt_kwargs).minimize(self.loss,global_step=self.global_step)\n", "\n", "    def create_accuracy(self):\n", "        with tf.name_scope('accuracy'):\n", "            correct_prediction = tf.equal(tf.argmax(self.Y, 1), tf.argmax(self.Y_predicted, 1))\n", "            correct_prediction = tf.cast(correct_prediction, tf.float64) # change data type\n", "            self.accuracy = tf.reduce_mean(correct_prediction)\n", " \n", " \n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Steps 4+5: Train the Model and Evaluate its Performance\n", "\n", "We want to evaluate the performance of our model over a set of different learning rates and a set of different hidden-layer sizes. Therefore, we create a function `evaluate_model` which trains and evaluates the performance of our DNN for a fixed number of hidden `neurons` and a fixed SGD learning rate `lr`, and returns the final loss and accuracy for the three data sets of interest.\n", "\n", "Apart from the number of `neurons` and the learning rate `lr`, `evaluate_model` accepts the data `Ising_Data`. This is done for convenience: loading the data is computationally expensive and we only need to do this once.\n", "\n", "We train our DNN using mini-batches of size $100$ for a total of $100$ training iterations, which we define first. These are named `training_epochs` in the code, although each iteration processes a single mini-batch rather than a full pass over the training set. We then set up the optimizer parameter dictionary `opt_params`, and use it to create a DNN model. \n", "\n", "Running TensorFlow requires opening up a `Session` which we abbreviate as `sess`. All operations are performed in this session by calling the `run` method. 
First, we initialize the global variables in TensorFlow's computational graph by running the `global_variables_initializer`. To train the DNN, we run a loop of `training_epochs` iterations. In each iteration, we use the `next_batch` function of the `DataSet` class we defined above to create a mini-batch. The forward and backward passes through the network are performed by running the `DNN.loss` and `DNN.optimizer` operations. To pass the mini-batch as well as any other external parameters, we use the `feed_dict` dictionary. Similarly, we evaluate the model performance by running the `DNN.accuracy` operation on the same mini-batch data. Note that the dropout keep probability for testing is set to unity. \n", "\n", "Once the training loop has finished, we test the final performance on the entire training, test and critical data sets. This is done in the same way as above.\n", "\n", "Last, we return the loss and accuracy for each of the training, test and critical data sets." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def evaluate_model(neurons, lr, Ising_Data, verbose):\n", "    \"\"\"This function trains a DNN to solve the Ising classification problem\n", "\n", "    neurons: number of hidden neurons\n", "    lr: SGD learning rate\n", "    Ising_Data: Ising data set\n", "    verbose (bool): toggles output during the calculation \n", "\n", "    \"\"\"\n", "\n", "    training_epochs=100\n", "    batch_size=100\n", "\n", "    # SGD learning params\n", "    opt_params=dict(learning_rate=lr)\n", "\n", "    # create DNN\n", "    DNN=model(neurons,opt_params)\n", "\n", "    with tf.Session() as sess:\n", "\n", "        # initialize the necessary variables, in this case, w and b\n", "        sess.run(tf.global_variables_initializer())\n", "\n", "        # train the DNN (one mini-batch per iteration)\n", "        for epoch in range(training_epochs): \n", "\n", "            batch_X, batch_Y = Ising_Data['train'].next_batch(batch_size,seed=seed)\n", "\n", "            loss_batch, _ = sess.run([DNN.loss,DNN.optimizer], \n", "                feed_dict={DNN.X: batch_X,\n", "                           
DNN.Y: batch_Y, \n", "                           DNN.dropout_keepprob: 0.5} )\n", "            accuracy = sess.run(DNN.accuracy, \n", "                feed_dict={DNN.X: batch_X,\n", "                           DNN.Y: batch_Y, \n", "                           DNN.dropout_keepprob: 1.0} )\n", "\n", "            # count training step\n", "            step = sess.run(DNN.global_step)\n", "\n", "\n", "        # test DNN performance on entire train, test and critical data sets\n", "        # (the dropout keep probability is set to 1.0 for every evaluation)\n", "        train_loss, train_accuracy = sess.run([DNN.loss, DNN.accuracy], \n", "            feed_dict={DNN.X: Ising_Data['train'].data_X,\n", "                       DNN.Y: Ising_Data['train'].data_Y,\n", "                       DNN.dropout_keepprob: 1.0}\n", "            )\n", "        if verbose: print(\"train loss/accuracy:\", train_loss, train_accuracy)\n", "\n", "        test_loss, test_accuracy = sess.run([DNN.loss, DNN.accuracy], \n", "            feed_dict={DNN.X: Ising_Data['test'].data_X,\n", "                       DNN.Y: Ising_Data['test'].data_Y,\n", "                       DNN.dropout_keepprob: 1.0}\n", "            )\n", "\n", "        if verbose: print(\"test loss/accuracy:\", test_loss, test_accuracy)\n", "\n", "        critical_loss, critical_accuracy = sess.run([DNN.loss, DNN.accuracy], \n", "            feed_dict={DNN.X: Ising_Data['critical'].data_X,\n", "                       DNN.Y: Ising_Data['critical'].data_Y,\n", "                       DNN.dropout_keepprob: 1.0}\n", "            )\n", "        if verbose: print(\"critical loss/accuracy:\", critical_loss, critical_accuracy)\n", "\n", "\n", "    return train_loss,train_accuracy,test_loss,test_accuracy,critical_loss,critical_accuracy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 6: Modify the Hyperparameters to Optimize Performance of the Model\n", "\n", "To study the dependence of our DNN on some of the hyperparameters, we do a grid search over the number of neurons in the hidden layer, and different SGD learning rates. As we explained in Sec. IX, these searches are best done over logarithmically-spaced points. \n", "\n", "Since we created the `evaluate_model` function with this in mind, below we simply loop over the grid values and call `evaluate_model`." 
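, "\n", "\n", "As a minimal sketch of what logarithmic spacing means here, the snippet below builds the same two axes with `np.logspace` that `grid_search` uses further down and counts the resulting grid points. The list `grid` is a hypothetical name introduced only for this illustration; nothing is trained.\n", "\n", "```python\n", "import numpy as np\n", "\n", "# hidden-layer sizes spanning three decades: 1, 10, 100, 1000\n", "N_neurons = np.logspace(0, 3, 4).astype('int')\n", "# SGD learning rates spanning five decades: 1e-6, ..., 1e-1\n", "learning_rates = np.logspace(-6, -1, 6)\n", "\n", "# hypothetical helper list: every (neurons, lr) pair visited by the grid search\n", "grid = [(n, lr) for n in N_neurons for lr in learning_rates]\n", "print(len(grid), 'grid points')\n", "```\n", "\n", "Each decade contributes one point per axis, so widely different scales are probed with only $4\\times 6$ training runs."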
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def grid_search(verbose):\n", "    \"\"\"This function performs a grid search over a set of different learning rates \n", "    and a number of hidden layer neurons.\"\"\"\n", "\n", "    # load Ising data\n", "    Ising_Data = prepare_Ising_DNN()\n", "    #Ising_Data=load_data()\n", "\n", "    # perform grid search over learning rate and number of hidden neurons\n", "    N_neurons=np.logspace(0,3,4).astype('int') # check number of neurons over multiple decades\n", "    learning_rates=np.logspace(-6,-1,6)\n", "\n", "    # pre-allocate variables to store accuracy and loss data\n", "    train_loss=np.zeros((len(N_neurons),len(learning_rates)),dtype=np.float64)\n", "    train_accuracy=np.zeros_like(train_loss)\n", "    test_loss=np.zeros_like(train_loss)\n", "    test_accuracy=np.zeros_like(train_loss)\n", "    critical_loss=np.zeros_like(train_loss)\n", "    critical_accuracy=np.zeros_like(train_loss)\n", "\n", "    # do grid search\n", "    for i, neurons in enumerate(N_neurons):\n", "        for j, lr in enumerate(learning_rates):\n", "\n", "            print(\"training DNN with %4d neurons and SGD lr=%0.6f.\" %(neurons,lr) )\n", "\n", "            train_loss[i,j],train_accuracy[i,j],\\\n", "            test_loss[i,j],test_accuracy[i,j],\\\n", "            critical_loss[i,j],critical_accuracy[i,j] = evaluate_model(neurons,lr,Ising_Data,verbose)\n", "\n", "\n", "    plot_data(learning_rates,N_neurons,train_accuracy, 'training')\n", "    plot_data(learning_rates,N_neurons,test_accuracy, 'testing')\n", "    plot_data(learning_rates,N_neurons,critical_accuracy, 'critical')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To visualize the data, we use the function `plot_data`, defined below." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "%matplotlib notebook\n", "import matplotlib.pyplot as plt\n", "\n", "def plot_data(x,y,data,title=None):\n", "\n", "    # plot results\n", "    fontsize=16\n", "\n", "\n", "    fig = plt.figure()\n", "    ax = fig.add_subplot(111)\n", "    cax = ax.matshow(data, interpolation='nearest', vmin=0, vmax=1)\n", "    \n", "    cbar=fig.colorbar(cax)\n", "    cbar.ax.set_ylabel('accuracy (%)',rotation=90,fontsize=fontsize)\n", "    cbar.set_ticks([0,.2,.4,0.6,0.8,1.0])\n", "    cbar.set_ticklabels(['0%','20%','40%','60%','80%','100%'])\n", "\n", "    # put text on matrix elements\n", "    for i, x_val in enumerate(np.arange(len(x))):\n", "        for j, y_val in enumerate(np.arange(len(y))):\n", "            c = \"${0:.1f}\\\\%$\".format( 100*data[j,i]) \n", "            ax.text(x_val, y_val, c, va='center', ha='center')\n", "\n", "    # convert axis values to string labels\n", "    x=[str(i) for i in x]\n", "    y=[str(i) for i in y]\n", "\n", "\n", "    ax.set_xticklabels(['']+x)\n", "    ax.set_yticklabels(['']+y)\n", "\n", "    ax.set_xlabel('$\\\\mathrm{learning\\\\ rate}$',fontsize=fontsize)\n", "    ax.set_ylabel('$\\\\mathrm{hidden\\\\ neurons}$',fontsize=fontsize)\n", "    if title is not None:\n", "        ax.set_title(title)\n", "\n", "    plt.tight_layout()\n", "\n", "    plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run Code\n", "\n", "As we mentioned at the beginning of the notebook, all functions and classes discussed above only specify the procedure but do not actually perform any computations. This allows us to re-use them for different problems. \n", "\n", "Actually running the training and testing for every point in the grid search is done below." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Finished loading data\n", "training DNN with 1 neurons and SGD lr=0.000001.\n", "training DNN with 1 neurons and SGD lr=0.000010.\n", "training DNN with 1 neurons and SGD lr=0.000100.\n", "training DNN with 1 neurons and SGD lr=0.001000.\n", "training DNN with 1 neurons and SGD lr=0.010000.\n", "training DNN with 1 neurons and SGD lr=0.100000.\n", "training DNN with 10 neurons and SGD lr=0.000001.\n", "training DNN with 10 neurons and SGD lr=0.000010.\n", "training DNN with 10 neurons and SGD lr=0.000100.\n", "training DNN with 10 neurons and SGD lr=0.001000.\n", "training DNN with 10 neurons and SGD lr=0.010000.\n", "training DNN with 10 neurons and SGD lr=0.100000.\n", "training DNN with 100 neurons and SGD lr=0.000001.\n", "training DNN with 100 neurons and SGD lr=0.000010.\n", "training DNN with 100 neurons and SGD lr=0.000100.\n", "training DNN with 100 neurons and SGD lr=0.001000.\n", "training DNN with 100 neurons and SGD lr=0.010000.\n", "training DNN with 100 neurons and SGD lr=0.100000.\n", "training DNN with 1000 neurons and SGD lr=0.000001.\n", "training DNN with 1000 neurons and SGD lr=0.000010.\n", "training DNN with 1000 neurons and SGD lr=0.000100.\n", "training DNN with 1000 neurons and SGD lr=0.001000.\n", "training DNN with 1000 neurons and SGD lr=0.010000.\n", "training DNN with 1000 neurons and SGD lr=0.100000.\n" ] }, { "data": { "application/javascript": [ "/* Put everything inside the global mpl namespace */\n", "window.mpl = {};\n", "\n", "\n", "mpl.get_websocket_type = function() {\n", " if (typeof(WebSocket) !== 'undefined') {\n", " return WebSocket;\n", " } else if (typeof(MozWebSocket) !== 'undefined') {\n", " return MozWebSocket;\n", " } else {\n", " alert('Your browser does not have WebSocket support.' +\n", " 'Please try Chrome, Safari or Firefox ≥ 6. 
' +\n", " 'Firefox 4 and 5 are also supported but you ' +\n", " 'have to enable WebSockets in about:config.');\n", " };\n", "}\n", "\n", "mpl.figure = function(figure_id, websocket, ondownload, parent_element) {\n", " this.id = figure_id;\n", "\n", " this.ws = websocket;\n", "\n", " this.supports_binary = (this.ws.binaryType != undefined);\n", "\n", " if (!this.supports_binary) {\n", " var warnings = document.getElementById(\"mpl-warnings\");\n", " if (warnings) {\n", " warnings.style.display = 'block';\n", " warnings.textContent = (\n", " \"This browser does not support binary websocket messages. \" +\n", " \"Performance may be slow.\");\n", " }\n", " }\n", "\n", " this.imageObj = new Image();\n", "\n", " this.context = undefined;\n", " this.message = undefined;\n", " this.canvas = undefined;\n", " this.rubberband_canvas = undefined;\n", " this.rubberband_context = undefined;\n", " this.format_dropdown = undefined;\n", "\n", " this.image_mode = 'full';\n", "\n", " this.root = $('
');\n", " this._root_extra_style(this.root)\n", " this.root.attr('style', 'display: inline-block');\n", "\n", " $(parent_element).append(this.root);\n", "\n", " this._init_header(this);\n", " this._init_canvas(this);\n", " this._init_toolbar(this);\n", "\n", " var fig = this;\n", "\n", " this.waiting = false;\n", "\n", " this.ws.onopen = function () {\n", " fig.send_message(\"supports_binary\", {value: fig.supports_binary});\n", " fig.send_message(\"send_image_mode\", {});\n", " if (mpl.ratio != 1) {\n", " fig.send_message(\"set_dpi_ratio\", {'dpi_ratio': mpl.ratio});\n", " }\n", " fig.send_message(\"refresh\", {});\n", " }\n", "\n", " this.imageObj.onload = function() {\n", " if (fig.image_mode == 'full') {\n", " // Full images could contain transparency (where diff images\n", " // almost always do), so we need to clear the canvas so that\n", " // there is no ghosting.\n", " fig.context.clearRect(0, 0, fig.canvas.width, fig.canvas.height);\n", " }\n", " fig.context.drawImage(fig.imageObj, 0, 0);\n", " };\n", "\n", " this.imageObj.onunload = function() {\n", " fig.ws.close();\n", " }\n", "\n", " this.ws.onmessage = this._make_on_message_function(this);\n", "\n", " this.ondownload = ondownload;\n", "}\n", "\n", "mpl.figure.prototype._init_header = function() {\n", " var titlebar = $(\n", " '
');\n", " var titletext = $(\n", " '
');\n", " titlebar.append(titletext)\n", " this.root.append(titlebar);\n", " this.header = titletext[0];\n", "}\n", "\n", "\n", "\n", "mpl.figure.prototype._canvas_extra_style = function(canvas_div) {\n", "\n", "}\n", "\n", "\n", "mpl.figure.prototype._root_extra_style = function(canvas_div) {\n", "\n", "}\n", "\n", "mpl.figure.prototype._init_canvas = function() {\n", " var fig = this;\n", "\n", " var canvas_div = $('
');\n", "\n", " canvas_div.attr('style', 'position: relative; clear: both; outline: 0');\n", "\n", " function canvas_keyboard_event(event) {\n", " return fig.key_event(event, event['data']);\n", " }\n", "\n", " canvas_div.keydown('key_press', canvas_keyboard_event);\n", " canvas_div.keyup('key_release', canvas_keyboard_event);\n", " this.canvas_div = canvas_div\n", " this._canvas_extra_style(canvas_div)\n", " this.root.append(canvas_div);\n", "\n", " var canvas = $('');\n", " canvas.addClass('mpl-canvas');\n", " canvas.attr('style', \"left: 0; top: 0; z-index: 0; outline: 0\")\n", "\n", " this.canvas = canvas[0];\n", " this.context = canvas[0].getContext(\"2d\");\n", "\n", " var backingStore = this.context.backingStorePixelRatio ||\n", "\tthis.context.webkitBackingStorePixelRatio ||\n", "\tthis.context.mozBackingStorePixelRatio ||\n", "\tthis.context.msBackingStorePixelRatio ||\n", "\tthis.context.oBackingStorePixelRatio ||\n", "\tthis.context.backingStorePixelRatio || 1;\n", "\n", " mpl.ratio = (window.devicePixelRatio || 1) / backingStore;\n", "\n", " var rubberband = $('');\n", " rubberband.attr('style', \"position: absolute; left: 0; top: 0; z-index: 1;\")\n", "\n", " var pass_mouse_events = true;\n", "\n", " canvas_div.resizable({\n", " start: function(event, ui) {\n", " pass_mouse_events = false;\n", " },\n", " resize: function(event, ui) {\n", " fig.request_resize(ui.size.width, ui.size.height);\n", " },\n", " stop: function(event, ui) {\n", " pass_mouse_events = true;\n", " fig.request_resize(ui.size.width, ui.size.height);\n", " },\n", " });\n", "\n", " function mouse_event_fn(event) {\n", " if (pass_mouse_events)\n", " return fig.mouse_event(event, event['data']);\n", " }\n", "\n", " rubberband.mousedown('button_press', mouse_event_fn);\n", " rubberband.mouseup('button_release', mouse_event_fn);\n", " // Throttle sequential mouse events to 1 every 20ms.\n", " rubberband.mousemove('motion_notify', mouse_event_fn);\n", "\n", " 
rubberband.mouseenter('figure_enter', mouse_event_fn);\n", " rubberband.mouseleave('figure_leave', mouse_event_fn);\n", "\n", " canvas_div.on(\"wheel\", function (event) {\n", " event = event.originalEvent;\n", " event['data'] = 'scroll'\n", " if (event.deltaY < 0) {\n", " event.step = 1;\n", " } else {\n", " event.step = -1;\n", " }\n", " mouse_event_fn(event);\n", " });\n", "\n", " canvas_div.append(canvas);\n", " canvas_div.append(rubberband);\n", "\n", " this.rubberband = rubberband;\n", " this.rubberband_canvas = rubberband[0];\n", " this.rubberband_context = rubberband[0].getContext(\"2d\");\n", " this.rubberband_context.strokeStyle = \"#000000\";\n", "\n", " this._resize_canvas = function(width, height) {\n", " // Keep the size of the canvas, canvas container, and rubber band\n", " // canvas in synch.\n", " canvas_div.css('width', width)\n", " canvas_div.css('height', height)\n", "\n", " canvas.attr('width', width * mpl.ratio);\n", " canvas.attr('height', height * mpl.ratio);\n", " canvas.attr('style', 'width: ' + width + 'px; height: ' + height + 'px;');\n", "\n", " rubberband.attr('width', width);\n", " rubberband.attr('height', height);\n", " }\n", "\n", " // Set the figure to an initial 600x600px, this will subsequently be updated\n", " // upon first draw.\n", " this._resize_canvas(600, 600);\n", "\n", " // Disable right mouse context menu.\n", " $(this.rubberband_canvas).bind(\"contextmenu\",function(e){\n", " return false;\n", " });\n", "\n", " function set_focus () {\n", " canvas.focus();\n", " canvas_div.focus();\n", " }\n", "\n", " window.setTimeout(set_focus, 100);\n", "}\n", "\n", "mpl.figure.prototype._init_toolbar = function() {\n", " var fig = this;\n", "\n", " var nav_element = $('
')\n", " nav_element.attr('style', 'width: 100%');\n", " this.root.append(nav_element);\n", "\n", " // Define a callback function for later on.\n", " function toolbar_event(event) {\n", " return fig.toolbar_button_onclick(event['data']);\n", " }\n", " function toolbar_mouse_event(event) {\n", " return fig.toolbar_button_onmouseover(event['data']);\n", " }\n", "\n", " for(var toolbar_ind in mpl.toolbar_items) {\n", " var name = mpl.toolbar_items[toolbar_ind][0];\n", " var tooltip = mpl.toolbar_items[toolbar_ind][1];\n", " var image = mpl.toolbar_items[toolbar_ind][2];\n", " var method_name = mpl.toolbar_items[toolbar_ind][3];\n", "\n", " if (!name) {\n", " // put a spacer in here.\n", " continue;\n", " }\n", " var button = $('');\n", " button.click(method_name, toolbar_event);\n", " button.mouseover(tooltip, toolbar_mouse_event);\n", " nav_element.append(button);\n", " }\n", "\n", " // Add the status bar.\n", " var status_bar = $('');\n", " nav_element.append(status_bar);\n", " this.message = status_bar[0];\n", "\n", " // Add the close button to the window.\n", " var buttongrp = $('
');\n", " var button = $('');\n", " button.click(function (evt) { fig.handle_close(fig, {}); } );\n", " button.mouseover('Stop Interaction', toolbar_mouse_event);\n", " buttongrp.append(button);\n", " var titlebar = this.root.find($('.ui-dialog-titlebar'));\n", " titlebar.prepend(buttongrp);\n", "}\n", "\n", "mpl.figure.prototype._root_extra_style = function(el){\n", " var fig = this\n", " el.on(\"remove\", function(){\n", "\tfig.close_ws(fig, {});\n", " });\n", "}\n", "\n", "mpl.figure.prototype._canvas_extra_style = function(el){\n", " // this is important to make the div 'focusable\n", " el.attr('tabindex', 0)\n", " // reach out to IPython and tell the keyboard manager to turn it's self\n", " // off when our div gets focus\n", "\n", " // location in version 3\n", " if (IPython.notebook.keyboard_manager) {\n", " IPython.notebook.keyboard_manager.register_events(el);\n", " }\n", " else {\n", " // location in version 2\n", " IPython.keyboard_manager.register_events(el);\n", " }\n", "\n", "}\n", "\n", "mpl.figure.prototype._key_event_extra = function(event, name) {\n", " var manager = IPython.notebook.keyboard_manager;\n", " if (!manager)\n", " manager = IPython.keyboard_manager;\n", "\n", " // Check for shift+enter\n", " if (event.shiftKey && event.which == 13) {\n", " this.canvas_div.blur();\n", " event.shiftKey = false;\n", " // Send a \"J\" for go to next cell\n", " event.which = 74;\n", " event.keyCode = 74;\n", " manager.command_mode();\n", " manager.handle_keydown(event);\n", " }\n", "}\n", "\n", "mpl.figure.prototype.handle_save = function(fig, msg) {\n", " fig.ondownload(fig, null);\n", "}\n", "\n", "\n", "mpl.find_output_cell = function(html_output) {\n", " // Return the cell and output element which can be found *uniquely* in the notebook.\n", " // Note - this is a bit hacky, but it is done because the \"notebook_saving.Notebook\"\n", " // IPython event is triggered only after the cells have been serialised, which for\n", " // our purposes (turning an 
active figure into a static one), is too late.\n", " var cells = IPython.notebook.get_cells();\n", " var ncells = cells.length;\n", " for (var i=0; i<ncells; i++) {\n", " var cell = cells[i];\n", " if (cell.cell_type === 'code'){\n", " for (var j=0; j<cell.output_area.outputs.length; j++) {\n", " var data = cell.output_area.outputs[j];\n", " if (data.data) {\n", " // IPython >= 3 moved mimebundle to data attribute of output\n", " data = data.data;\n", " }\n", " if (data['text/html'] == html_output) {\n", " return [cell, data, j];\n", " }\n", " }\n", " }\n", " }\n", "}\n", "\n", "// Register the function which deals with the matplotlib target/channel.\n", "// The kernel may be null if the page has been refreshed.\n", "if (IPython.notebook.kernel != null) {\n", " IPython.notebook.kernel.comm_manager.register_target('matplotlib', mpl.mpl_figure_comm);\n", "}\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "/* Put everything inside the global mpl namespace */\n", "window.mpl = {};\n", "\n", "\n", "mpl.get_websocket_type = function() {\n", " if (typeof(WebSocket) !== 'undefined') {\n", " return WebSocket;\n", " } else if (typeof(MozWebSocket) !== 'undefined') {\n", " return MozWebSocket;\n", " } else {\n", " alert('Your browser does not have WebSocket support.' +\n", " 'Please try Chrome, Safari or Firefox ≥ 6. ' +\n", " 'Firefox 4 and 5 are also supported but you ' +\n", " 'have to enable WebSockets in about:config.');\n", " };\n", "}\n", "\n", "mpl.figure = function(figure_id, websocket, ondownload, parent_element) {\n", " this.id = figure_id;\n", "\n", " this.ws = websocket;\n", "\n", " this.supports_binary = (this.ws.binaryType != undefined);\n", "\n", " if (!this.supports_binary) {\n", " var warnings = document.getElementById(\"mpl-warnings\");\n", " if (warnings) {\n", " warnings.style.display = 'block';\n", " warnings.textContent = (\n", " \"This browser does not support binary websocket messages. 
\" +\n", " \"Performance may be slow.\");\n", " }\n", " }\n", "\n", " this.imageObj = new Image();\n", "\n", " this.context = undefined;\n", " this.message = undefined;\n", " this.canvas = undefined;\n", " this.rubberband_canvas = undefined;\n", " this.rubberband_context = undefined;\n", " this.format_dropdown = undefined;\n", "\n", " this.image_mode = 'full';\n", "\n", " this.root = $('
');\n", " this._root_extra_style(this.root)\n", " this.root.attr('style', 'display: inline-block');\n", "\n", " $(parent_element).append(this.root);\n", "\n", " this._init_header(this);\n", " this._init_canvas(this);\n", " this._init_toolbar(this);\n", "\n", " var fig = this;\n", "\n", " this.waiting = false;\n", "\n", " this.ws.onopen = function () {\n", " fig.send_message(\"supports_binary\", {value: fig.supports_binary});\n", " fig.send_message(\"send_image_mode\", {});\n", " if (mpl.ratio != 1) {\n", " fig.send_message(\"set_dpi_ratio\", {'dpi_ratio': mpl.ratio});\n", " }\n", " fig.send_message(\"refresh\", {});\n", " }\n", "\n", " this.imageObj.onload = function() {\n", " if (fig.image_mode == 'full') {\n", " // Full images could contain transparency (where diff images\n", " // almost always do), so we need to clear the canvas so that\n", " // there is no ghosting.\n", " fig.context.clearRect(0, 0, fig.canvas.width, fig.canvas.height);\n", " }\n", " fig.context.drawImage(fig.imageObj, 0, 0);\n", " };\n", "\n", " this.imageObj.onunload = function() {\n", " fig.ws.close();\n", " }\n", "\n", " this.ws.onmessage = this._make_on_message_function(this);\n", "\n", " this.ondownload = ondownload;\n", "}\n", "\n", "mpl.figure.prototype._init_header = function() {\n", " var titlebar = $(\n", " '
');\n", " var titletext = $(\n", " '
');\n", " titlebar.append(titletext)\n", " this.root.append(titlebar);\n", " this.header = titletext[0];\n", "}\n", "\n", "\n", "\n", "mpl.figure.prototype._canvas_extra_style = function(canvas_div) {\n", "\n", "}\n", "\n", "\n", "mpl.figure.prototype._root_extra_style = function(canvas_div) {\n", "\n", "}\n", "\n", "mpl.figure.prototype._init_canvas = function() {\n", " var fig = this;\n", "\n", " var canvas_div = $('
');\n", "\n", " canvas_div.attr('style', 'position: relative; clear: both; outline: 0');\n", "\n", " function canvas_keyboard_event(event) {\n", " return fig.key_event(event, event['data']);\n", " }\n", "\n", " canvas_div.keydown('key_press', canvas_keyboard_event);\n", " canvas_div.keyup('key_release', canvas_keyboard_event);\n", " this.canvas_div = canvas_div\n", " this._canvas_extra_style(canvas_div)\n", " this.root.append(canvas_div);\n", "\n", " var canvas = $('');\n", " canvas.addClass('mpl-canvas');\n", " canvas.attr('style', \"left: 0; top: 0; z-index: 0; outline: 0\")\n", "\n", " this.canvas = canvas[0];\n", " this.context = canvas[0].getContext(\"2d\");\n", "\n", " var backingStore = this.context.backingStorePixelRatio ||\n", "\tthis.context.webkitBackingStorePixelRatio ||\n", "\tthis.context.mozBackingStorePixelRatio ||\n", "\tthis.context.msBackingStorePixelRatio ||\n", "\tthis.context.oBackingStorePixelRatio ||\n", "\tthis.context.backingStorePixelRatio || 1;\n", "\n", " mpl.ratio = (window.devicePixelRatio || 1) / backingStore;\n", "\n", " var rubberband = $('');\n", " rubberband.attr('style', \"position: absolute; left: 0; top: 0; z-index: 1;\")\n", "\n", " var pass_mouse_events = true;\n", "\n", " canvas_div.resizable({\n", " start: function(event, ui) {\n", " pass_mouse_events = false;\n", " },\n", " resize: function(event, ui) {\n", " fig.request_resize(ui.size.width, ui.size.height);\n", " },\n", " stop: function(event, ui) {\n", " pass_mouse_events = true;\n", " fig.request_resize(ui.size.width, ui.size.height);\n", " },\n", " });\n", "\n", " function mouse_event_fn(event) {\n", " if (pass_mouse_events)\n", " return fig.mouse_event(event, event['data']);\n", " }\n", "\n", " rubberband.mousedown('button_press', mouse_event_fn);\n", " rubberband.mouseup('button_release', mouse_event_fn);\n", " // Throttle sequential mouse events to 1 every 20ms.\n", " rubberband.mousemove('motion_notify', mouse_event_fn);\n", "\n", " 
rubberband.mouseenter('figure_enter', mouse_event_fn);\n", " rubberband.mouseleave('figure_leave', mouse_event_fn);\n", "\n", " canvas_div.on(\"wheel\", function (event) {\n", " event = event.originalEvent;\n", " event['data'] = 'scroll'\n", " if (event.deltaY < 0) {\n", " event.step = 1;\n", " } else {\n", " event.step = -1;\n", " }\n", " mouse_event_fn(event);\n", " });\n", "\n", " canvas_div.append(canvas);\n", " canvas_div.append(rubberband);\n", "\n", " this.rubberband = rubberband;\n", " this.rubberband_canvas = rubberband[0];\n", " this.rubberband_context = rubberband[0].getContext(\"2d\");\n", " this.rubberband_context.strokeStyle = \"#000000\";\n", "\n", " this._resize_canvas = function(width, height) {\n", " // Keep the size of the canvas, canvas container, and rubber band\n", " // canvas in synch.\n", " canvas_div.css('width', width)\n", " canvas_div.css('height', height)\n", "\n", " canvas.attr('width', width * mpl.ratio);\n", " canvas.attr('height', height * mpl.ratio);\n", " canvas.attr('style', 'width: ' + width + 'px; height: ' + height + 'px;');\n", "\n", " rubberband.attr('width', width);\n", " rubberband.attr('height', height);\n", " }\n", "\n", " // Set the figure to an initial 600x600px, this will subsequently be updated\n", " // upon first draw.\n", " this._resize_canvas(600, 600);\n", "\n", " // Disable right mouse context menu.\n", " $(this.rubberband_canvas).bind(\"contextmenu\",function(e){\n", " return false;\n", " });\n", "\n", " function set_focus () {\n", " canvas.focus();\n", " canvas_div.focus();\n", " }\n", "\n", " window.setTimeout(set_focus, 100);\n", "}\n", "\n", "mpl.figure.prototype._init_toolbar = function() {\n", " var fig = this;\n", "\n", " var nav_element = $('
')\n", " nav_element.attr('style', 'width: 100%');\n", " this.root.append(nav_element);\n", "\n", " // Define a callback function for later on.\n", " function toolbar_event(event) {\n", " return fig.toolbar_button_onclick(event['data']);\n", " }\n", " function toolbar_mouse_event(event) {\n", " return fig.toolbar_button_onmouseover(event['data']);\n", " }\n", "\n", " for(var toolbar_ind in mpl.toolbar_items) {\n", " var name = mpl.toolbar_items[toolbar_ind][0];\n", " var tooltip = mpl.toolbar_items[toolbar_ind][1];\n", " var image = mpl.toolbar_items[toolbar_ind][2];\n", " var method_name = mpl.toolbar_items[toolbar_ind][3];\n", "\n", " if (!name) {\n", " // put a spacer in here.\n", " continue;\n", " }\n", " var button = $('