Notebook 18: Ising Configurations using Deep Boltzmann Machines

Learning Goal

The goal of this notebook is to teach readers how to generate examples using Deep Boltzmann Machines in the Paysage package. The reader should understand why generating new examples is much tougher than classifying, as well as become more acquainted with pre-training using DBMs.

Overview

The goal of this notebook is to show how one can employ Generative Models to learn a variational approximation to the probability distribution used to draw thermal spin configurations in the 2D Ising model.

The Hamiltonian for the classical Ising model is given by

$$ H = -J\sum_{\langle ij\rangle}S_{i}S_j,\qquad \qquad S_j\in\{\pm 1\} $$

where the lattice site indices $i,j$ run over all nearest neighbors of a $40\times 40$ 2D square lattice, and $J$ is some arbitrary interaction energy scale. We adopt periodic boundary conditions. Onsager proved that this model undergoes a phase transition in the thermodynamic limit from an ordered ferromagnet with all spins aligned to a disordered phase at the critical temperature $T_c/J=2/\log(1+\sqrt{2})\approx 2.26$. For any finite system size, this critical point is expanded to a critical region around $T_c$.
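As a minimal sketch of the Hamiltonian above (not part of the original notebook), the energy of a configuration with periodic boundary conditions can be computed with NumPy. The function name `ising_energy` and the choice $J=1$, $k_B=1$ are assumptions made for illustration.

```python
import numpy as np

def ising_energy(spins, J=1.0):
    """Energy of a 2D Ising configuration with periodic boundaries.

    spins: 2D array with entries +1 or -1.
    Each nearest-neighbor bond is counted once (right and down neighbors).
    """
    right = np.roll(spins, -1, axis=1)  # right neighbor, wrapping around
    down = np.roll(spins, -1, axis=0)   # lower neighbor, wrapping around
    return -J * np.sum(spins * (right + down))

# Onsager's critical temperature in units of J (with k_B = 1)
T_c = 2.0 / np.log(1.0 + np.sqrt(2.0))

L = 40
all_up = np.ones((L, L), dtype=int)
# each of the 2*L*L bonds contributes -J in the fully aligned state
ground_state_energy = ising_energy(all_up)
```

Using `np.roll` for the neighbor shifts implements the periodic boundary conditions without explicit index arithmetic.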

In previous notebooks, we used our knowledge of the critical point at $T_c/J\approx 2.26$ to label the spin configurations and study the problem of classifying the states according to their phase of matter. However, in more complicated models, where the precise position of $T_c$ is not known, one cannot label the states with such accuracy, if at all.

As we explained in Secs. XV and XVI of the review, generative models can be used to learn a variational approximation for the probability distribution that generated the data points. By using only the 2D spin configurations, we now want to train a deep Bernoulli Boltzmann machine, the fantasy particles of which are thermal Ising configurations.

Unlike in previous studies of the Ising dataset, here we perform the analysis at a fixed temperature $T$. We train a separate model at each of three temperatures, $T=1.75,2.25,2.75$, which lie in the ordered, critical and disordered regions, respectively.

Setting up Paysage

In this notebook, we use an open-source python package for energy-based models, called paysage. Paysage requires python>3.5; we recommend using the package with an Anaconda environment.

To install paysage:

  • clone or download the github repo
  • activate an Anaconda3 environment
  • navigate to the directory which contains the paysage files
  • and execute
    pip install .

Documentation for paysage is available at https://github.com/drckf/paysage/tree/master/docs.

Paysage on GPUs

By default, computations in paysage are performed using numpy/numexpr/numba on the CPU. Since the computation below on the Ising dataset is more intensive than for MNIST, we want to make use of GPU acceleration.

Not all laptops have GPUs available for computation, but large computing facilities, such as supercomputing clusters, do. If you do not have access to a GPU, you can still run the code below with the parameters used to generate the figures in Sec. XVI F, but expect longer waiting times. As discussed in the main text, energy-based models rely on Monte-Carlo inspired methods rather than backpropagation. For this reason, they tend to be more computationally expensive than models trained with backpropagation.

To make use of GPU power, you need to install PyTorch, and switch to the pytorch backend by changing the setting in paysage/backends/config.json to pytorch.
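The setting can be changed by hand in a text editor. Alternatively, a small helper along the following lines can rewrite one field of a JSON file. This is only a sketch: the helper `set_json_field` is hypothetical, and the key name `"backend"` in the commented usage line is an assumption, so inspect `paysage/backends/config.json` to confirm the actual field name before running it.

```python
import json

def set_json_field(path, key, value):
    """Load a JSON file, set one top-level field, and write the file back."""
    with open(path, "r") as f:
        config = json.load(f)
    config[key] = value
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return config

# Hypothetical usage; the key name "backend" is an assumption, so check
# the contents of paysage/backends/config.json first:
# set_json_field("paysage/backends/config.json", "backend", "pytorch")
```

The change presumably takes effect the next time paysage is imported, so restart the notebook kernel afterwards.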

Let us set up the required packages for this notebook by importing the relevant paysage modules.

Loading the Required Packages

In [1]:
import os
import pickle
import numpy as np

# for plotting
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.gridspec as gs
import matplotlib.cm as cm
import seaborn as sns
import ml_style as style
import matplotlib as mpl

mpl.rcParams.update(style.style)

# for Boltzmann machines
from paysage import preprocess as pre
from paysage.layers import BernoulliLayer, GaussianLayer
from paysage.models import BoltzmannMachine
from paysage import batch
from paysage import fit
from paysage import optimizers
from paysage import samplers
from paysage import backends as be
from paysage import schedules
from paysage import penalties as pen

# fix random seed to ensure deterministic behavior
be.set_seed(137)
Running paysage with the python backend on the cpu

Loading the Data

To load the Ising data set, we use two functions: unpack_data() loads the Ising configurations from a pickled file, and casts them in a suitable shape.

The function Load_Ising_Dataset() loads the data for three fixed temperatures (see file names) out of the temperature set T, corresponding to the ordered, critical and disordered phases. These three datasets are then shuffled and returned in the form of a dictionary.

In [2]:
def unpack_data(path_to_data, data_name):
    """
    Get the data from a pickled file.

    Args:
        path_to_data (str)
        data_name (str)

    Returns:
        numpy.ndarray

    """
    # this file contains 16*10000 samples taken in T=np.arange(0.25,4.0001,0.25)
    # pickle reads the file and returns the Python object (1D array, compressed bits)
    with open(os.path.join(path_to_data, data_name), 'rb') as infile:
        data = pickle.load(infile)
    # Decompress array and reshape for convenience
    data = np.unpackbits(data).reshape(-1, 1600).astype('int')
    return data


def Load_Ising_Dataset():
    """
    Loads the Ising dataset.

    Args:
        None

    Returns:
        dict[numpy.ndarray]

    """
    L=40 # linear system size
    T=np.linspace(0.25,4.0,16) # temperatures
    T_c=2.26 # critical temperature in the TD limit

    # path to data directory
    path_to_data = 'IsingData'
    #path_to_data=os.path.expanduser('~')+'/Dropbox/MachineLearningReview/Datasets/isingMC/'

    # ordered states
    data_name_ordered = "Ising2DFM_reSample_L40_T=1.75.pkl"
    X_ordered = unpack_data(path_to_data,data_name_ordered)
    # critical states
    data_name_critical = "Ising2DFM_reSample_L40_T=2.25.pkl"
    X_critical = unpack_data(path_to_data,data_name_critical)
    # disordered states
    data_name_disordered = "Ising2DFM_reSample_L40_T=2.75.pkl"
    X_disordered = unpack_data(path_to_data,data_name_disordered)

    # shuffle data
    np.random.shuffle(X_ordered)
    np.random.shuffle(X_critical)
    np.random.shuffle(X_disordered)

    return {'ordered': X_ordered, 'critical': X_critical, 'disordered': X_disordered}
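
The decompression step in unpack_data() can be illustrated with a round trip on toy data. The sketch below is independent of the actual data files: the random configurations, the interpretation of bit 1 as spin up, and the magnetization check are assumptions added for illustration.

```python
import numpy as np

L = 40
rng = np.random.default_rng(0)

# two toy configurations of L*L binary variables (assumed: 0 = spin down, 1 = spin up)
configs = rng.integers(0, 2, size=(2, L * L), dtype=np.uint8)

# compress 8 binary variables into each byte, as in the pickled files
packed = np.packbits(configs)

# decompress and reshape exactly as unpack_data() does
unpacked = np.unpackbits(packed).reshape(-1, L * L).astype('int')

# map {0, 1} -> {-1, +1} and compute the magnetization per spin
spins = 2 * unpacked - 1
magnetization = spins.mean(axis=1)
```

Because $40\times 40 = 1600$ is divisible by 8, no padding bits appear and the reshape to rows of 1600 entries recovers the original configurations.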

Define Auxiliary Functions

To help set up the numerical experiment, we make use of the functions we defined in the notebook NB_CXVI_RBM_mnist. Since their functionality is explained in detail there and in Sec. XVI E of the review, we simply state the code here.

In [3]:
def ADAM_optimizer(initial, coefficient):
    """
    Convenience function to set up an ADAM optimizer.

    Args:
        initial (float): learning rate to start with
        coefficient (float): coefficient that determines the rate of
            learning rate decay (larger -> faster decay)

    Returns:
        ADAM

    """
    # define learning rate attenuation schedule
    learning_rate = schedules.PowerLawDecay(initial=initial, coefficient=coefficient)
    return optimizers.ADAM(stepsize=learning_rate)


def train_model(model, data, num_epochs, monte_carlo_steps):
    """
    Train a model.

    Args:
        model (BoltzmannMachine)
        data (Batch)
        num_epochs (int)
        monte_carlo_steps (int)

    Returns:
        None

    """
    is_deep = model.num_layers > 2
    model.initialize(data,method='glorot_normal')
    opt = ADAM_optimizer(initial,coefficient)
    if is_deep:
        print("layerwise pretraining")
        pretrainer=fit.LayerwisePretrain(model,data)
        pretrainer.train(opt, num_epochs, method=fit.pcd, mcsteps=monte_carlo_steps, init_method="glorot_normal")
        # reset the optimizer using a lower learning rate
        opt = ADAM_optimizer(initial/10.0, coefficient)
    print("use persistent contrastive divergence to fit the model")
    trainer=fit.SGD(model,data)
    trainer.train(opt,num_epochs,method=fit.pcd,mcsteps=monte_carlo_steps)


def compute_reconstructions(model, data):
    """
    Computes reconstructions of the input data.
    Input v -> h -> v' (one pass up one pass down)

    Args:
        model: a model
        data: a tensor of shape (num_samples, num_visible_units)

    Returns:
        tensor of shape (num_samples, num_visible_units)

    """
    recons = model.compute_reconstructions(data).get_visible()
    return be.to_numpy_array(recons)


def compute_fantasy_particles(model,num_fantasy,num_steps,mean_field=True):
    """
    Draws samples from the model using Gibbs sampling Markov Chain Monte Carlo.
    Starts from randomly initialized points.

    Args:
        model: a model
        num_fantasy (int): the number of fantasy particles to draw
        num_steps (int): the number of update steps
        mean_field (bool; optional): run a final mean field step to compute probabilities

    Returns:
        numpy.ndarray of shape (num_fantasy, num_visible_units)

    """
    schedule = schedules.Linear(initial=1.0, delta = 1 / (num_steps-1))
    fantasy = samplers.SequentialMC.generate_fantasy_state(model,
                                                           num_fantasy,
                                                           num_steps,
                                                           schedule=schedule,
                                                           beta_std=0.0,
                                                           beta_momentum=0.0)
    if mean_field:
        fantasy = model.mean_field_iteration(1, fantasy)
    fantasy_particles = fantasy.get_visible()        
    return be.to_numpy_array(fantasy_particles)


def plot_image_grid(image_array, shape, vmin=0, vmax=1, cmap=cm.gray_r,
                    row_titles=None, filename=None):
    """
    Plot a grid of images.

    Args:
        image_array (numpy.ndarray)
        shape (tuple)
        vmin (optional; float)
        vmax (optional; float)
        cmap (optional; colormap)
        row_titles (optional; List[str])
        filename (optional; str)

    Returns:
        None

    """
    array = be.to_numpy_array(image_array)
    nrows, ncols = array.shape[:-1]
    f = plt.figure(figsize=(2*ncols, 2*nrows))
    grid = gs.GridSpec(nrows, ncols)
    axes = [[plt.subplot(grid[i,j]) for j in range(ncols)] for i in range(nrows)]
    for i in range(nrows):
        for j in range(ncols):
            sns.heatmap(np.reshape(array[i][j], shape),
                ax=axes[i][j], cmap=cmap, cbar=False, vmin=vmin, vmax=vmax)
            axes[i][j].set(yticks=[])
            axes[i][j].set(xticks=[])

    if row_titles is not None:
        for i in range(nrows):
            axes[i][0].set_ylabel(row_titles[i], fontsize=36)

    plt.tight_layout()
    plt.show(f)
    if filename is not None:
        f.savefig(filename)
    plt.close(f)
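
To make the idea behind compute_fantasy_particles() more concrete, here is a minimal NumPy sketch of block Gibbs sampling for a single Bernoulli RBM. It is not Paysage's implementation: it has only one hidden layer, uses no annealing schedule, and the function `gibbs_sample_rbm`, the energy convention in its docstring, and the untrained toy weights are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample_rbm(W, a, b, num_samples, num_steps, rng):
    """Block Gibbs sampling for a Bernoulli RBM with energy
    E(v, h) = -a.v - b.h - v.W.h, starting from random visible states."""
    num_visible, num_hidden = W.shape
    v = rng.integers(0, 2, size=(num_samples, num_visible)).astype(float)
    for _ in range(num_steps):
        # sample all hidden units given the visible units
        p_h = sigmoid(v @ W + b)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # sample all visible units given the hidden units
        p_v = sigmoid(h @ W.T + a)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v

rng = np.random.default_rng(137)
W = 0.01 * rng.standard_normal((1600, 80))  # toy, untrained weights
a = np.zeros(1600)                          # visible biases
b = np.zeros(80)                            # hidden biases
fantasy = gibbs_sample_rbm(W, a, b, num_samples=8, num_steps=10, rng=rng)
```

With trained weights, the rows of `fantasy` would play the role of the fantasy particles plotted in the bottom row of the figures; with the random weights used here they are essentially noise.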

Building and Training the Deep Boltzmann Machine for the Ising model

To study this problem, we construct a deep Boltzmann machine with two hidden layers of $80$ and $8$ units, respectively. We apply L1 regularization to all weights of the model.

To train our DBM, we use ADAM-based Persistent Contrastive Divergence.

More detailed explanations of how to use Paysage to construct (deep) generative models can be found in notebook NB_CXVI_RBM_mnist and in Sec. XVI E of the review.
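
As a rough illustration of the L1 penalty on the weights, the sketch below evaluates $\lambda\sum_{ij}|W_{ij}|$ and its (sub)gradient $\lambda\,\mathrm{sign}(W_{ij})$ with NumPy. The helper names are hypothetical, and whether Paysage's own penalty uses exactly this normalization is an assumption.

```python
import numpy as np

def l1_penalty_value(W, lmbda):
    """L1 penalty: lmbda times the sum of absolute weight values."""
    return lmbda * np.sum(np.abs(W))

def l1_penalty_grad(W, lmbda):
    """Subgradient of the L1 penalty; zero is used where W == 0."""
    return lmbda * np.sign(W)

W = np.array([[0.5, -2.0],
              [0.0, 1.5]])
lmbda = 1e-6
value = l1_penalty_value(W, lmbda)  # lmbda * (0.5 + 2.0 + 0.0 + 1.5)
grad = l1_penalty_grad(W, lmbda)
```

Since the gradient has constant magnitude regardless of the size of a weight, the penalty pushes small weights towards exactly zero, which encourages sparse connections.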

In [4]:
image_shape = (40,40) # 40x40=1600 spins in every configuration
num_to_plot = 8 # number of data points to plot

# parameters the user needs to choose
batch_size = 100 # batch size
num_epochs = 10 # training epochs
monte_carlo_steps = 2 # number of MC sampling steps
initial = 1E-3 # initial learning rate
coefficient = 1.0 # controls learning rate decay
num_fantasy_steps = 100 # MC steps when drawing fantasy particles
lmbda = 1E-6 # strength of the L1 penalty
num_hidden_units = [80, 8] # hidden layer units

# load data
data =  Load_Ising_Dataset()

# preallocate data dicts
dbm_L1_reconstructions = {}
dbm_L1_fantasy = {}
true_examples = {}
dbm_models = {}

for phase in ['ordered','critical','disordered']:
    print('training in the {} phase'.format(phase))

    # set up an object to read minibatch of the data
    transform = pre.Transformation()
    batch_reader = batch.in_memory_batch(data[phase], batch_size, train_fraction=0.95, transform=transform)
    batch_reader.reset_generator(mode='train')

    ##### Bernoulli DBM
    dbm_L1 = BoltzmannMachine(
            [BernoulliLayer(batch_reader.ncols)] + \
            [BernoulliLayer(n) for n in num_hidden_units]
            )

    # add an L1 penalty to the weights
    for j_, conn in enumerate(dbm_L1.connections):
        conn.weights.add_penalty({'matrix': pen.l1_penalty(lmbda)})

    # train the model
    train_model(dbm_L1, batch_reader, num_epochs, monte_carlo_steps)

    # store model
    dbm_models[phase]=dbm_L1

    # reset the generator to the beginning of the validation set
    batch_reader.reset_generator(mode='validate')
    examples = batch_reader.get(mode='validate') # shape (batch_size, 1600)
    true_examples[phase] = examples[:num_to_plot]

    # compute reconstructions
    reconstructions = compute_reconstructions(dbm_L1, true_examples[phase])
    dbm_L1_reconstructions[phase] = reconstructions

    # compute fantasy particles
    fantasy_particles = compute_fantasy_particles(dbm_L1,
                                                  num_to_plot,
                                                  num_fantasy_steps,
                                                  mean_field=False)
    dbm_L1_fantasy[phase] = fantasy_particles

    # plot results and save fig
    reconstruction_plot = plot_image_grid(
            np.array([
                    true_examples[phase],
                    dbm_L1_reconstructions[phase],
                    dbm_L1_fantasy[phase]
                    ]),
            image_shape, vmin=0, vmax=1,
            row_titles=["data", "reconst", "fantasy"],
            filename='DBM_Ising-'+phase+'.png')

# save data
save_file_name='./DBM_ising_training_data-L=40.pkl'
# use a context manager so that the file is closed after writing
with open(save_file_name, "wb") as outfile:
    pickle.dump([dbm_models, true_examples, dbm_L1_fantasy, dbm_L1_reconstructions,
                 image_shape, num_to_plot, batch_size, num_epochs, monte_carlo_steps,
                 initial, coefficient, num_fantasy_steps, lmbda, num_hidden_units],
                outfile)
training in the ordered phase
layerwise pretraining
training model 0

Before training:
-ReconstructionError: 1.416018
-EnergyCoefficient: 0.312046
-HeatCapacity: 0.093238
-WeightSparsity: 0.332238
-WeightSquare: 1.901263
-KLDivergence: 1.418246
-ReverseKLDivergence: 0.021467

End of epoch 1: 
Time elapsed 2.392s
-ReconstructionError: 1.379553
-EnergyCoefficient: 0.307597
-HeatCapacity: 0.057619
-WeightSparsity: 0.332574
-WeightSquare: 1.937900
-KLDivergence: 1.417736
-ReverseKLDivergence: 0.009967

End of epoch 2: 
Time elapsed 2.006s
-ReconstructionError: 1.378414
-EnergyCoefficient: 0.307451
-HeatCapacity: 0.057168
-WeightSparsity: 0.332711
-WeightSquare: 1.927264
-KLDivergence: 1.417745
-ReverseKLDivergence: 0.010077

End of epoch 3: 
Time elapsed 2.223s
-ReconstructionError: 1.376378
-EnergyCoefficient: 0.307472
-HeatCapacity: 0.055935
-WeightSparsity: 0.332605
-WeightSquare: 1.923876
-KLDivergence: 1.417233
-ReverseKLDivergence: 0.011251

End of epoch 4: 
Time elapsed 2.024s
-ReconstructionError: 1.373554
-EnergyCoefficient: 0.307749
-HeatCapacity: 0.053438
-WeightSparsity: 0.332602
-WeightSquare: 1.923925
-KLDivergence: 1.417157
-ReverseKLDivergence: 0.011339

End of epoch 5: 
Time elapsed 2.067s
-ReconstructionError: 1.373257
-EnergyCoefficient: 0.307675
-HeatCapacity: 0.049568
-WeightSparsity: 0.332324
-WeightSquare: 1.924588
-KLDivergence: 1.416038
-ReverseKLDivergence: 0.012267

End of epoch 6: 
Time elapsed 2.271s
-ReconstructionError: 1.373893
-EnergyCoefficient: 0.307736
-HeatCapacity: 0.051387
-WeightSparsity: 0.331965
-WeightSquare: 1.926347
-KLDivergence: 1.417908
-ReverseKLDivergence: 0.014040

End of epoch 7: 
Time elapsed 2.151s
-ReconstructionError: 1.372088
-EnergyCoefficient: 0.307758
-HeatCapacity: 0.050337
-WeightSparsity: 0.331588
-WeightSquare: 1.930513
-KLDivergence: 1.418421
-ReverseKLDivergence: 0.013712

End of epoch 8: 
Time elapsed 1.969s
-ReconstructionError: 1.372142
-EnergyCoefficient: 0.307809
-HeatCapacity: 0.048689
-WeightSparsity: 0.331065
-WeightSquare: 1.936446
-KLDivergence: 1.417220
-ReverseKLDivergence: 0.013795

End of epoch 9: 
Time elapsed 1.957s
-ReconstructionError: 1.370387
-EnergyCoefficient: 0.307870
-HeatCapacity: 0.062417
-WeightSparsity: 0.330443
-WeightSquare: 1.944099
-KLDivergence: 1.417128
-ReverseKLDivergence: 0.014291

End of epoch 10: 
Time elapsed 1.941s
-ReconstructionError: 1.369075
-EnergyCoefficient: 0.307832
-HeatCapacity: 0.046418
-WeightSparsity: 0.329509
-WeightSquare: 1.953275
-KLDivergence: 1.417435
-ReverseKLDivergence: 0.014749

training model 1

Before training:
-ReconstructionError: 1.847496
-EnergyCoefficient: 0.507495
-HeatCapacity: 0.284584
-WeightSparsity: 0.334511
-WeightSquare: 1.689175
-KLDivergence: 6.589260
-ReverseKLDivergence: -0.200949

End of epoch 1: 
Time elapsed 0.526s
-ReconstructionError: 1.787965
-EnergyCoefficient: 0.476779
-HeatCapacity: 0.203586
-WeightSparsity: 0.362549
-WeightSquare: 1.765384
-KLDivergence: 6.572429
-ReverseKLDivergence: -0.221725

End of epoch 2: 
Time elapsed 0.534s
-ReconstructionError: 1.675233
-EnergyCoefficient: 0.449412
-HeatCapacity: 0.172845
-WeightSparsity: 0.385844
-WeightSquare: 2.007614
-KLDivergence: 6.541926
-ReverseKLDivergence: -0.214976

End of epoch 3: 
Time elapsed 0.524s
-ReconstructionError: 1.523965
-EnergyCoefficient: 0.420117
-HeatCapacity: 0.145816
-WeightSparsity: 0.418526
-WeightSquare: 2.423471
-KLDivergence: 6.474742
-ReverseKLDivergence: -0.185631

End of epoch 4: 
Time elapsed 0.527s
-ReconstructionError: 1.411560
-EnergyCoefficient: 0.400275
-HeatCapacity: 0.185296
-WeightSparsity: 0.439076
-WeightSquare: 2.874234
-KLDivergence: 6.401207
-ReverseKLDivergence: -0.165587

End of epoch 5: 
Time elapsed 0.519s
-ReconstructionError: 1.312838
-EnergyCoefficient: 0.374927
-HeatCapacity: 0.379550
-WeightSparsity: 0.454840
-WeightSquare: 3.360993
-KLDivergence: 6.293105
-ReverseKLDivergence: -0.151064

End of epoch 6: 
Time elapsed 0.52s
-ReconstructionError: 1.253359
-EnergyCoefficient: 0.366056
-HeatCapacity: 0.544755
-WeightSparsity: 0.469570
-WeightSquare: 3.819495
-KLDivergence: 6.209590
-ReverseKLDivergence: -0.185873

End of epoch 7: 
Time elapsed 0.521s
-ReconstructionError: 1.201659
-EnergyCoefficient: 0.350541
-HeatCapacity: 0.896285
-WeightSparsity: 0.481907
-WeightSquare: 4.346788
-KLDivergence: 6.114139
-ReverseKLDivergence: -0.190155

End of epoch 8: 
Time elapsed 0.526s
-ReconstructionError: 1.166295
-EnergyCoefficient: 0.360977
-HeatCapacity: 1.198606
-WeightSparsity: 0.494238
-WeightSquare: 4.820575
-KLDivergence: 6.042021
-ReverseKLDivergence: -0.212867

End of epoch 9: 
Time elapsed 0.531s
-ReconstructionError: 1.123683
-EnergyCoefficient: 0.358994
-HeatCapacity: 1.750758
-WeightSparsity: 0.505190
-WeightSquare: 5.354546
-KLDivergence: 5.997106
-ReverseKLDivergence: -0.221060

End of epoch 10: 
Time elapsed 0.531s
-ReconstructionError: 1.089719
-EnergyCoefficient: 0.388677
-HeatCapacity: 1.719922
-WeightSparsity: 0.516168
-WeightSquare: 5.806932
-KLDivergence: 5.977907
-ReverseKLDivergence: -0.232858

use persistent contrastive divergence to fit the model
Before training:
-ReconstructionError: 1.384621
-EnergyCoefficient: 0.307611
-HeatCapacity: 0.021748
-WeightSparsity: 0.329509
-WeightSquare: 1.953275
-KLDivergence: 1.417578
-ReverseKLDivergence: 0.013598

End of epoch 1: 
Time elapsed 1.627s
-ReconstructionError: 1.375602
-EnergyCoefficient: 0.307841
-HeatCapacity: 0.014072
-WeightSparsity: 0.330181
-WeightSquare: 2.012930
-KLDivergence: 1.418471
-ReverseKLDivergence: 0.015064

End of epoch 2: 
Time elapsed 1.611s
-ReconstructionError: 1.373938
-EnergyCoefficient: 0.307639
-HeatCapacity: 0.011750
-WeightSparsity: 0.330667
-WeightSquare: 2.051635
-KLDivergence: 1.418747
-ReverseKLDivergence: 0.014487

End of epoch 3: 
Time elapsed 1.612s
-ReconstructionError: 1.369542
-EnergyCoefficient: 0.307737
-HeatCapacity: 0.009947
-WeightSparsity: 0.330934
-WeightSquare: 2.075672
-KLDivergence: 1.418514
-ReverseKLDivergence: 0.014764

End of epoch 4: 
Time elapsed 1.605s
-ReconstructionError: 1.367383
-EnergyCoefficient: 0.307688
-HeatCapacity: 0.009424
-WeightSparsity: 0.331193
-WeightSquare: 2.096537
-KLDivergence: 1.418550
-ReverseKLDivergence: 0.014307

End of epoch 5: 
Time elapsed 1.601s
-ReconstructionError: 1.365041
-EnergyCoefficient: 0.307657
-HeatCapacity: 0.010809
-WeightSparsity: 0.331288
-WeightSquare: 2.110878
-KLDivergence: 1.418112
-ReverseKLDivergence: 0.014794

End of epoch 6: 
Time elapsed 1.839s
-ReconstructionError: 1.363319
-EnergyCoefficient: 0.307765
-HeatCapacity: 0.014756
-WeightSparsity: 0.331485
-WeightSquare: 2.128010
-KLDivergence: 1.418123
-ReverseKLDivergence: 0.014296

End of epoch 7: 
Time elapsed 1.789s
-ReconstructionError: 1.363269
-EnergyCoefficient: 0.307684
-HeatCapacity: 0.012619
-WeightSparsity: 0.331478
-WeightSquare: 2.133768
-KLDivergence: 1.418506
-ReverseKLDivergence: 0.015521

End of epoch 8: 
Time elapsed 1.764s
-ReconstructionError: 1.361441
-EnergyCoefficient: 0.307873
-HeatCapacity: 0.011803
-WeightSparsity: 0.331544
-WeightSquare: 2.145938
-KLDivergence: 1.418212
-ReverseKLDivergence: 0.015166

End of epoch 9: 
Time elapsed 1.827s
-ReconstructionError: 1.359982
-EnergyCoefficient: 0.307686
-HeatCapacity: 0.012059
-WeightSparsity: 0.331523
-WeightSquare: 2.151562
-KLDivergence: 1.419503
-ReverseKLDivergence: 0.015737

End of epoch 10: 
Time elapsed 1.838s
-ReconstructionError: 1.358325
-EnergyCoefficient: 0.307772
-HeatCapacity: 0.012598
-WeightSparsity: 0.331595
-WeightSquare: 2.161634
-KLDivergence: 1.418562
-ReverseKLDivergence: 0.015386

training in the critical phase
layerwise pretraining
training model 0

Before training:
-ReconstructionError: 1.415932
-EnergyCoefficient: 0.168717
-HeatCapacity: 0.086845
-WeightSparsity: 0.335200
-WeightSquare: 1.903544
-KLDivergence: 0.418750
-ReverseKLDivergence: 0.016154

End of epoch 1: 
Time elapsed 1.621s
-ReconstructionError: 1.406319
-EnergyCoefficient: 0.163837
-HeatCapacity: 0.086406
-WeightSparsity: 0.334989
-WeightSquare: 1.912629
-KLDivergence: 0.419178
-ReverseKLDivergence: 0.009721

End of epoch 2: 
Time elapsed 1.765s
-ReconstructionError: 1.406428
-EnergyCoefficient: 0.163416
-HeatCapacity: 0.095853
-WeightSparsity: 0.334077
-WeightSquare: 1.936154
-KLDivergence: 0.418361
-ReverseKLDivergence: 0.010860

End of epoch 3: 
Time elapsed 1.769s
-ReconstructionError: 1.404644
-EnergyCoefficient: 0.163221
-HeatCapacity: 0.089165
-WeightSparsity: 0.331146
-WeightSquare: 1.995651
-KLDivergence: 0.418375
-ReverseKLDivergence: 0.010820

End of epoch 4: 
Time elapsed 1.764s
-ReconstructionError: 1.403115
-EnergyCoefficient: 0.163061
-HeatCapacity: 0.103023
-WeightSparsity: 0.322988
-WeightSquare: 2.121297
-KLDivergence: 0.418421
-ReverseKLDivergence: 0.011584

End of epoch 5: 
Time elapsed 1.766s
-ReconstructionError: 1.401116
-EnergyCoefficient: 0.162706
-HeatCapacity: 0.090711
-WeightSparsity: 0.308076
-WeightSquare: 2.375455
-KLDivergence: 0.418526
-ReverseKLDivergence: 0.011441

End of epoch 6: 
Time elapsed 1.882s
-ReconstructionError: 1.396247
-EnergyCoefficient: 0.162316
-HeatCapacity: 0.097823
-WeightSparsity: 0.290417
-WeightSquare: 2.813981
-KLDivergence: 0.418140
-ReverseKLDivergence: 0.011047

End of epoch 7: 
Time elapsed 1.764s
-ReconstructionError: 1.391057
-EnergyCoefficient: 0.162318
-HeatCapacity: 0.156150
-WeightSparsity: 0.277064
-WeightSquare: 3.387026
-KLDivergence: 0.417023
-ReverseKLDivergence: 0.011452

End of epoch 8: 
Time elapsed 1.773s
-ReconstructionError: 1.386753
-EnergyCoefficient: 0.164018
-HeatCapacity: 0.096086
-WeightSparsity: 0.271732
-WeightSquare: 3.903247
-KLDivergence: 0.418223
-ReverseKLDivergence: 0.010849

End of epoch 9: 
Time elapsed 1.789s
-ReconstructionError: 1.383381
-EnergyCoefficient: 0.164432
-HeatCapacity: 0.104761
-WeightSparsity: 0.270478
-WeightSquare: 4.339828
-KLDivergence: 0.417762
-ReverseKLDivergence: 0.011049

End of epoch 10: 
Time elapsed 1.767s
-ReconstructionError: 1.382252
-EnergyCoefficient: 0.164712
-HeatCapacity: 0.238932
-WeightSparsity: 0.271573
-WeightSquare: 4.715862
-KLDivergence: 0.415905
-ReverseKLDivergence: 0.011317

training model 1

Before training:
-ReconstructionError: 1.794070
-EnergyCoefficient: 0.466996
-HeatCapacity: 0.234012
-WeightSparsity: 0.300811
-WeightSquare: 1.756950
-KLDivergence: 1.553442
-ReverseKLDivergence: -0.096339

End of epoch 1: 
Time elapsed 0.571s
-ReconstructionError: 1.727621
-EnergyCoefficient: 0.427378
-HeatCapacity: 0.155453
-WeightSparsity: 0.308928
-WeightSquare: 1.792456
-KLDivergence: 1.516493
-ReverseKLDivergence: -0.134326

End of epoch 2: 
Time elapsed 0.573s
-ReconstructionError: 1.638799
-EnergyCoefficient: 0.394371
-HeatCapacity: 0.117594
-WeightSparsity: 0.330185
-WeightSquare: 1.969587
-KLDivergence: 1.474517
-ReverseKLDivergence: -0.145711

End of epoch 3: 
Time elapsed 0.565s
-ReconstructionError: 1.548471
-EnergyCoefficient: 0.360888
-HeatCapacity: 0.114263
-WeightSparsity: 0.346174
-WeightSquare: 2.224576
-KLDivergence: 1.424795
-ReverseKLDivergence: -0.143040

End of epoch 4: 
Time elapsed 0.561s
-ReconstructionError: 1.484413
-EnergyCoefficient: 0.320946
-HeatCapacity: 0.119142
-WeightSparsity: 0.358682
-WeightSquare: 2.491204
-KLDivergence: 1.365482
-ReverseKLDivergence: -0.137105

End of epoch 5: 
Time elapsed 0.558s
-ReconstructionError: 1.434320
-EnergyCoefficient: 0.288636
-HeatCapacity: 0.225017
-WeightSparsity: 0.371516
-WeightSquare: 2.731192
-KLDivergence: 1.311713
-ReverseKLDivergence: -0.126853

End of epoch 6: 
Time elapsed 0.563s
-ReconstructionError: 1.384970
-EnergyCoefficient: 0.261125
-HeatCapacity: 0.318864
-WeightSparsity: 0.384397
-WeightSquare: 2.951550
-KLDivergence: 1.249885
-ReverseKLDivergence: -0.115450

End of epoch 7: 
Time elapsed 0.568s
-ReconstructionError: 1.357032
-EnergyCoefficient: 0.242608
-HeatCapacity: 0.443690
-WeightSparsity: 0.396455
-WeightSquare: 3.120169
-KLDivergence: 1.209868
-ReverseKLDivergence: -0.121128

End of epoch 8: 
Time elapsed 0.566s
-ReconstructionError: 1.335232
-EnergyCoefficient: 0.227526
-HeatCapacity: 0.603513
-WeightSparsity: 0.406631
-WeightSquare: 3.283245
-KLDivergence: 1.171637
-ReverseKLDivergence: -0.127378

End of epoch 9: 
Time elapsed 0.563s
-ReconstructionError: 1.310408
-EnergyCoefficient: 0.218963
-HeatCapacity: 0.752361
-WeightSparsity: 0.416695
-WeightSquare: 3.467413
-KLDivergence: 1.139205
-ReverseKLDivergence: -0.128760

End of epoch 10: 
Time elapsed 0.563s
-ReconstructionError: 1.302709
-EnergyCoefficient: 0.209014
-HeatCapacity: 0.922406
-WeightSparsity: 0.425585
-WeightSquare: 3.608723
-KLDivergence: 1.112624
-ReverseKLDivergence: -0.138681

use persistent contrastive divergence to fit the model
Before training:
-ReconstructionError: 1.387465
-EnergyCoefficient: 0.164625
-HeatCapacity: 0.071721
-WeightSparsity: 0.271573
-WeightSquare: 4.715862
-KLDivergence: 0.418873
-ReverseKLDivergence: 0.011377

End of epoch 1: 
Time elapsed 1.602s
-ReconstructionError: 1.387398
-EnergyCoefficient: 0.164290
-HeatCapacity: 0.066539
-WeightSparsity: 0.273881
-WeightSquare: 4.846795
-KLDivergence: 0.418016
-ReverseKLDivergence: 0.009853

End of epoch 2: 
Time elapsed 1.597s
-ReconstructionError: 1.389336
-EnergyCoefficient: 0.163973
-HeatCapacity: 0.061145
-WeightSparsity: 0.275646
-WeightSquare: 4.955679
-KLDivergence: 0.418730
-ReverseKLDivergence: 0.010060

End of epoch 3: 
Time elapsed 1.598s
-ReconstructionError: 1.386184
-EnergyCoefficient: 0.164008
-HeatCapacity: 0.060462
-WeightSparsity: 0.277244
-WeightSquare: 5.054227
-KLDivergence: 0.418704
-ReverseKLDivergence: 0.009643

End of epoch 4: 
Time elapsed 1.591s
-ReconstructionError: 1.385169
-EnergyCoefficient: 0.164049
-HeatCapacity: 0.062364
-WeightSparsity: 0.278765
-WeightSquare: 5.151093
-KLDivergence: 0.417621
-ReverseKLDivergence: 0.009391

End of epoch 5: 
Time elapsed 1.624s
-ReconstructionError: 1.384383
-EnergyCoefficient: 0.163983
-HeatCapacity: 0.057981
-WeightSparsity: 0.280205
-WeightSquare: 5.242036
-KLDivergence: 0.418790
-ReverseKLDivergence: 0.008976

End of epoch 6: 
Time elapsed 1.727s
-ReconstructionError: 1.384648
-EnergyCoefficient: 0.164681
-HeatCapacity: 0.055776
-WeightSparsity: 0.281468
-WeightSquare: 5.326574
-KLDivergence: 0.418375
-ReverseKLDivergence: 0.008905

End of epoch 7: 
Time elapsed 1.589s
-ReconstructionError: 1.383661
-EnergyCoefficient: 0.163952
-HeatCapacity: 0.065393
-WeightSparsity: 0.282735
-WeightSquare: 5.407551
-KLDivergence: 0.418462
-ReverseKLDivergence: 0.009416

End of epoch 8: 
Time elapsed 1.593s
-ReconstructionError: 1.381536
-EnergyCoefficient: 0.163919
-HeatCapacity: 0.055674
-WeightSparsity: 0.283956
-WeightSquare: 5.491248
-KLDivergence: 0.418470
-ReverseKLDivergence: 0.008893

End of epoch 9: 
Time elapsed 1.588s
-ReconstructionError: 1.380898
-EnergyCoefficient: 0.164533
-HeatCapacity: 0.060710
-WeightSparsity: 0.285176
-WeightSquare: 5.570600
-KLDivergence: 0.419187
-ReverseKLDivergence: 0.009425

End of epoch 10: 
Time elapsed 1.614s
-ReconstructionError: 1.380922
-EnergyCoefficient: 0.164201
-HeatCapacity: 0.063200
-WeightSparsity: 0.286492
-WeightSquare: 5.658150
-KLDivergence: 0.417993
-ReverseKLDivergence: 0.008283

training in the disordered phase
layerwise pretraining
training model 0

Before training:
-ReconstructionError: 1.410001
-EnergyCoefficient: 0.110492
-HeatCapacity: 0.092223
-WeightSparsity: 0.337057
-WeightSquare: 1.902251
-KLDivergence: 0.030492
-ReverseKLDivergence: 0.014561

End of epoch 1: 
Time elapsed 1.76s
-ReconstructionError: 1.413065
-EnergyCoefficient: 0.100761
-HeatCapacity: 0.086829
-WeightSparsity: 0.335294
-WeightSquare: 1.943940
-KLDivergence: 0.030426
-ReverseKLDivergence: 0.010565

End of epoch 2: 
Time elapsed 1.832s
-ReconstructionError: 1.411849
-EnergyCoefficient: 0.100620
-HeatCapacity: 0.074392
-WeightSparsity: 0.332099
-WeightSquare: 2.016817
-KLDivergence: 0.030483
-ReverseKLDivergence: 0.010048

End of epoch 3: 
Time elapsed 1.841s
-ReconstructionError: 1.404504
-EnergyCoefficient: 0.101234
-HeatCapacity: 0.050446
-WeightSparsity: 0.319267
-WeightSquare: 2.294243
-KLDivergence: 0.030219
-ReverseKLDivergence: 0.010214

End of epoch 4: 
Time elapsed 1.817s
-ReconstructionError: 1.398299
-EnergyCoefficient: 0.101688
-HeatCapacity: 0.081227
-WeightSparsity: 0.301608
-WeightSquare: 2.780976
-KLDivergence: 0.030040
-ReverseKLDivergence: 0.010416

End of epoch 5: 
Time elapsed 1.819s
-ReconstructionError: 1.390320
-EnergyCoefficient: 0.105802
-HeatCapacity: 0.152750
-WeightSparsity: 0.288190
-WeightSquare: 3.385402
-KLDivergence: 0.029708
-ReverseKLDivergence: 0.010952

End of epoch 6: 
Time elapsed 1.834s
-ReconstructionError: 1.386955
-EnergyCoefficient: 0.106775
-HeatCapacity: 0.221597
-WeightSparsity: 0.282430
-WeightSquare: 3.976031
-KLDivergence: 0.029335
-ReverseKLDivergence: 0.011462

End of epoch 7: 
Time elapsed 1.826s
-ReconstructionError: 1.380671
-EnergyCoefficient: 0.107693
-HeatCapacity: 0.250585
-WeightSparsity: 0.280978
-WeightSquare: 4.538808
-KLDivergence: 0.028238
-ReverseKLDivergence: 0.011508

End of epoch 8: 
Time elapsed 1.831s
-ReconstructionError: 1.377370
-EnergyCoefficient: 0.109179
-HeatCapacity: 0.332766
-WeightSparsity: 0.282399
-WeightSquare: 5.091576
-KLDivergence: 0.027742
-ReverseKLDivergence: 0.011566

End of epoch 9: 
Time elapsed 1.85s
-ReconstructionError: 1.373979
-EnergyCoefficient: 0.118749
-HeatCapacity: 0.386149
-WeightSparsity: 0.284935
-WeightSquare: 5.627777
-KLDivergence: 0.028864
-ReverseKLDivergence: 0.014665

End of epoch 10: 
Time elapsed 1.847s
-ReconstructionError: 1.370511
-EnergyCoefficient: 0.106519
-HeatCapacity: 0.393450
-WeightSparsity: 0.287157
-WeightSquare: 6.144994
-KLDivergence: 0.026811
-ReverseKLDivergence: 0.010174

training model 1

Before training:
-ReconstructionError: 1.637121
-EnergyCoefficient: 0.191978
-HeatCapacity: 0.119853
-WeightSparsity: 0.355607
-WeightSquare: 1.941890
-KLDivergence: 0.298255
-ReverseKLDivergence: -0.069274

End of epoch 1: 
Time elapsed 0.526s
-ReconstructionError: 1.628203
-EnergyCoefficient: 0.165000
-HeatCapacity: 0.053757
-WeightSparsity: 0.364478
-WeightSquare: 1.817653
-KLDivergence: 0.294797
-ReverseKLDivergence: -0.089236

End of epoch 2: 
Time elapsed 0.527s
-ReconstructionError: 1.630671
-EnergyCoefficient: 0.154347
-HeatCapacity: 0.049968
-WeightSparsity: 0.367344
-WeightSquare: 1.722828
-KLDivergence: 0.290888
-ReverseKLDivergence: -0.097758

End of epoch 3: 
Time elapsed 0.526s
-ReconstructionError: 1.625440
-EnergyCoefficient: 0.151741
-HeatCapacity: 0.040635
-WeightSparsity: 0.370629
-WeightSquare: 1.653807
-KLDivergence: 0.290901
-ReverseKLDivergence: -0.100941

End of epoch 4: 
Time elapsed 0.529s
-ReconstructionError: 1.631598
-EnergyCoefficient: 0.148916
-HeatCapacity: 0.043386
-WeightSparsity: 0.373490
-WeightSquare: 1.600897
-KLDivergence: 0.290563
-ReverseKLDivergence: -0.103002

End of epoch 5: 
Time elapsed 0.523s
-ReconstructionError: 1.631681
-EnergyCoefficient: 0.147096
-HeatCapacity: 0.044617
-WeightSparsity: 0.375211
-WeightSquare: 1.558694
-KLDivergence: 0.289884
-ReverseKLDivergence: -0.103878

End of epoch 6: 
Time elapsed 0.527s
-ReconstructionError: 1.619865
-EnergyCoefficient: 0.146219
-HeatCapacity: 0.045863
-WeightSparsity: 0.376418
-WeightSquare: 1.530677
-KLDivergence: 0.289061
-ReverseKLDivergence: -0.104829

End of epoch 7: 
Time elapsed 0.524s
-ReconstructionError: 1.615215
-EnergyCoefficient: 0.147237
-HeatCapacity: 0.040232
-WeightSparsity: 0.376653
-WeightSquare: 1.512036
-KLDivergence: 0.289880
-ReverseKLDivergence: -0.101434

End of epoch 8: 
Time elapsed 0.527s
-ReconstructionError: 1.616639
-EnergyCoefficient: 0.144586
-HeatCapacity: 0.046140
-WeightSparsity: 0.378086
-WeightSquare: 1.507200
-KLDivergence: 0.289696
-ReverseKLDivergence: -0.104417

End of epoch 9: 
Time elapsed 0.524s
-ReconstructionError: 1.616187
-EnergyCoefficient: 0.145014
-HeatCapacity: 0.045080
-WeightSparsity: 0.377884
-WeightSquare: 1.513952
-KLDivergence: 0.289545
-ReverseKLDivergence: -0.105390

End of epoch 10: 
Time elapsed 0.527s
-ReconstructionError: 1.610943
-EnergyCoefficient: 0.144724
-HeatCapacity: 0.042013
-WeightSparsity: 0.377583
-WeightSquare: 1.528702
-KLDivergence: 0.285813
-ReverseKLDivergence: -0.105255

use persistent contrastive divergence to fit the model
Before training:
-ReconstructionError: 1.369201
-EnergyCoefficient: 0.108127
-HeatCapacity: 0.357266
-WeightSparsity: 0.287157
-WeightSquare: 6.144994
-KLDivergence: 0.026900
-ReverseKLDivergence: 0.010947

End of epoch 1: 
Time elapsed 1.709s
-ReconstructionError: 1.370256
-EnergyCoefficient: 0.101817
-HeatCapacity: 0.340879
-WeightSparsity: 0.288013
-WeightSquare: 6.221397
-KLDivergence: 0.026062
-ReverseKLDivergence: 0.008972

End of epoch 2: 
Time elapsed 1.723s
-ReconstructionError: 1.368565
-EnergyCoefficient: 0.100916
-HeatCapacity: 0.373686
-WeightSparsity: 0.288643
-WeightSquare: 6.287891
-KLDivergence: 0.025863
-ReverseKLDivergence: 0.008142

End of epoch 3: 
Time elapsed 1.714s
-ReconstructionError: 1.369267
-EnergyCoefficient: 0.101824
-HeatCapacity: 0.407277
-WeightSparsity: 0.289272
-WeightSquare: 6.348709
-KLDivergence: 0.025783
-ReverseKLDivergence: 0.007974

End of epoch 4: 
Time elapsed 1.712s
-ReconstructionError: 1.369020
-EnergyCoefficient: 0.100957
-HeatCapacity: 0.393685
-WeightSparsity: 0.289869
-WeightSquare: 6.407252
-KLDivergence: 0.025918
-ReverseKLDivergence: 0.007682

End of epoch 5: 
Time elapsed 1.707s
-ReconstructionError: 1.366876
-EnergyCoefficient: 0.101797
-HeatCapacity: 0.385496
-WeightSparsity: 0.290476
-WeightSquare: 6.466989
-KLDivergence: 0.025767
-ReverseKLDivergence: 0.008114

End of epoch 6: 
Time elapsed 1.716s
-ReconstructionError: 1.367778
-EnergyCoefficient: 0.101060
-HeatCapacity: 0.384008
-WeightSparsity: 0.291032
-WeightSquare: 6.523547
-KLDivergence: 0.025948
-ReverseKLDivergence: 0.007574

End of epoch 7: 
Time elapsed 1.721s
-ReconstructionError: 1.367335
-EnergyCoefficient: 0.100704
-HeatCapacity: 0.400341
-WeightSparsity: 0.291557
-WeightSquare: 6.578193
-KLDivergence: 0.025721
-ReverseKLDivergence: 0.007819

End of epoch 8: 
Time elapsed 1.721s
-ReconstructionError: 1.366027
-EnergyCoefficient: 0.101421
-HeatCapacity: 0.382580
-WeightSparsity: 0.292064
-WeightSquare: 6.629922
-KLDivergence: 0.025839
-ReverseKLDivergence: 0.008045

End of epoch 9: 
Time elapsed 1.716s
-ReconstructionError: 1.364656
-EnergyCoefficient: 0.100726
-HeatCapacity: 0.373966
-WeightSparsity: 0.292563
-WeightSquare: 6.684762
-KLDivergence: 0.025740
-ReverseKLDivergence: 0.007831

End of epoch 10: 
Time elapsed 1.717s
-ReconstructionError: 1.366732
-EnergyCoefficient: 0.102005
-HeatCapacity: 0.355404
-WeightSparsity: 0.293075
-WeightSquare: 6.737060
-KLDivergence: 0.025891
-ReverseKLDivergence: 0.008087

Exercises

  • Pick different temperatures $T$ from the available set and repeat the learning procedure. Can you find more suitable hyperparameters (such as the number of hidden units, the learning rate, the regularization strength, and the SGD optimizer parameters) that give better results?
  • Generate a sufficiently large set of fantasy particles and compute their magnetization, energy, and other thermodynamic quantities. Comparing these values with those of the original MC samples provides a useful measure of the DBM's performance.
  • You can now play the following game: use the RBM to generate a large sample of Ising states, and then apply your pre-trained DNN or CNN classifier from Secs. IX and XI to label them.
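
As a possible starting point for the second exercise, the following is a minimal NumPy sketch of how the magnetization and energy per spin could be computed for a batch of configurations on a periodic square lattice. The helper names `to_spins` and `ising_observables` are hypothetical (they are not part of Paysage), and the sketch assumes that the fantasy particles have already been extracted as a flat array of 0/1 values, as one would expect from a Bernoulli visible layer.

```python
import numpy as np

def to_spins(binary_states, L=40):
    """Map flattened 0/1 states to +1/-1 spins of shape (n_samples, L, L).

    Assumption: each row of `binary_states` is one L*L configuration.
    """
    return (2 * np.asarray(binary_states) - 1).reshape(-1, L, L)

def ising_observables(configs, J=1.0):
    """Return magnetization and energy per spin for each configuration.

    configs: array of shape (n_samples, L, L) with entries +1/-1.
    Periodic boundary conditions are used, as in the Hamiltonian above.
    """
    configs = np.asarray(configs, dtype=float)
    n_spins = configs.shape[1] * configs.shape[2]
    # magnetization per spin of every sample
    m = configs.mean(axis=(1, 2))
    # count each nearest-neighbor bond once: right and down neighbors,
    # with np.roll providing the periodic wrap-around
    right = np.roll(configs, -1, axis=2)
    down = np.roll(configs, -1, axis=1)
    e = -J * (configs * (right + down)).sum(axis=(1, 2)) / n_spins
    return m, e

# Sanity check on a fully ordered 4x4 lattice: every bond is satisfied,
# so the energy per spin is -2J and the magnetization per spin is 1.
m_up, e_up = ising_observables(np.ones((1, 4, 4)))
```

Applying `ising_observables` to both the Monte Carlo samples and the fantasy particles, and comparing, for instance, the histograms of $|m|$ and of the energy per spin, gives a simple check of how well the model reproduces the thermal distribution at the chosen temperature.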