I finally finished refactoring the code to give PyFunt a structure similar to Torch's NN package.


Modules and Containers

PyFunt now has a modular structure (like Torch): every layer implementation is a subclass of the abstract Module class. To create a sequence of layers you can use containers, subclasses of the abstract Container class (itself a subclass of Module), such as Sequential and Parallel.

As in Torch, if you want to create a new layer you have to derive from Module and implement update_output, update_grad_input and, if the layer has learnable parameters, acc_grad_parameters.
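
For instance, here is a minimal sketch of a custom parameter-free layer (the Scale class and its internals are my own illustration, not part of PyFunt; the update_output/update_grad_input signatures follow the usage shown later in this post):

from pyfunt import Module


class Scale(Module):
    # hypothetical layer that multiplies its input by a constant factor
    def __init__(self, factor):
        super(Scale, self).__init__()
        self.factor = factor

    def update_output(self, x):
        # forward pass: compute and cache the output
        self.output = x * self.factor
        return self.output

    def update_grad_input(self, x, grad_output):
        # backward pass: gradient with respect to the input
        self.grad_input = grad_output * self.factor
        return self.grad_input

Since Scale has no learnable parameters, acc_grad_parameters is not needed here.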

With this structure, building a parametric residual network can be as easy as the code below (taken from https://github.com/dnlcrl/deep-residual-networks-pyfunt):

from pyfunt import (SpatialConvolution, SpatialBatchNormalization,
                    SpatialAveragePooling, Sequential, ReLU, Linear,
                    Reshape, LogSoftMax, Padding, Identity, ConcatTable,
                    CAddTable)


def residual_layer(n_channels, n_out_channels=None, stride=None):
    n_out_channels = n_out_channels or n_channels
    stride = stride or 1

    # residual branch: conv -> bn -> conv -> bn
    convs = Sequential()
    add = convs.add
    add(SpatialConvolution(
        n_channels, n_out_channels, 3, 3, stride, stride, 1, 1))
    add(SpatialBatchNormalization(n_out_channels))
    add(SpatialConvolution(n_out_channels, n_out_channels, 3, 3, 1, 1, 1, 1))
    add(SpatialBatchNormalization(n_out_channels))

    # shortcut branch: identity, or average-pool and zero-pad the
    # channel dimension when the spatial size changes
    if stride > 1:
        shortcut = Sequential()
        shortcut.add(SpatialAveragePooling(2, 2, stride, stride))
        shortcut.add(Padding(1, (n_out_channels - n_channels)/2, 3))
    else:
        shortcut = Identity()

    # sum the two branches, then apply the non-linearity
    res = Sequential()
    res.add(ConcatTable().add(convs).add(shortcut)).add(CAddTable())
    res.add(ReLU(True))
    return res


def resnet(n_size, num_starting_filters, reg):
    nfs = num_starting_filters
    model = Sequential()
    add = model.add
    add(SpatialConvolution(3, nfs, 3, 3, 1, 1, 1, 1))
    add(SpatialBatchNormalization(nfs))
    add(ReLU())

    # three stages of residual layers; each stage after the first
    # starts with a stride-2 layer that halves the spatial size
    # and doubles the number of filters
    for i in xrange(1, n_size):
        add(residual_layer(nfs))
    add(residual_layer(nfs, 2*nfs, 2))

    for i in xrange(1, n_size-1):
        add(residual_layer(2*nfs))
    add(residual_layer(2*nfs, 4*nfs, 2))

    for i in xrange(1, n_size-1):
        add(residual_layer(4*nfs))

    add(SpatialAveragePooling(8, 8))
    add(Reshape(nfs*4))
    add(Linear(nfs*4, 10))
    add(LogSoftMax())
    return model
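
For example, a CIFAR-10-style network with 16 starting filters could be built like this (the argument values are purely illustrative; note that reg is unused in the snippet above):

model = resnet(3, 16, 0)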

Loading t7 Models

Thanks to @bshillingford's python-torchfile, I implemented a utility to load t7 models in PyFunt. With pyfunt/utils/load_torch_model.py you can load not only models saved with Torch but also checkpoints. You can also add support for a new layer: implement its loading/value-setting functions, update the dicts that map class names to load functions, and it's done.
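
A minimal sketch of what loading looks like (load_t7model is my guess at the entry point's name, so check load_torch_model.py for the exact API):

# hypothetical entry point in pyfunt/utils/load_torch_model.py
from pyfunt.utils.load_torch_model import load_t7model

# parse the t7 file and get back the equivalent PyFunt model
model = load_t7model('model.t7')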

Please keep in mind that not all modules are loadable yet, because I have not written the loading/value-setting functions for every Module, but you can see how simple it is to write one in load_torch_model.py. So if you, for example, have to implement a new layer, or you get an error like this:

class <X> not found

it means you can fix the problem with minimal effort. If you want, you can fix it and make a pull request ;)

  • If you are implementing a fancy Module of your own and don't want to make a pull request for PyFunt, you can import load_parser_init and load_parser_vals from pyfunt.utils, create the fancy_module_init and fancy_module_vals functions, and update the load_parser_init and load_parser_vals dictionaries with relations such as 'FancyModule': fancy_module_init (see the sketch after this list).
  • If the class is implemented in PyFunt, add the functions and relations in utils/load_torch_model.py and make a pull request.
  • If you ported a module from Torch and want to make a pull request, please also add the init/vals functions and relations in utils/load_torch_model.py.
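
Concretely, registering a custom module could look roughly like this (the init/vals signatures are my assumption, and FancyModule is a placeholder for your own class; check load_torch_model.py for the exact contract):

from pyfunt.utils import load_parser_init, load_parser_vals
from mymodules import FancyModule  # hypothetical: your own layer


def fancy_module_init(obj):
    # assumption: obj is the parsed torchfile object for this layer;
    # build and return the corresponding PyFunt module
    return FancyModule(obj.nFeatures)


def fancy_module_vals(module, obj):
    # assumption: copy the saved parameters into the new module
    module.weight = obj.weight
    module.bias = obj.bias


# register the class-name -> function relations used by the loader
load_parser_init['FancyModule'] = fancy_module_init
load_parser_vals['FancyModule'] = fancy_module_vals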

You can look here for an example of how to create and load new modules; in particular, lines #L92-L127 show how to load the custom classes InstanceNormalization, ShaveImage and TotalVariation.

Network Testing

This is not big news, but in pyfunt/utils there is gradient_check.py, adapted from Stanford's CS231N course (like a lot of the layer code). You can use the functions in this file to test your layers or networks by evaluating the numerical gradient and comparing it against the implemented forward/backward steps via the relative error. For example:

import numpy as np

from pyfunt import (SpatialConvolution, SpatialAveragePooling,
                    SpatialBatchNormalization, ReLU, Reshape, Linear,
                    LogSoftMax, Sequential)
# numerical-gradient helper from pyfunt/utils/gradient_check.py
from pyfunt.utils.gradient_check import eval_numerical_gradient_array


def rel_error(x, y):
    # relative error, as defined in the CS231N assignments
    return np.max(np.abs(x - y) / np.maximum(1e-8, np.abs(x) + np.abs(y)))


x = np.random.randn(3, 4, 8, 8)
dout = np.random.randn(3, 10)

s = Sequential()
s.add(SpatialConvolution(4, 2, 1, 1, 1, 1))
s.add(SpatialAveragePooling(2, 2, 2, 2, 0, 0))
s.add(SpatialBatchNormalization(2))
s.add(ReLU())
s.add(Reshape(2*4*4))
s.add(Linear(2*4*4, 10))
s.add(LogSoftMax())

# numerical gradient of the whole net with respect to the input
dx_num = eval_numerical_gradient_array(lambda x: s.update_output(x), x, dout)

# analytic gradient from the implemented backward pass
out = s.update_output(x)
dx = s.update_grad_input(x, dout)

# Your error should be around 1e-8
print('Testing net backward function:')
print('dx error: ', rel_error(dx, dx_num))

Examples Folder

I added the examples directory, where I will add examples of implementations and usage. Currently you can see how to build and train a residual network on CIFAR-10 and how to test a model with the testing utilities.

Conclusions

PyFunt may not be a deep learning framework for production use, but building it helped me understand how Torch works and how different deep learning approaches apply to several AI problems.

I hope you can learn something from it too, and if you want to help with development, don't hesitate to open a pull request.
