PyFunt Updates - Modules and Containers, Loading t7 Models, Network Testing and More
I finally finished the code refactoring that gives PyFunt a structure similar to Torch's NN package.
Modules and Containers
PyFunt now has a modular structure (like Torch): all layer implementations are subclasses of the abstract Module class. To create a sequence of layers you can use containers, subclasses of the abstract Container class (itself a subclass of Module), such as Sequential and Parallel.
As in Torch, if you want to create a new layer you have to derive from Module and implement update_output, update_grad_input and, if the layer has parameters, acc_grad_parameters.
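For example, a minimal sketch of a parameter-free layer might look like this (assuming Module is importable from the package root and that, as in Torch's nn, the two methods cache their results in self.output and self.grad_input):

from pyfunt import Module


class Scale(Module):
    # Hypothetical layer that multiplies its input by a constant,
    # shown only to illustrate the Module contract.

    def __init__(self, factor):
        super(Scale, self).__init__()
        self.factor = factor

    def update_output(self, x):
        # forward pass: compute and cache the output
        self.output = self.factor * x
        return self.output

    def update_grad_input(self, x, grad_output):
        # backward pass: gradient of the loss w.r.t. the input
        self.grad_input = self.factor * grad_output
        return self.grad_input

    # acc_grad_parameters would accumulate parameter gradients here,
    # but this layer has none.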
With this powerful structure, building a parametric residual network can be as easy as the code below (taken from https://github.com/dnlcrl/deep-residual-networks-pyfunt):
from pyfunt import (SpatialConvolution, SpatialBatchNormalization,
                    SpatialAveragePooling, Sequential, ReLU, Linear,
                    Reshape, LogSoftMax, Padding, Identity, ConcatTable,
                    CAddTable)


def residual_layer(n_channels, n_out_channels=None, stride=None):
    n_out_channels = n_out_channels or n_channels
    stride = stride or 1

    # residual branch: two 3x3 convolutions, each followed by batch norm
    convs = Sequential()
    add = convs.add
    add(SpatialConvolution(
        n_channels, n_out_channels, 3, 3, stride, stride, 1, 1))
    add(SpatialBatchNormalization(n_out_channels))
    add(SpatialConvolution(n_out_channels, n_out_channels, 3, 3, 1, 1, 1, 1))
    add(SpatialBatchNormalization(n_out_channels))

    if stride > 1:
        # downsampling block: pool the shortcut spatially and zero-pad
        # the extra output channels
        shortcut = Sequential()
        shortcut.add(SpatialAveragePooling(2, 2, stride, stride))
        shortcut.add(Padding(1, (n_out_channels - n_channels)/2, 3))
    else:
        shortcut = Identity()

    # sum the two branches, then apply the non-linearity
    res = Sequential()
    res.add(ConcatTable().add(convs).add(shortcut)).add(CAddTable())
    res.add(ReLU(True))

    return res


def resnet(n_size, num_starting_filters, reg):
    nfs = num_starting_filters
    model = Sequential()
    add = model.add
    add(SpatialConvolution(3, nfs, 3, 3, 1, 1, 1, 1))
    add(SpatialBatchNormalization(nfs))
    add(ReLU())

    # three stages of residual layers; each transition doubles the
    # number of filters and halves the spatial resolution
    for i in xrange(1, n_size):
        add(residual_layer(nfs))
    add(residual_layer(nfs, 2*nfs, 2))
    for i in xrange(1, n_size-1):
        add(residual_layer(2*nfs))
    add(residual_layer(2*nfs, 4*nfs, 2))
    for i in xrange(1, n_size-1):
        add(residual_layer(4*nfs))

    add(SpatialAveragePooling(8, 8))
    add(Reshape(nfs*4))
    add(Linear(nfs*4, 10))
    add(LogSoftMax())
    return model
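For instance (the hyperparameter values here are purely illustrative), a small network for CIFAR-10-sized inputs could be built like this; the two stride-2 stages reduce 32x32 inputs to 8x8 maps, matching the final 8x8 average pooling and the Linear(4*nfs, 10) classifier:

# 3x32x32 inputs, 16 starting filters; reg is accepted by resnet but
# not used inside the snippet above
model = resnet(n_size=3, num_starting_filters=16, reg=0)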
Loading t7 Models
Thanks to @bshillingford's python-torchfile, I implemented a utility to load t7 models in PyFunt. With pyfunt/utils/load_torch_model.py you can load not only models saved with Torch but also checkpoints. You can also implement a new layer: write its loading/value-setting functions, update the dicts containing the class-to-load-function relations, and it's done.
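As a rough usage sketch (the entry-point name below is an assumption; the actual API is defined in pyfunt/utils/load_torch_model.py):

# Hypothetical sketch: the real function name/signature may differ,
# see pyfunt/utils/load_torch_model.py for the actual entry point.
from pyfunt.utils import load_torch_model

model = load_torch_model.load_t7model('model.t7')
out = model.update_output(images)  # images: a NumPy batch; use the
                                   # result like any natively built model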
Please keep in mind that not all modules are loadable, because I have not written the loading/value-setting functions for every Module, but you can see how simple it is to write one here: load_torch_model.py. So if you have, for example, to implement a new layer, or you get an error like this:

class <X> not found

it means you can fix the problem with minimal effort. If you want, you can fix it and make a pull request ;)
- If you are implementing a new fancy Module of yours and don't want to make a pull request for PyFunt, you can import load_parser_init and load_parser_vals from pyfunt.utils, create the fancy_module_init and fancy_module_vals functions, and update the load_parser_init and load_parser_vals dictionaries with relations like 'FancyModule': fancy_module_init (a sketch follows this list).
- If the class is implemented in PyFunt, add the functions and relations in utils/load_torch_model.py and make a pull request.
- If you ported a module from Torch and want to make a pull request, please also add the init/vals functions and relations in utils/load_torch_model.py.
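Here is a sketch of the first case (the init/vals function signatures are assumptions; check load_torch_model.py for the exact contract the parser expects):

# Hypothetical sketch: register loading functions for a custom FancyModule
# (your own Module subclass, defined elsewhere). The (obj) / (obj, module)
# signatures are assumptions; see pyfunt/utils/load_torch_model.py for the
# real contract.
from pyfunt.utils import load_parser_init, load_parser_vals


def fancy_module_init(obj):
    # build the PyFunt module from the parsed t7 object;
    # obj's fields depend on the serialized Torch class
    return FancyModule(obj.size)


def fancy_module_vals(obj, module):
    # copy the learned values from the t7 object into the module
    module.weight = obj.weight
    module.bias = obj.bias


load_parser_init['FancyModule'] = fancy_module_init
load_parser_vals['FancyModule'] = fancy_module_vals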
You can look here for an example of how to create and load new modules; in particular, lines #L92-L127 show how to load the custom classes InstanceNormalization, ShaveImage and TotalVariation.
Network Testing
This is not big news, but in pyfunt/utils there is gradient_check.py, provided by Stanford's CS231n course (like a lot of the code for the layers). You can use the functions in this file to test your layers or your networks, by evaluating the numerical gradient and the relative error with respect to the implemented forward/backward steps. For example:
import numpy as np

from pyfunt import (SpatialConvolution, SpatialAveragePooling,
                    SpatialBatchNormalization, ReLU, Reshape, Linear,
                    LogSoftMax, Sequential)
# gradient-checking helpers from pyfunt/utils/gradient_check.py
from pyfunt.utils.gradient_check import (eval_numerical_gradient_array,
                                         rel_error)

x = np.random.randn(3, 4, 8, 8)
dout = np.random.randn(3, 10)

s = Sequential()
s.add(SpatialConvolution(4, 2, 1, 1, 1, 1))
s.add(SpatialAveragePooling(2, 2, 2, 2, 0, 0))
s.add(SpatialBatchNormalization(2))
s.add(ReLU())
s.add(Reshape(2*4*4))
s.add(Linear(2*4*4, 10))
s.add(LogSoftMax())

# numerical gradient of the whole network w.r.t. x
dx_num = eval_numerical_gradient_array(lambda x: s.update_output(x), x, dout)

# analytic gradient from the implemented backward pass
out = s.update_output(x)
dx = s.update_grad_input(x, dout)

# Your error should be around 1e-8
print('Testing net backward function:')
print('dx error: ', rel_error(dx, dx_num))
Examples Folder
I added the examples directory, where I will add some examples of implementations and usage. Currently you can see how to build and train a residual network on CIFAR-10, and how to test a model with the testing utilities.
Conclusions
This may not be a deep learning framework for production use, but it helped me understand how Torch works and the different deep learning approaches to several AI problems.
I hope you can learn something too, and if you want, you can help me with development by creating a pull request without worries.