.. momentumnet documentation master file, created by
   sphinx-quickstart on Mon May 23 16:22:52 2016.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Momentum ResNets
================

Official library for using Momentum Residual Neural Networks [1]. These models extend any residual architecture (for instance, they also work with Transformers) to a larger class of deep learning models that consume less memory. They can be initialized with the same weights as a pretrained ResNet and are promising in fine-tuning applications.

Installation
------------

To install ``momentumnet``, you first need to install its dependencies::

    $ pip install numpy matplotlib torch

Then install momentumnet::

    $ pip install momentumnet

If you do not have admin privileges on the computer, use the ``--user`` flag with `pip`. To upgrade, use the ``--upgrade`` flag provided by `pip`.

To check that everything worked, you can do::

    $ python -c 'import momentumnet'

and it should not give any error message.

Quickstart
----------

The main class is ``MomentumNet``. It creates a Momentum ResNet that iterates

.. math::

    v_{t + 1} &= \gamma \times v_t + (1 - \gamma) \times f_t(x_t) \\
    x_{t + 1} &= x_t + v_{t + 1}

These forward equations can be reversed in closed form. This enables backpropagation without the usual memory consumption, since activations do not have to be stored: the process trades memory for computation.

To get started, you can create a toy Momentum ResNet by specifying the functions :math:`f_t` for the forward pass and the value of the momentum term, :math:`\gamma`.

.. code:: python

    >>> from torch import nn
    >>> from momentumnet import MomentumNet
    >>> hidden = 8
    >>> d = 500
    >>> function = nn.Sequential(nn.Linear(d, hidden), nn.Tanh(), nn.Linear(hidden, d))
    >>> mresnet = MomentumNet([function] * 10, gamma=0.9)

Momentum ResNets are a drop-in replacement for ResNets
------------------------------------------------------

We can transform a ResNet into a Momentum ResNet with the same parameters in two lines of code. For instance, the following code instantiates a Momentum ResNet with the weights of a ResNet-101 pretrained on ImageNet. We set ``use_backprop`` to ``False`` so that activations are not saved during the forward pass, reducing memory consumption.

.. code:: python

    >>> import torch
    >>> from momentumnet import transform_to_momentumnet
    >>> from torchvision.models import resnet101
    >>> resnet = resnet101(pretrained=True)
    >>> mresnet101 = transform_to_momentumnet(resnet, gamma=0.9, use_backprop=False)

Importantly, this method also works with PyTorch's ``Transformer`` module, by specifying the residual layers to be turned into their momentum version.

.. code:: python

    >>> import torch
    >>> from momentumnet import transform_to_momentumnet
    >>> transformer = torch.nn.Transformer(num_encoder_layers=6, num_decoder_layers=6)
    >>> mtransformer = transform_to_momentumnet(transformer, sub_layers=["encoder.layers", "decoder.layers"],
    ...                                         gamma=0.9, use_backprop=False, keep_first_layer=False)

This initializes a Momentum Transformer with the same weights as the original Transformer.

Memory savings when applying Momentum ResNets to Transformers
-------------------------------------------------------------

Here is a short `tutorial `_ showing the memory gains when using Momentum Transformers.
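Where do these savings come from? The forward equations from the Quickstart can be inverted in closed form: :math:`x_t = x_{t + 1} - v_{t + 1}` and then :math:`v_t = (v_{t + 1} - (1 - \gamma) f_t(x_t)) / \gamma`, so activations can be recomputed on the fly during the backward pass instead of being stored. Here is a minimal sketch of this inversion (our illustration, not the library's internal implementation):

.. code:: python

    >>> import torch
    >>> torch.manual_seed(0)
    >>> f = torch.nn.Linear(4, 4)  # stand-in for one residual function f_t
    >>> gamma = 0.9
    >>> x, v = torch.randn(4), torch.zeros(4)
    >>> with torch.no_grad():
    ...     # forward iteration
    ...     v_next = gamma * v + (1 - gamma) * f(x)
    ...     x_next = x + v_next
    ...     # closed-form inverse: recover (x, v) from (x_next, v_next)
    ...     x_rec = x_next - v_next
    ...     v_rec = (v_next - (1 - gamma) * f(x_rec)) / gamma
    >>> torch.allclose(x_rec, x) and torch.allclose(v_rec, v)
    True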
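For a rough end-to-end picture of the effect on a Transformer, the sketch below compares peak GPU memory over one forward/backward pass before and after the transformation. It is only an illustration: it assumes a CUDA device is available, and the helper ``peak_memory_backward`` is ours, not part of ``momentumnet``. For careful measurements, refer to the tutorial above.

.. code:: python

    # Sketch only: assumes a CUDA device; `peak_memory_backward` is an
    # illustrative helper, not part of momentumnet.
    import torch
    from momentumnet import transform_to_momentumnet

    def peak_memory_backward(model, src, tgt):
        """One forward/backward pass; return peak allocated GPU memory in MB."""
        torch.cuda.reset_peak_memory_stats()
        model(src, tgt).sum().backward()
        return torch.cuda.max_memory_allocated() / 1e6

    src = torch.randn(64, 8, 512, device="cuda")  # (sequence, batch, d_model)
    tgt = torch.randn(32, 8, 512, device="cuda")

    transformer = torch.nn.Transformer(num_encoder_layers=6,
                                       num_decoder_layers=6).to("cuda")
    print("vanilla :", peak_memory_backward(transformer, src, tgt), "MB")

    # Same weights, but activations are recomputed rather than stored
    mtransformer = transform_to_momentumnet(
        transformer, sub_layers=["encoder.layers", "decoder.layers"],
        gamma=0.9, use_backprop=False, keep_first_layer=False)
    print("momentum:", peak_memory_backward(mtransformer, src, tgt), "MB")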
Dependencies
------------

These are the dependencies to use momentumnet:

* numpy (>=1.8)
* matplotlib (>=1.3)
* torch (>=1.9)
* memory_profiler
* torchvision
* vit_pytorch

Bug reports
-----------

Use the `github issue tracker `_ to report bugs.

Cite
----

[1] Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. Momentum Residual Neural Networks. Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9276-9287. https://arxiv.org/abs/2102.07870