Creating Extensions Using NumPy and SciPy
Created On: Mar 24, 2017 | Last Updated: Apr 25, 2023 | Last Verified: Not Verified
Author: Adam Paszke
Updated by: Adam Dziedzic
In this tutorial, we shall go through two tasks:

1. Create a neural network layer with no parameters. This calls into NumPy as part of its implementation.
2. Create a neural network layer that has learnable weights. This calls into SciPy as part of its implementation.

Both layers are built on the same torch.autograd.Function pattern, sketched below.
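The sketch is a generic, made-up example (the class name MyNumpyOp and the doubling operation are hypothetical, for illustration only): subclass Function, implement static forward and backward methods that are free to call into NumPy, and invoke the layer through .apply().

import torch
from torch.autograd import Function


class MyNumpyOp(Function):  # hypothetical example op, not part of this tutorial
    @staticmethod
    def forward(ctx, input):
        # leave autograd, do the actual work in NumPy, then return a tensor
        result = input.detach().numpy() * 2.0
        return torch.as_tensor(result, dtype=input.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        # return one gradient per input of forward()
        return grad_output * 2.0


x = torch.randn(3, requires_grad=True)
y = MyNumpyOp.apply(x).sum()
y.backward()
print(x.grad)  # the gradient of sum(2*x) is 2 everywhere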
import torch
from torch.autograd import Function
Parameter-less example
This layer doesn’t particularly do anything useful or mathematically correct.
It is aptly named BadFFTFunction.
Layer Implementation
from numpy.fft import rfft2, irfft2


class BadFFTFunction(Function):
    @staticmethod
    def forward(ctx, input):
        numpy_input = input.detach().numpy()
        result = abs(rfft2(numpy_input))
        return input.new(result)

    @staticmethod
    def backward(ctx, grad_output):
        numpy_go = grad_output.numpy()
        result = irfft2(numpy_go)
        return grad_output.new(result)


# since this layer does not have any parameters, we can
# simply declare this as a function, rather than as an ``nn.Module`` class
def incorrect_fft(input):
    return BadFFTFunction.apply(input)
Example usage of the created layer:
input = torch.randn(8, 8, requires_grad=True)
result = incorrect_fft(input)
print(result)
result.backward(torch.randn(result.size()))
print(input)
tensor([[ 2.2466, 10.4746, 13.1438,  3.4380, 18.6858],
        [14.9922,  7.9807,  8.2623,  3.7835,  1.4554],
        [10.4887,  3.7699,  7.9599,  5.5258,  5.8373],
        [ 6.1822,  5.9049,  5.7100,  5.0972,  4.6942],
        [ 9.0984,  8.0464, 15.1463,  6.9384,  4.4560],
        [ 6.1822,  6.4181,  7.9228,  9.7651,  4.6942],
        [10.4887, 12.7713,  2.3015,  7.4872,  5.8373],
        [14.9922, 11.0545,  6.2261, 10.5147,  1.4554]],
       grad_fn=<BadFFTFunctionBackward>)
tensor([[ 0.2952, -0.0257, -0.0560, -1.7654, -0.8212,  1.1353,  0.4140,  2.2008],
        [ 0.5644,  1.8047,  0.7646,  0.3231, -2.3413, -0.5453, -0.7018,  0.5297],
        [-0.5795,  0.9919, -0.0034,  0.3612, -0.5826, -1.0429, -0.0981,  1.0726],
        [-0.9961,  0.3840,  2.4540,  2.1218, -1.3649,  1.3814,  1.7457,  1.2876],
        [ 0.8588,  0.9248, -0.3694,  0.2159, -0.1203, -0.0416,  0.8198, -0.5190],
        [-2.2435,  1.1805,  1.2026,  0.6100, -0.6527,  0.2635,  0.1559, -0.0515],
        [-0.8078,  0.4729, -2.1454,  0.5390, -0.3716, -1.6936, -1.7029, -0.9817],
        [-1.0514, -0.2043,  0.8171, -0.8624, -0.4560, -0.1349, -0.8457,  0.5339]],
       requires_grad=True)
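Since the forward pass above is just the magnitude of a real 2D FFT, one way to sanity-check it (a sketch, not part of the original tutorial, assuming a PyTorch build that ships the torch.fft module) is to compare the result against PyTorch's own FFT:

# compare the custom layer's output with torch.fft.rfft2; the tolerance absorbs
# the float32 vs. float64 difference between the two implementations
expected = torch.fft.rfft2(input).abs()
print(torch.allclose(result, expected, atol=1e-4))  # expected to print True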
Parametrized example
In the deep learning literature, this layer is confusingly referred to as convolution, while the actual operation is cross-correlation (the only difference is that the filter is flipped for convolution, which is not the case for cross-correlation).
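That relationship is easy to verify numerically. The following sketch (not part of the original tutorial) uses SciPy to check that convolving with a kernel equals cross-correlating with the kernel flipped along both axes:

import numpy as np
from scipy.signal import convolve2d, correlate2d

a = np.random.randn(5, 5)
k = np.random.randn(3, 3)
conv = convolve2d(a, k, mode='valid')
# np.flip with no axis argument flips the kernel along both axes
corr = correlate2d(a, np.flip(k), mode='valid')
print(np.allclose(conv, corr))  # True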
This is an implementation of a layer with learnable weights, where the cross-correlation has a filter (kernel) that represents the weights. The backward pass computes the gradient w.r.t. the input and the gradient w.r.t. the filter.
from numpy import flip
import numpy as np
from scipy.signal import convolve2d, correlate2d
from torch.nn.modules.module import Module
from torch.nn.parameter import Parameter


class ScipyConv2dFunction(Function):
    @staticmethod
    def forward(ctx, input, filter, bias):
        # detach so we can cast to NumPy
        input, filter, bias = input.detach(), filter.detach(), bias.detach()
        result = correlate2d(input.numpy(), filter.numpy(), mode='valid')
        result += bias.numpy()
        ctx.save_for_backward(input, filter, bias)
        return torch.as_tensor(result, dtype=input.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        grad_output = grad_output.detach()
        input, filter, bias = ctx.saved_tensors
        grad_output = grad_output.numpy()
        grad_bias = np.sum(grad_output, keepdims=True)
        grad_input = convolve2d(grad_output, filter.numpy(), mode='full')
        # the previous line can be expressed equivalently as:
        # grad_input = correlate2d(grad_output, flip(flip(filter.numpy(), axis=0), axis=1), mode='full')
        grad_filter = correlate2d(input.numpy(), grad_output, mode='valid')
        return torch.from_numpy(grad_input), torch.from_numpy(grad_filter).to(torch.float), torch.from_numpy(grad_bias).to(torch.float)


class ScipyConv2d(Module):
    def __init__(self, filter_width, filter_height):
        super(ScipyConv2d, self).__init__()
        self.filter = Parameter(torch.randn(filter_width, filter_height))
        self.bias = Parameter(torch.randn(1, 1))

    def forward(self, input):
        return ScipyConv2dFunction.apply(input, self.filter, self.bias)
Example usage:
module = ScipyConv2d(3, 3)
print("Filter and bias: ", list(module.parameters()))
input = torch.randn(10, 10, requires_grad=True)
output = module(input)
print("Output from the convolution: ", output)
output.backward(torch.randn(8, 8))
print("Gradient for the input map: ", input.grad)
Filter and bias:  [Parameter containing:
tensor([[ 0.3356, -0.2327, -0.5154],
        [-1.2979,  0.0811,  0.8031],
        [-0.7731,  1.3722,  0.3721]], requires_grad=True), Parameter containing:
tensor([[0.7548]], requires_grad=True)]
Output from the convolution:  tensor([[ 3.8724, -3.1303,  2.4928, -4.3305,  1.2250, -1.1443,  3.1034,  3.4513],
        [ 1.5847,  2.1168, -1.5623, -0.9969,  2.9442,  3.7232,  2.3677, -0.7233],
        [ 3.0246,  4.2500, -5.4842,  3.3440,  3.3938, -4.3906,  0.1493,  3.9273],
        [ 7.8861,  1.0760, -1.2524,  2.4386, -1.7296, -0.4153,  5.7283,  0.1295],
        [ 4.6396, -1.6610,  0.0982, -0.5314,  2.7959,  2.5278, -2.4304,  1.7312],
        [-4.8672, -0.5916,  4.5925,  0.4643,  0.6764, -2.0775, -0.9738,  3.7092],
        [-0.0744,  2.7289,  3.8136, -0.0533, -3.3746,  4.5985, -0.1943, -0.4169],
        [ 2.8326, -0.5647,  1.5854,  0.9507,  0.9085,  1.4550, -1.0737,  2.1935]],
       grad_fn=<ScipyConv2dFunctionBackward>)
Gradient for the input map:  tensor([[-0.4730,  0.1483,  0.7877, -0.4039,  0.7756,  0.8479, -0.1923,  0.4113, -0.0956, -0.2775],
        [ 1.7915,  0.6298, -1.0065,  2.7036, -0.7477, -1.7302,  0.2486, -1.1284, -0.5987,  0.0786],
        [ 1.0611, -1.1547, -0.7497, -0.6121, -4.8190,  1.0476, -0.9747, -0.5249,  2.4994,  0.6504],
        [ 0.0158, -0.9742,  0.9101,  0.8544,  1.5960,  0.4470,  2.3892,  0.3831, -1.1489, -0.0717],
        [ 3.0282, -1.7070,  1.4324, -0.7906, -2.0520,  0.2697, -0.1487, -4.5218, -0.1168,  0.2546],
        [ 3.0389, -3.4158, -2.9829, -3.5110,  2.3249,  3.0555, -1.5636, -1.3264,  1.7514,  0.7459],
        [ 0.3872, -3.0506,  1.3091,  0.8878,  0.4753, -0.9791, -0.2957, -0.3815,  1.6542,  1.1433],
        [-3.5628, -1.5042,  5.0361, -2.3516,  0.3800, -0.9906, -0.6702, -0.0898,  3.0828,  0.9571],
        [-4.0500,  0.3648,  4.7979,  1.2016,  0.2933,  2.2379,  1.2952, -0.0774, -1.6260, -0.7628],
        [-1.1984,  0.8443,  2.9538,  0.2789, -0.0768,  1.8323, -0.2185, -2.9125, -2.2107, -0.3709]])
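As an extra sanity check (a sketch, not part of the original tutorial), PyTorch's built-in conv2d also performs cross-correlation, so applying the same filter and bias through torch.nn.functional.conv2d should reproduce the output above up to floating-point precision:

import torch.nn.functional as F

with torch.no_grad():
    # conv2d expects (N, C, H, W) inputs, an (out_C, in_C, kH, kW) weight and a 1D bias
    reference = F.conv2d(input.view(1, 1, 10, 10),
                         module.filter.view(1, 1, 3, 3),
                         bias=module.bias.view(1))
print(torch.allclose(output, reference.view(8, 8), atol=1e-4))  # expected: True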
Check the gradients:
from torch.autograd.gradcheck import gradcheck

moduleConv = ScipyConv2d(3, 3)

input = [torch.randn(20, 20, dtype=torch.double, requires_grad=True)]
test = gradcheck(moduleConv, input, eps=1e-6, atol=1e-4)
print("Are the gradients correct: ", test)
Are the gradients correct: True
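Since the filter and bias are ordinary Parameters, the layer can also be trained with a standard optimizer. The following minimal training-step sketch (not part of the original tutorial) fits the layer to an arbitrary random target for a few steps, just to show that gradients flow into the parameters:

import torch.optim as optim

module = ScipyConv2d(3, 3)
optimizer = optim.SGD(module.parameters(), lr=0.01)
fixed_input = torch.randn(10, 10)
target = torch.randn(8, 8)  # arbitrary target, for illustration only

for step in range(5):
    optimizer.zero_grad()
    output = module(fixed_input)
    loss = ((output - target) ** 2).mean()
    loss.backward()          # gradients reach module.filter and module.bias
    optimizer.step()
    print(step, loss.item())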
Total running time of the script: (0 minutes 0.606 seconds)