|
1 | 1 | { |
2 | 2 | "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Runinng a backward pass through LeNet using MNIST and Joey" |
| 8 | + ] |
| 9 | + }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "In this notebook, we will construct LeNet using Joey and run a backward pass through it with some training data from MNIST.\n", |
| 15 | + "\n", |
| 16 | + "The aim of a backward pass is calculating gradients of all network parameters necessary for later weight updates done by a PyTorch optimizer. A backward pass follows a forward pass." |
| 17 | + ] |
| 18 | + }, |
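| | + { |
| | + "cell_type": "markdown", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "For instance, a plain SGD optimizer would use these gradients to update every parameter $w$ as $w \\leftarrow w - \\eta \\frac{\\partial L}{\\partial w}$, where $\\eta$ is the learning rate and $L$ is the loss. (This formula is only an illustration; any PyTorch optimizer can be plugged in.)" |
| | + ] |
| | + }, |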
| 19 | + { |
| 20 | + "cell_type": "markdown", |
| 21 | + "metadata": {}, |
| 22 | + "source": [ |
| 23 | + "Firstly, let's import the required prerequisites:" |
| 24 | + ] |
| 25 | + }, |
3 | 26 | { |
4 | 27 | "cell_type": "code", |
5 | 28 | "execution_count": 1, |
|
11 | 34 | "import torchvision.transforms as transforms\n", |
12 | 35 | "import joey as ml\n", |
13 | 36 | "import matplotlib.pyplot as plt\n", |
14 | | - "import numpy as np" |
| 37 | + "import numpy as np\n", |
| 38 | + "import torch.nn as nn\n", |
| 39 | + "import torch.nn.functional as F\n", |
| 40 | + "import torch.optim as optim" |
| 41 | + ] |
| 42 | + }, |
| 43 | + { |
| 44 | + "cell_type": "markdown", |
| 45 | + "metadata": {}, |
| 46 | + "source": [ |
| 47 | + "Then, let's define `imshow()` allowing us to look at the training data we'll use for the backward pass." |
15 | 48 | ] |
16 | 49 | }, |
17 | 50 | { |
|
27 | 60 | " plt.show()" |
28 | 61 | ] |
29 | 62 | }, |
| 63 | + { |
| 64 | + "cell_type": "markdown", |
| 65 | + "metadata": {}, |
| 66 | + "source": [ |
| 67 | + "In this particular example, every training batch will have 4 images." |
| 68 | + ] |
| 69 | + }, |
30 | 70 | { |
31 | 71 | "cell_type": "code", |
32 | 72 | "execution_count": 3, |
|
36 | 76 | "batch_size = 4" |
37 | 77 | ] |
38 | 78 | }, |
| 79 | + { |
| 80 | + "cell_type": "markdown", |
| 81 | + "metadata": {}, |
| 82 | + "source": [ |
| 83 | + "Once we have `imshow()` and `batch_size` defined, we'll download the MNIST images using PyTorch." |
| 84 | + ] |
| 85 | + }, |
39 | 86 | { |
40 | 87 | "cell_type": "code", |
41 | 88 | "execution_count": 4, |
|
53 | 100 | "dataiter = iter(trainloader)" |
54 | 101 | ] |
55 | 102 | }, |
| 103 | + { |
| 104 | + "cell_type": "markdown", |
| 105 | + "metadata": {}, |
| 106 | + "source": [ |
| 107 | + "In our case, only one batch will be used for the backward pass. Joey accepts only NumPy arrays, so we have to convert PyTorch tensors to their NumPy equivalents first." |
| 108 | + ] |
| 109 | + }, |
56 | 110 | { |
57 | 111 | "cell_type": "code", |
58 | 112 | "execution_count": 5, |
|
63 | 117 | "input_data = images.numpy()" |
64 | 118 | ] |
65 | 119 | }, |
| 120 | + { |
| 121 | + "cell_type": "markdown", |
| 122 | + "metadata": {}, |
| 123 | + "source": [ |
| 124 | + "For reference, let's have a look at our training data. There are 4 images corresponding to the following digits: 5, 0, 4, 1." |
| 125 | + ] |
| 126 | + }, |
66 | 127 | { |
67 | 128 | "cell_type": "code", |
68 | 129 | "execution_count": 6, |
|
85 | 146 | "imshow(torchvision.utils.make_grid(images))" |
86 | 147 | ] |
87 | 148 | }, |
| 149 | + { |
| 150 | + "cell_type": "markdown", |
| 151 | + "metadata": {}, |
| 152 | + "source": [ |
| 153 | + "At this point, we're ready to define `backward_pass()` running the backward pass through Joey-constructed LeNet. We'll do so using the `Conv`, `MaxPooling`, `Flat`, `FullyConnected` and `FullyConnectedSoftmax` layer classes along with the `Net` class packing everything into one network we can interact with." |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "markdown", |
| 158 | + "metadata": {}, |
| 159 | + "source": [ |
| 160 | + "Note that a loss function has to be defined manually. Joey doesn't provide any built-in options here at the moment." |
| 161 | + ] |
| 162 | + }, |
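| | + { |
| | + "cell_type": "markdown", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "The `loss_grad()` function defined inside `backward_pass()` below returns $p_i - y_i$ for every output, where $p_i$ is the $i$-th softmax probability and $y_i$ is the one-hot encoding of the expected digit. This is the well-known gradient of the categorical cross-entropy loss $L = -\\log p_{\\text{expected}}$ with respect to the pre-softmax activations." |
| | + ] |
| | + }, |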
88 | 163 | { |
89 | 164 | "cell_type": "code", |
90 | 165 | "execution_count": 7, |
|
95 | 170 | " # Six 3x3 filters, activation RELU\n", |
96 | 171 | " layer1 = ml.Conv(kernel_size=(6, 3, 3),\n", |
97 | 172 | " input_size=(batch_size, 1, 32, 32),\n", |
98 | | - " activation=ml.activation.ReLU(),\n", |
99 | | - " generate_code=False)\n", |
| 173 | + " activation=ml.activation.ReLU())\n", |
100 | 174 | " # Max 2x2 subsampling\n", |
101 | 175 | " layer2 = ml.MaxPooling(kernel_size=(2, 2),\n", |
102 | 176 | " input_size=(batch_size, 6, 30, 30),\n", |
103 | | - " stride=(2, 2),\n", |
104 | | - " generate_code=False)\n", |
| 177 | + " stride=(2, 2))\n", |
105 | 178 | " # Sixteen 3x3 filters, activation RELU\n", |
106 | 179 | " layer3 = ml.Conv(kernel_size=(16, 3, 3),\n", |
107 | 180 | " input_size=(batch_size, 6, 15, 15),\n", |
108 | | - " activation=ml.activation.ReLU(),\n", |
109 | | - " generate_code=False)\n", |
| 181 | + " activation=ml.activation.ReLU())\n", |
110 | 182 | " # Max 2x2 subsampling\n", |
111 | 183 | " layer4 = ml.MaxPooling(kernel_size=(2, 2),\n", |
112 | 184 | " input_size=(batch_size, 16, 13, 13),\n", |
113 | 185 | " stride=(2, 2),\n", |
114 | | - " strict_stride_check=False,\n", |
115 | | - " generate_code=False)\n", |
| 186 | + " strict_stride_check=False)\n", |
116 | 187 | " # Full connection (16 * 6 * 6 -> 120), activation RELU\n", |
117 | 188 | " layer5 = ml.FullyConnected(weight_size=(120, 576),\n", |
118 | 189 | " input_size=(576, batch_size),\n", |
119 | | - " activation=ml.activation.ReLU(),\n", |
120 | | - " generate_code=False)\n", |
| 190 | + " activation=ml.activation.ReLU())\n", |
121 | 191 | " # Full connection (120 -> 84), activation RELU\n", |
122 | 192 | " layer6 = ml.FullyConnected(weight_size=(84, 120),\n", |
123 | 193 | " input_size=(120, batch_size),\n", |
124 | | - " activation=ml.activation.ReLU(),\n", |
125 | | - " generate_code=False)\n", |
| 194 | + " activation=ml.activation.ReLU())\n", |
126 | 195 | " # Full connection (84 -> 10), output layer\n", |
127 | 196 | " layer7 = ml.FullyConnectedSoftmax(weight_size=(10, 84),\n", |
128 | | - " input_size=(84, batch_size),\n", |
129 | | - " generate_code=False)\n", |
| 197 | + " input_size=(84, batch_size))\n", |
130 | 198 | " # Flattening layer necessary between layer 4 and 5\n", |
131 | | - " layer_flat = ml.Flat(input_size=(batch_size, 16, 6, 6),\n", |
132 | | - " generate_code=False)\n", |
| 199 | + " layer_flat = ml.Flat(input_size=(batch_size, 16, 6, 6))\n", |
133 | 200 | " \n", |
134 | 201 | " layers = [layer1, layer2, layer3, layer4,\n", |
135 | 202 | " layer_flat, layer5, layer6, layer7]\n", |
136 | 203 | " \n", |
137 | 204 | " net = ml.Net(layers)\n", |
138 | 205 | " outputs = net.forward(input_data)\n", |
139 | 206 | " \n", |
140 | | - " def loss_grad(layer, b):\n", |
| 207 | + " def loss_grad(layer, expected):\n", |
141 | 208 | " gradients = []\n", |
142 | 209 | " \n", |
143 | | - " for i in range(10):\n", |
144 | | - " result = layer.result.data[i, b]\n", |
145 | | - " if i == expected_results[b]:\n", |
146 | | - " result -= 1\n", |
147 | | - " gradients.append(result)\n", |
| 210 | + " for b in range(batch_size):\n", |
| 211 | + " row = []\n", |
| 212 | + " for i in range(10):\n", |
| 213 | + " result = layer.result.data[i, b]\n", |
| 214 | + " if i == expected[b]:\n", |
| 215 | + " result -= 1\n", |
| 216 | + " row.append(result)\n", |
| 217 | + " gradients.append(row)\n", |
148 | 218 | " \n", |
149 | 219 | " return gradients\n", |
150 | 220 | " \n", |
151 | | - " net.backward(loss_grad)\n", |
| 221 | + " net.backward(expected_results, loss_grad)\n", |
152 | 222 | " \n", |
153 | 223 | " return (layer1, layer2, layer3, layer4, layer_flat, layer5, layer6, layer7)" |
154 | 224 | ] |
155 | 225 | }, |
| 226 | + { |
| 227 | + "cell_type": "markdown", |
| 228 | + "metadata": {}, |
| 229 | + "source": [ |
| 230 | + "Afterwards, we're ready to run the backward pass." |
| 231 | + ] |
| 232 | + }, |
156 | 233 | { |
157 | 234 | "cell_type": "code", |
158 | 235 | "execution_count": 8, |
|
167 | 244 | "/home/maksymilian/Desktop/UROP/devito/devito/types/grid.py:206: RuntimeWarning: divide by zero encountered in true_divide\n", |
168 | 245 | " spacing = (np.array(self.extent) / (np.array(self.shape) - 1)).astype(self.dtype)\n", |
169 | 246 | "Operator `Kernel` run in 0.01 s\n", |
170 | | - "Operator `Kernel` run in 0.01 s\n", |
171 | | - "Operator `Kernel` run in 0.01 s\n", |
172 | | - "Operator `Kernel` run in 0.01 s\n", |
173 | 247 | "Operator `Kernel` run in 0.01 s\n" |
174 | 248 | ] |
175 | 249 | } |
|
182 | 256 | "cell_type": "markdown", |
183 | 257 | "metadata": {}, |
184 | 258 | "source": [ |
185 | | - "PyTorch:" |
| 259 | + "Results are stored in the `kernel_gradients` and `bias_gradients` properties of each layer (where applicable)." |
186 | 260 | ] |
187 | 261 | }, |
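| | + { |
| | + "cell_type": "markdown", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "For instance, here is a minimal sketch of inspecting them (assuming `layers` holds the tuple returned by `backward_pass()` and that, like `layer.result` above, each gradient is a Devito function exposing a `.data` view):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "layer1 = layers[0]  # the first Conv layer\n", |
| | + "print(layer1.kernel_gradients.data)  # gradients w.r.t. the 3x3 kernels\n", |
| | + "print(layer1.bias_gradients.data)    # gradients w.r.t. the biases\n", |
| | + "```" |
| | + ] |
| | + }, |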
188 | 262 | { |
189 | | - "cell_type": "code", |
190 | | - "execution_count": 9, |
| 263 | + "cell_type": "markdown", |
191 | 264 | "metadata": {}, |
192 | | - "outputs": [], |
193 | 265 | "source": [ |
194 | | - "import torch.nn as nn\n", |
195 | | - "import torch.nn.functional as F\n", |
196 | | - "import torch.optim as optim" |
| 266 | + "In order to check the numerical correctness, we'll create the same network with PyTorch, run a backward pass through it using the same initial weights and data and compare the results with Joey's." |
| 267 | + ] |
| 268 | + }, |
| 269 | + { |
| 270 | + "cell_type": "markdown", |
| 271 | + "metadata": {}, |
| 272 | + "source": [ |
| 273 | + "Here's the PyTorch code:" |
197 | 274 | ] |
198 | 275 | }, |
199 | 276 | { |
200 | 277 | "cell_type": "code", |
201 | | - "execution_count": 10, |
| 278 | + "execution_count": 9, |
202 | 279 | "metadata": {}, |
203 | 280 | "outputs": [], |
204 | 281 | "source": [ |
|
230 | 307 | }, |
231 | 308 | { |
232 | 309 | "cell_type": "code", |
233 | | - "execution_count": 11, |
| 310 | + "execution_count": 10, |
234 | 311 | "metadata": {}, |
235 | 312 | "outputs": [], |
236 | 313 | "source": [ |
|
252 | 329 | }, |
253 | 330 | { |
254 | 331 | "cell_type": "code", |
255 | | - "execution_count": 12, |
| 332 | + "execution_count": 11, |
256 | 333 | "metadata": {}, |
257 | 334 | "outputs": [], |
258 | 335 | "source": [ |
|
263 | 340 | "loss.backward()" |
264 | 341 | ] |
265 | 342 | }, |
| 343 | + { |
| 344 | + "cell_type": "markdown", |
| 345 | + "metadata": {}, |
| 346 | + "source": [ |
| 347 | + "After running the backward pass in PyTorch, we're ready to make comparisons. Let's calculate relative errors between Joey and PyTorch in terms of weight/bias gradients." |
| 348 | + ] |
| 349 | + }, |
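| | + { |
| | + "cell_type": "markdown", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "For each parameter, the relative error is computed element-wise as $\\frac{|g_{\\text{Joey}} - g_{\\text{PyTorch}}|}{|g_{\\text{PyTorch}}|}$, with PyTorch's gradients treated as the reference; only the per-layer maximum is printed." |
| | + ] |
| | + }, |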
266 | 350 | { |
267 | 351 | "cell_type": "code", |
268 | | - "execution_count": 13, |
| 352 | + "execution_count": 12, |
269 | 353 | "metadata": {}, |
270 | 354 | "outputs": [ |
271 | 355 | { |
272 | 356 | "name": "stdout", |
273 | 357 | "output_type": "stream", |
274 | 358 | "text": [ |
275 | | - "layers[0] maximum relative error: 1.599673499123359e-14\n", |
276 | | - "layers[1] maximum relative error: 5.710234136667345e-12\n", |
277 | | - "layers[2] maximum relative error: 1.9638017195468526e-11\n", |
278 | | - "layers[3] maximum relative error: 1.8676488586249282e-11\n", |
279 | | - "layers[4] maximum relative error: 3.4692340371450744e-13\n", |
| 359 | + "layers[0] maximum relative error: 1.4935025269750558e-14\n", |
| 360 | + "layers[1] maximum relative error: 1.0457210947850931e-13\n", |
| 361 | + "layers[2] maximum relative error: 3.0920027811804816e-12\n", |
| 362 | + "layers[3] maximum relative error: 2.615895862310905e-13\n", |
| 363 | + "layers[4] maximum relative error: 1.4951643318957554e-12\n", |
280 | 364 | "\n", |
281 | | - "Maximum relative error is in layers[2]: 1.9638017195468526e-11\n" |
| 365 | + "Maximum relative error is in layers[2]: 3.0920027811804816e-12\n" |
282 | 366 | ] |
283 | 367 | }, |
284 | 368 | { |
285 | 369 | "name": "stderr", |
286 | 370 | "output_type": "stream", |
287 | 371 | "text": [ |
288 | | - "<ipython-input-13-c5fd7a032cbe>:11: RuntimeWarning: invalid value encountered in true_divide\n", |
| 372 | + "<ipython-input-12-c5fd7a032cbe>:11: RuntimeWarning: invalid value encountered in true_divide\n", |
289 | 373 | " kernel_error = abs(kernel_grad - pytorch_kernel_grad) / abs(pytorch_kernel_grad)\n", |
290 | | - "<ipython-input-13-c5fd7a032cbe>:16: RuntimeWarning: invalid value encountered in true_divide\n", |
| 374 | + "<ipython-input-12-c5fd7a032cbe>:16: RuntimeWarning: invalid value encountered in true_divide\n", |
291 | 375 | " bias_error = abs(bias_grad - pytorch_bias_grad) / abs(pytorch_bias_grad)\n" |
292 | 376 | ] |
293 | 377 | } |
|
320 | 404 | "print()\n", |
321 | 405 | "print('Maximum relative error is in layers[' + str(index) + ']: ' + str(max_error))" |
322 | 406 | ] |
| 407 | + }, |
| 408 | + { |
| 409 | + "cell_type": "markdown", |
| 410 | + "metadata": {}, |
| 411 | + "source": [ |
| 412 | + "As we can see, the maximum error is low enough (given floating-point calculation accuracy and the complexity of our network) for Joey's results to be considered correct." |
| 413 | + ] |
323 | 414 | } |
324 | 415 | ], |
325 | 416 | "metadata": { |
|