{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "authorship_tag": "ABX9TyORZF8xy4X1yf4oRhRq8Rtm", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "source": [ "# **Notebook 10.5: Convolution for MNIST**\n", "\n", "This notebook builds a complete network that uses 2D convolution. It works with the MNIST dataset (figure 15.15a), the classic original benchmark for image classification. The network takes a 28x28 grayscale image and classifies it into one of 10 classes, one per digit.\n", "\n", "The code is adapted from https://nextjournal.com/gkoehler/pytorch-mnist\n", "\n", "Work through the cells below, running each cell in turn. In various places you will see the word \"TODO\". Follow the instructions at these places and make predictions about what is going to happen or write code to complete the functions.\n", "\n", "Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions.\n" ], "metadata": { "id": "t9vk9Elugvmi" } }, { "cell_type": "code", "source": [ "import torch\n", "import torchvision\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "import matplotlib.pyplot as plt\n", "import random" ], "metadata": { "id": "YrXWAH7sUWvU" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Run this once to load the train and test data straight into a dataloader class\n", "# that will provide the batches\n", "\n", "# (It may complain that some files are missing because the files seem to have been\n", "# reorganized on the underlying website, but it still seems to work). 
If everything is working\n", "# properly, then the whole notebook should run to the end without further problems\n", "# even before you make changes.\n", "batch_size_train = 64\n", "batch_size_test = 1000\n", "\n", "# TODO Change this directory to point towards an existing directory\n", "myDir = '/files/'\n", "\n", "# Pixel values are normalized with the MNIST mean (0.1307) and standard deviation (0.3081)\n", "train_loader = torch.utils.data.DataLoader(\n", "    torchvision.datasets.MNIST(myDir, train=True, download=True,\n", "        transform=torchvision.transforms.Compose([\n", "            torchvision.transforms.ToTensor(),\n", "            torchvision.transforms.Normalize(\n", "                (0.1307,), (0.3081,))\n", "        ])),\n", "    batch_size=batch_size_train, shuffle=True)\n", "\n", "test_loader = torch.utils.data.DataLoader(\n", "    torchvision.datasets.MNIST(myDir, train=False, download=True,\n", "        transform=torchvision.transforms.Compose([\n", "            torchvision.transforms.ToTensor(),\n", "            torchvision.transforms.Normalize(\n", "                (0.1307,), (0.3081,))\n", "        ])),\n", "    batch_size=batch_size_test, shuffle=True)" ], "metadata": { "id": "wScBGXXFVadm" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Let's draw some of the test data (the examples come from test_loader)\n", "examples = enumerate(test_loader)\n", "batch_idx, (example_data, example_targets) = next(examples)\n", "\n", "fig = plt.figure()\n", "for i in range(6):\n", "    plt.subplot(2,3,i+1)\n", "    plt.tight_layout()\n", "    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')\n", "    plt.title(\"Ground Truth: {}\".format(example_targets[i]))\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "plt.show()" ], "metadata": { "id": "8bKADvLHbiV5" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "Define the network. This is a more typical way to define a network than the sequential structure. We define a class for the network and create the layers (which hold the parameters) in the constructor. Then we define a method called `forward` that actually runs the network. It's easy to see how you might use residual connections in this format." 
], "metadata": { "id": "_sFvRDGrl4qe" } }, { "cell_type": "code", "source": [ "# TODO Change this class to implement\n", "# 1. A valid convolution with kernel size 5, 1 input channel and 10 output channels\n", "# 2. A max pooling operation over a 2x2 area\n", "# 3. A ReLU\n", "# 4. A valid convolution with kernel size 5, 10 input channels and 20 output channels\n", "# 5. A 2D Dropout layer\n", "# 6. A max pooling operation over a 2x2 area\n", "# 7. A ReLU\n", "# 8. A flattening operation\n", "# 9. A fully connected layer mapping from (whatever dimensions we are at; find out using .shape) to 50\n", "# 10. A ReLU\n", "# 11. A fully connected layer mapping from 50 to 10 dimensions\n", "# 12. A log-softmax function (train() and test() use F.nll_loss, which expects log-probabilities)\n", "\n", "# Replace this class which implements a minimal network (which still does okay)\n", "class Net(nn.Module):\n", "    def __init__(self):\n", "        super(Net, self).__init__()\n", "        # Valid convolution, 1 channel in, 2 channels out, stride 1, kernel size = 3\n", "        self.conv1 = nn.Conv2d(1, 2, kernel_size=3)\n", "        # Dropout for convolutions\n", "        self.drop = nn.Dropout2d()\n", "        # Fully connected layer: 2 channels x 13 x 13 = 338 inputs\n", "        # (28x28 -> 26x26 after the valid 3x3 convolution -> 13x13 after 2x2 max pooling)\n", "        self.fc1 = nn.Linear(338, 10)\n", "\n", "    def forward(self, x):\n", "        x = self.conv1(x)\n", "        x = self.drop(x)\n", "        x = F.max_pool2d(x,2)\n", "        x = F.relu(x)\n", "        x = x.flatten(1)\n", "        x = self.fc1(x)\n", "        # Log-softmax over the class dimension\n", "        x = F.log_softmax(x, dim=1)\n", "        return x" ], "metadata": { "id": "EQkvw2KOPVl7" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# He initialization of weights (this function only touches the fully connected layers)\n", "def weights_init(layer_in):\n", "    if isinstance(layer_in, nn.Linear):\n", "        nn.init.kaiming_uniform_(layer_in.weight)\n", "        layer_in.bias.data.fill_(0.0)" ], "metadata": { "id": "qWZtkCZcU_dg" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Create network\n", "model = Net()\n", "# Initialize model weights\n", "model.apply(weights_init)\n", "# Define optimizer\n", "optimizer = 
optim.SGD(model.parameters(), lr=0.01, momentum=0.5)" ], "metadata": { "id": "FslroPJJffrh" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Main training routine\n", "def train(epoch):\n", "    model.train()\n", "    # Loop over the training batches\n", "    for batch_idx, (data, target) in enumerate(train_loader):\n", "        optimizer.zero_grad()\n", "        output = model(data)\n", "        loss = F.nll_loss(output, target)\n", "        loss.backward()\n", "        optimizer.step()\n", "        # Print progress every 10 batches\n", "        if batch_idx % 10 == 0:\n", "            print('Train Epoch: {} [{}/{}]\\tLoss: {:.6f}'.format(\n", "                epoch, batch_idx * len(data), len(train_loader.dataset), loss.item()))" ], "metadata": { "id": "xKQd9PzkQ766" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Run on test data\n", "def test():\n", "    model.eval()\n", "    test_loss = 0\n", "    correct = 0\n", "    with torch.no_grad():\n", "        for data, target in test_loader:\n", "            output = model(data)\n", "            # Sum the per-example losses; we divide by the dataset size below\n", "            test_loss += F.nll_loss(output, target, reduction='sum').item()\n", "            # Predicted class is the index of the largest log-probability\n", "            pred = output.data.max(1, keepdim=True)[1]\n", "            correct += pred.eq(target.data.view_as(pred)).sum()\n", "    test_loss /= len(test_loader.dataset)\n", "    print('\\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\\n'.format(\n", "        test_loss, correct, len(test_loader.dataset),\n", "        100. 
* correct / len(test_loader.dataset)))" ], "metadata": { "id": "Byn-f7qWRLxX" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Get initial performance\n", "test()\n", "# Train for three epochs\n", "n_epochs = 3\n", "for epoch in range(1, n_epochs + 1):\n", "    train(epoch)\n", "    test()" ], "metadata": { "id": "YgLaex1pfhqz" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Run network on data we got before and show predictions\n", "output = model(example_data)\n", "\n", "fig = plt.figure()\n", "for i in range(10):\n", "    plt.subplot(5,5,i+1)\n", "    plt.tight_layout()\n", "    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')\n", "    plt.title(\"Prediction: {}\".format(\n", "        output.data.max(1, keepdim=True)[1][i].item()))\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "plt.show()" ], "metadata": { "id": "o7fRUAy9Se1B" }, "execution_count": null, "outputs": [] } ] }