udlbook/Notebooks/Chap03/3_3_Shallow_Network_Regions.ipynb

{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyNioITtfAcfxEfM3UOfQyb9",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/udlbook/udlbook/blob/main/Notebooks/Chap03/3_3_Shallow_Network_Regions.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# **Notebook 3.3 -- Shallow network regions**\n",
        "\n",
        "The purpose of this notebook is to compute the maximum possible number of linear regions as seen in figure 3.9 of the book.\n",
        "\n",
        "Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and write code to complete the functions. There are also questions interspersed in the text.\n",
        "\n",
        "Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
      ],
      "metadata": {
        "id": "DCTC8fQ6cp-n"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Imports math library\n",
        "import numpy as np\n",
        "# Imports plotting library\n",
        "import matplotlib.pyplot as plt\n",
        "# Imports math libray\n",
        "import math"
      ],
      "metadata": {
        "id": "W3C1ZA1gcpq_"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "The number of regions $N$ created by a shallow neural network with $D_i$ inputs and $D$ hidden units is given by Zaslavsky's formula:\n",
        "\n",
        "\\begin{equation}N = \\sum_{j=0}^{D_{i}}\\binom{D}{j}=\\sum_{j=0}^{D_{i}} \\frac{D!}{(D-j)!j!} \\end{equation} <br>\n",
        "\n"
      ],
      "metadata": {
        "id": "TbfanfXBe84L"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4UQ2n0RWcgOb"
      },
      "outputs": [],
      "source": [
        "def number_regions(Di, D):\n",
        "  # TODO -- implement Zaslavsky's formula\n",
        "  # You can use math.comb() https://www.w3schools.com/python/ref_math_comb.asp\n",
        "  # Replace this code\n",
        "  N = 1;\n",
        "\n",
        "  return N"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Calculate the number of regions for 2D input (Di=2) and 3 hidden units (D=3) as in figure 3.8j\n",
        "N = number_regions(2, 3)\n",
        "print(f\"Di=2, D=3, Number of regions = {int(N)}, True value = 7\")"
      ],
      "metadata": {
        "id": "AqSUfuJDigN9"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Calculate the number of regions for 10D input (Di=2) and 50 hidden units (D=50)\n",
        "N = number_regions(10, 50)\n",
        "print(f\"Di=10, D=50, Number of regions = {int(N)}, True value = 13432735556\")"
      ],
      "metadata": {
        "id": "krNKPV9gjCu-"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "This works but there is a complication. If the number of hidden units $D$ is fewer than the number of input dimensions $D_i$ , the formula will fail.  When this is the case, there are just $2^D$ regions (see figure 3.10 to understand why).\n",
        "\n",
        "Let's demonstrate this:"
      ],
      "metadata": {
        "id": "rk1a2LqGkO9u"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Show that calculation fails when $D_i < D$\n",
        "try:\n",
        "  N = number_regions(10, 8)\n",
        "  print(f\"Di=10, D=8, Number of regions = {int(N)}, True value = 256\")\n",
        "except Exception as error:\n",
        "    print(\"An exception occurred:\", error)\n"
      ],
      "metadata": {
        "id": "uq5IeAZTkIMg"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Let's do the calculation properly when D<Di (see figure 3.10 from the book)\n",
        "D = 8; Di = 10\n",
        "N = np.power(2,D)\n",
        "# We can equivalently do this by calling number_regions with the D twice\n",
        "# Think about why this works\n",
        "N2 = number_regions (D,D)\n",
        "print(f\"Di=10, D=8, Number of regions = {int(N)}, Number of regions = {int(N2)}, True value = 256\")"
      ],
      "metadata": {
        "id": "Ig8Kg_ADjoQd"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Now let's plot the graph from figure 3.9a\n",
        "dims = np.array([1,5,10,50,100])\n",
        "regions = np.zeros((dims.shape[0], 1000))\n",
        "for c_dim in range(dims.shape[0]):\n",
        "    D_i = dims[c_dim]\n",
        "    print (f\"Counting regions for {D_i} input dimensions\")\n",
        "    for D in range(1000):\n",
        "        regions[c_dim, D] = number_regions(np.min([D_i,D]), D)\n",
        "\n",
        "fig, ax = plt.subplots()\n",
        "ax.semilogy(regions[0,:],'k-')\n",
        "ax.semilogy(regions[1,:],'b-')\n",
        "ax.semilogy(regions[2,:],'m-')\n",
        "ax.semilogy(regions[3,:],'c-')\n",
        "ax.semilogy(regions[4,:],'y-')\n",
        "ax.legend(['$D_i$=1', '$D_i$=5', '$D_i$=10', '$D_i$=50', '$D_i$=100'])\n",
        "ax.set_xlabel(\"Number of hidden units, D\")\n",
        "ax.set_ylabel(\"Number of regions, N\")\n",
        "plt.xlim([0,1000])\n",
        "plt.ylim([1e1,1e150])\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "5XnEOp0Bj_QK"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Now let's compute and plot the number of regions as a function of the number of parameters as in figure 3.9b\n",
        "# First let's write a function that computes the number of parameters as a function of the input dimension and number of hidden units (assuming just one output)\n",
        "\n",
        "def number_parameters(D_i, D):\n",
        "  # TODO -- replace this code with the proper calculation\n",
        "  N = 1\n",
        "\n",
        "  return N ;"
      ],
      "metadata": {
        "id": "Pav1OsCnpm6P"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Now let's test the code\n",
        "N = number_parameters(10, 8)\n",
        "print(f\"Di=10, D=8, Number of parameters = {int(N)}, True value = 97\")"
      ],
      "metadata": {
        "id": "VbhDmZ1gwkQj"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Now let's plot the graph from figure 3.9a (takes ~1min)\n",
        "dims = np.array([1,5,10,50,100])\n",
        "regions = np.zeros((dims.shape[0], 200))\n",
        "params = np.zeros((dims.shape[0], 200))\n",
        "\n",
        "# We'll compute the five lines separately this time to make it faster\n",
        "for c_dim in range(dims.shape[0]):\n",
        "    D_i = dims[c_dim]\n",
        "    print (f\"Counting regions for {D_i} input dimensions\")\n",
        "    for c_hidden in range(1, 200):\n",
        "        # Iterate over different ranges of number hidden variables for different input sizes\n",
        "        D = int(c_hidden * 500 / D_i)\n",
        "        params[c_dim, c_hidden] =  D_i * D +D + D +1\n",
        "        regions[c_dim, c_hidden] = number_regions(np.min([D_i,D]), D)\n",
        "\n",
        "fig, ax = plt.subplots()\n",
        "ax.semilogy(params[0,:], regions[0,:],'k-')\n",
        "ax.semilogy(params[1,:], regions[1,:],'b-')\n",
        "ax.semilogy(params[2,:], regions[2,:],'m-')\n",
        "ax.semilogy(params[3,:], regions[3,:],'c-')\n",
        "ax.semilogy(params[4,:], regions[4,:],'y-')\n",
        "ax.legend(['$D_i$=1', '$D_i$=5', '$D_i$=10', '$D_i$=50', '$D_i$=100'])\n",
        "ax.set_xlabel(\"Number of parameters, D\")\n",
        "ax.set_ylabel(\"Number of regions, N\")\n",
        "plt.xlim([0,100000])\n",
        "plt.ylim([1e1,1e150])\n",
        "plt.show()\n"
      ],
      "metadata": {
        "id": "AH4nA50Au8-a"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}