Add files via upload

2024-01-02 12:06:41 -05:00
parent d2f885db37
commit 707f93daae
2 changed files with 230 additions and 215 deletions
@@ -1,33 +1,22 @@
 {
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyOmxhh3ymYWX+1HdZ91I6zU",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
-        "id": "view-in-github",
+        "colab_type": "text",
-        "colab_type": "text"
+        "id": "view-in-github"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/udlbook/udlbook/blob/main/Notebooks/Chap03/3_4_Activation_Functions.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "Mn0F56yY8ohX"
      },
      "source": [
        "# **Notebook 3.4 -- Activation functions**\n",
        "\n",
@@ -36,10 +25,7 @@
        "Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and write code to complete the functions. There are also questions interspersed in the text.\n",
        "\n",
        "Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
-      ],
+      ]
      "metadata": {
        "id": "Mn0F56yY8ohX"
      }
    },
    {
      "cell_type": "code",
@@ -57,6 +43,11 @@
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "AeHzflFt9Tgn"
      },
      "outputs": [],
      "source": [
        "# Plot the shallow neural network.  We'll assume input is in range [0,1] and output [-1,1]\n",
        "# If the plot_all flag is set to true, then we'll plot all the intermediate stages as in Figure 3.3\n",
@@ -94,15 +85,15 @@
        "    for i in range(len(x_data)):\n",
        "      ax.plot(x_data[i], y_data[i],)\n",
        "  plt.show()"
-      ],
+      ]
      "metadata": {
        "id": "AeHzflFt9Tgn"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7qeIUrh19AkH"
      },
      "outputs": [],
      "source": [
        "# Define a shallow neural network with one input, one output, and three hidden units\n",
        "def shallow_1_1_3(x, activation_fn, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31):\n",
@@ -123,38 +114,39 @@
        "\n",
        "  # Return everything we have calculated\n",
        "  return y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3"
-      ],
+      ]
      "metadata": {
        "id": "7qeIUrh19AkH"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cwTp__Fk9YUx"
      },
      "outputs": [],
      "source": [
        "# Define the Rectified Linear Unit (ReLU) function\n",
        "def ReLU(preactivation):\n",
        "  activation = preactivation.clip(0.0)\n",
        "  return activation"
-      ],
+      ]
      "metadata": {
        "id": "cwTp__Fk9YUx"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "source": [
        "First, let's run the network with a ReLU function"
      ],
      "metadata": {
        "id": "INQkRzyn9kVC"
-      }
+      },
      "source": [
        "First, let's run the network with a ReLU function"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "jT9QuKou9i0_"
      },
      "outputs": [],
      "source": [
        "# Now let's define some parameters and run the neural network\n",
        "theta_10 =  0.3 ; theta_11 = -1.0\n",
@@ -170,15 +162,14 @@
        "    shallow_1_1_3(x, ReLU, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
        "# And then plot it\n",
        "plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
-      ],
+      ]
      "metadata": {
        "id": "jT9QuKou9i0_"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "-I8N7r1o9HYf"
      },
      "source": [
        "# Sigmoid activation function\n",
        "\n",
@@ -189,13 +180,15 @@
        "\\end{equation}\n",
        "\n",
        "(Note that the factor of 10 is not standard -- but it allows us to plot on the same axes as the ReLU examples)"
-      ],
+      ]
      "metadata": {
        "id": "-I8N7r1o9HYf"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hgkioNyr975Y"
      },
      "outputs": [],
      "source": [
        "# Define the sigmoid function\n",
        "def sigmoid(preactivation):\n",
@@ -204,15 +197,15 @@
        "  activation = np.zeros_like(preactivation);\n",
        "\n",
        "  return activation"
-      ],
+      ]
      "metadata": {
        "id": "hgkioNyr975Y"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "94HIXKJH97ve"
      },
      "outputs": [],
      "source": [
        "# Make an array of inputs\n",
        "z = np.arange(-1,1,0.01)\n",
@@ -224,24 +217,25 @@
        "ax.set_xlim([-1,1]);ax.set_ylim([0,1])\n",
        "ax.set_xlabel('z'); ax.set_ylabel('sig[z]')\n",
        "plt.show()"
-      ],
+      ]
      "metadata": {
        "id": "94HIXKJH97ve"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "source": [
        "Let's see what happens when we use this activation function in a neural network"
      ],
      "metadata": {
        "id": "p3zQNXhj-J-o"
-      }
+      },
      "source": [
        "Let's see what happens when we use this activation function in a neural network"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "C1dASr9L-GNt"
      },
      "outputs": [],
      "source": [
        "theta_10 =  0.3 ; theta_11 = -1.0\n",
        "theta_20 = -1.0  ; theta_21 = 2.0\n",
@@ -256,39 +250,41 @@
        "    shallow_1_1_3(x, sigmoid, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
        "# And then plot it\n",
        "plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
-      ],
+      ]
      "metadata": {
        "id": "C1dASr9L-GNt"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "source": [
        "You probably notice that this gives nice smooth curves.  So why don't we use this?  Aha... it's not obvious right now, but we will get to it when we learn to fit models."
      ],
      "metadata": {
        "id": "Uuam_DewA9fH"
-      }
+      },
      "source": [
        "You probably notice that this gives nice smooth curves.  So why don't we use this?  Aha... it's not obvious right now, but we will get to it when we learn to fit models."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "C9WKkcMUABze"
      },
      "source": [
        "# Heaviside activation function\n",
        "\n",
        "The Heaviside function is defined as:\n",
        "\n",
        "\\begin{equation}\n",
-        "\\mbox{heaviside}[z] = \\begin{cases} 0 & \\quad z <0 \\\\ 1 & \\quad z\\geq 0\\end{cases}\n",
+        "\\text{heaviside}[z] = \\begin{cases} 0 & \\quad z <0 \\\\ 1 & \\quad z\\geq 0\\end{cases}\n",
        "\\end{equation}"
-      ],
+      ]
      "metadata": {
        "id": "C9WKkcMUABze"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "-1qFkdOL-NPc"
      },
      "outputs": [],
      "source": [
        "# Define the heaviside function\n",
        "def heaviside(preactivation):\n",
@@ -299,15 +295,15 @@
        "\n",
        "\n",
        "  return activation"
-      ],
+      ]
      "metadata": {
        "id": "-1qFkdOL-NPc"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mSPyp7iA-44H"
      },
      "outputs": [],
      "source": [
        "# Make an array of inputs\n",
        "z = np.arange(-1,1,0.01)\n",
@@ -319,15 +315,15 @@
        "ax.set_xlim([-1,1]);ax.set_ylim([-2,2])\n",
        "ax.set_xlabel('z'); ax.set_ylabel('heaviside[z]')\n",
        "plt.show()"
-      ],
+      ]
      "metadata": {
        "id": "mSPyp7iA-44H"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "t99K2lSl--Mq"
      },
      "outputs": [],
      "source": [
        "theta_10 =  0.3 ; theta_11 = -1.0\n",
        "theta_20 = -1.0  ; theta_21 = 2.0\n",
@@ -342,39 +338,41 @@
        "    shallow_1_1_3(x, heaviside, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
        "# And then plot it\n",
        "plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
-      ],
+      ]
      "metadata": {
        "id": "t99K2lSl--Mq"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "source": [
        "This can approximate any function, but the output is discontinuous, and there are also reasons not to use it that we will discover when we learn more about model fitting."
      ],
      "metadata": {
        "id": "T65MRtM-BCQA"
-      }
+      },
      "source": [
        "This can approximate any function, but the output is discontinuous, and there are also reasons not to use it that we will discover when we learn more about model fitting."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "RkB-XZMLBTaR"
      },
      "source": [
        "# Linear activation functions\n",
        "\n",
        "Neural networks don't work if the activation function is linear.  For example, consider what would happen if the activation function was:\n",
        "\n",
        "\\begin{equation}\n",
-        "\\mbox{lin}[z] = a + bz\n",
+        "\\text{lin}[z] = a + bz\n",
        "\\end{equation}"
-      ],
+      ]
      "metadata": {
        "id": "RkB-XZMLBTaR"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Q59v3saj_jq1"
      },
      "outputs": [],
      "source": [
        "# Define the linear activation function\n",
        "def lin(preactivation):\n",
@@ -384,15 +382,15 @@
        "  activation = a+b * preactivation\n",
        "  # Return\n",
        "  return activation"
-      ],
+      ]
      "metadata": {
        "id": "Q59v3saj_jq1"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "IwodsBr0BkDn"
      },
      "outputs": [],
      "source": [
        "# TODO\n",
        "# 1. The linear activation function above just returns the input: (0+1*z) = z\n",
@@ -415,12 +413,23 @@
        "    shallow_1_1_3(x, lin, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
        "# And then plot it\n",
        "plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
      ]
    }
  ],
  "metadata": {
-        "id": "IwodsBr0BkDn"
+    "colab": {
      "authorship_tag": "ABX9TyOmxhh3ymYWX+1HdZ91I6zU",
      "include_colab_link": true,
      "provenance": []
    },
-      "execution_count": null,
+    "kernelspec": {
-      "outputs": []
+      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
-  ]
+  },
  "nbformat": 4,
  "nbformat_minor": 0
 }
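
Note on the activation cells touched by this diff: the sketch below is one possible way to fill in the sigmoid and heaviside placeholders and to check the claim that a linear activation adds nothing. It is not the notebook's reference solution. The sigmoid form 1/(1+exp(-10z)) is an assumption based on the "factor of 10" remark, since the equation itself is elided in the hunk. `shallow_net` is a hypothetical stand-in for `shallow_1_1_3`, whose body is also elided, and every parameter value except theta_10, theta_11, theta_20 and theta_21 is illustrative.

```python
import numpy as np

def sigmoid(preactivation, scale=10.0):
    # Assumed form: 1 / (1 + exp(-10 z)); scale=10 is the non-standard factor
    z = np.asarray(preactivation, dtype=float)
    return 1.0 / (1.0 + np.exp(-scale * z))

def heaviside(preactivation):
    # 0 for z < 0, 1 for z >= 0, as in the markdown definition
    z = np.asarray(preactivation, dtype=float)
    return (z >= 0).astype(float)

def lin(preactivation, a=0.0, b=1.0):
    # Linear "activation" a + b*z; a and b are exposed as arguments for illustration
    return a + b * np.asarray(preactivation, dtype=float)

def shallow_net(x, activation_fn, phi, theta):
    # Hypothetical stand-in for shallow_1_1_3 (assumed form):
    # y = phi_0 + sum_i phi_i * activation_fn(theta_i0 + theta_i1 * x)
    y = np.full_like(x, phi[0], dtype=float)
    for phi_i, (theta_i0, theta_i1) in zip(phi[1:], theta):
        y += phi_i * activation_fn(theta_i0 + theta_i1 * x)
    return y

x = np.arange(0, 1, 0.01)
phi = (-0.3, 2.0, -1.0, 7.0)                     # illustrative values
theta = ((0.3, -1.0), (-1.0, 2.0), (-0.5, 0.65)) # third pair is illustrative

# With lin (a=0, b=1) the whole network collapses to a single straight line
y_lin = shallow_net(x, lin, phi, theta)
slope = sum(p * t1 for p, (_, t1) in zip(phi[1:], theta))
intercept = phi[0] + sum(p * t0 for p, (t0, _) in zip(phi[1:], theta))
print(np.allclose(y_lin, intercept + slope * x))
```

The final check compares the network output with the line obtained by summing the per-unit slopes and offsets by hand. The same collapse should happen for any a and b, which is why a linear activation gives no more expressive power than a single linear function of x.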