Created using Colaboratory

This commit is contained in:
udlbook
2023-07-25 13:14:42 -04:00
parent 322b3da22b
commit c2bf535ad2


@@ -4,7 +4,7 @@
"metadata": { "metadata": {
"colab": { "colab": {
"provenance": [], "provenance": [],
"authorship_tag": "ABX9TyNO9SPfZa/RV0mp9XWuD3s5", "authorship_tag": "ABX9TyMF633wobzK487PJntQBvLE",
"include_colab_link": true "include_colab_link": true
}, },
"kernelspec": { "kernelspec": {
@@ -31,11 +31,11 @@
"source": [ "source": [
"# **Notebook 3.1 -- Shallow neural networks I**\n", "# **Notebook 3.1 -- Shallow neural networks I**\n",
"\n", "\n",
"The purpose of this notebook is to gain some familiarity with shallow neural networks. It works through an example similar to figure 3.3 and experiments with different activation functions. <br>\n", "The purpose of this notebook is to gain some familiarity with shallow neural networks with 2D inputs. It works through an example similar to figure 3.8 and experiments with different activation functions. <br>\n",
"\n", "\n",
"Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and write code to complete the functions. There are also questions interspersed in the text.\n", "Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and write code to complete the functions. There are also questions interspersed in the text.\n",
"\n", "\n",
"Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions.\n" "Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
], ],
"metadata": { "metadata": {
"id": "1Z6LB4Ybn1oN" "id": "1Z6LB4Ybn1oN"
@@ -279,166 +279,6 @@
"execution_count": null, "execution_count": null,
"outputs": [] "outputs": []
}, },
{
"cell_type": "markdown",
"source": [
"# Different activation functions\n",
"\n",
"The ReLU isn't the only kind of activation function. For a long time, people used sigmoid functions. A logistic sigmoid function is defined by the equation\n",
"\n",
"\\begin{equation}\n",
"f[z] = \\frac{1}{1+\\exp[-10 z]}\n",
"\\end{equation}\n",
"\n",
"(Note that the factor of 10 is not standard -- but it allows us to plot on the same axes as the ReLU examples.)"
],
"metadata": {
"id": "1NTT5GTbJSqK"
}
},
{
"cell_type": "code",
"source": [
"# Define the sigmoid function\n",
"def sigmoid(preactivation):\n",
" # TODO write code to implement the sigmoid function and compute the activation at the\n",
" # hidden unit from the preactivation. Use the np.exp() function.\n",
" activation = np.zeros_like(preactivation);\n",
"\n",
" return activation"
],
"metadata": {
"id": "FEzzQeVoZdV_"
},
"execution_count": null,
"outputs": []
},
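One possible completion of the sigmoid TODO above -- a sketch, not the official solution. It assumes the non-standard factor of 10 from the equation in the previous markdown cell:

```python
import numpy as np

def sigmoid(preactivation):
    # Logistic sigmoid with the notebook's factor of 10, so the curve
    # varies visibly over the same [-1, 1] axis range as the ReLU plots
    activation = 1.0 / (1.0 + np.exp(-10.0 * preactivation))
    return activation
```

With this scaling, sigmoid(0) = 0.5 and the output is already very close to 0 or 1 by the edges of the [-1, 1] plotting range.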
{
"cell_type": "code",
"source": [
"# Make an array of inputs\n",
"z = np.arange(-1,1,0.01)\n",
"sig_z = sigmoid(z)\n",
"\n",
"# Plot the sigmoid function\n",
"fig, ax = plt.subplots()\n",
"ax.plot(z,sig_z,'r-')\n",
"ax.set_xlim([-1,1]);ax.set_ylim([0,1])\n",
"ax.set_xlabel('z'); ax.set_ylabel('sig[z]')\n",
"plt.show()"
],
"metadata": {
"id": "dIn42wDlKqsv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Let's see what happens when we use this activation function in a neural network"
],
"metadata": {
"id": "uwQHGdC5KpH7"
}
},
{
"cell_type": "code",
"source": [
"theta_10 = 0.3 ; theta_11 = -1.0\n",
"theta_20 = -1.0 ; theta_21 = 2.0\n",
"theta_30 = -0.5 ; theta_31 = 0.65\n",
"phi_0 = 0.3; phi_1 = 0.5; phi_2 = -1.0; phi_3 = 0.9\n",
"\n",
"# Define a range of input values\n",
"x = np.arange(0,1,0.01)\n",
"\n",
"# We run the neural network for each of these input values\n",
"y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3 = \\\n",
" shallow_1_1_3(x, sigmoid, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
"# And then plot it\n",
"plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
],
"metadata": {
"id": "5W9m9MLKLddi"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"You have probably noticed that this gives nice smooth curves. So why don't we always use this activation function? It's not obvious right now, but we will get to it when we learn to fit models."
],
"metadata": {
"id": "0c4S-XfnSfDx"
}
},
{
"cell_type": "markdown",
"source": [
"# Linear activation functions\n",
"\n",
"However, neural networks don't work if the activation function is linear. For example, consider what would happen if the activation function were:\n",
"\n",
"\\begin{equation}\n",
"\\mbox{lin}[z] = a + bz\n",
"\\end{equation}"
],
"metadata": {
"id": "IA_v_-eLRqek"
}
},
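This claim can be checked directly: with lin[z] = a + bz, the whole network collapses to a single affine function of the input. A minimal sketch, reusing the same parameter values as the cells below:

```python
import numpy as np

# Hidden-unit and output parameters (same illustrative values as the notebook)
theta_10, theta_11 = 0.3, -1.0
theta_20, theta_21 = -1.0, 2.0
theta_30, theta_31 = -0.5, 0.65
phi_0, phi_1, phi_2, phi_3 = 0.3, 0.5, -1.0, 0.9
a, b = 0.5, -0.4

x = np.arange(0, 1, 0.01)

# Network output with linear activation lin[z] = a + b*z
h1 = a + b * (theta_10 + theta_11 * x)
h2 = a + b * (theta_20 + theta_21 * x)
h3 = a + b * (theta_30 + theta_31 * x)
y = phi_0 + phi_1 * h1 + phi_2 * h2 + phi_3 * h3

# Expanding the algebra shows y is just (intercept) + (slope) * x:
intercept = phi_0 + a * (phi_1 + phi_2 + phi_3) \
    + b * (phi_1 * theta_10 + phi_2 * theta_20 + phi_3 * theta_30)
slope = b * (phi_1 * theta_11 + phi_2 * theta_21 + phi_3 * theta_31)
assert np.allclose(y, intercept + slope * x)  # the network is a straight line
```

No matter how many hidden units (or layers) we add, composing affine functions yields another affine function, so the model can only ever represent straight lines.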
{
"cell_type": "code",
"source": [
"# Define the linear activation function\n",
"def lin(preactivation):\n",
" a = 0\n",
" b = 1\n",
" # Compute linear function\n",
" activation = a + b * preactivation\n",
" # Return\n",
" return activation"
],
"metadata": {
"id": "fTHJRv0KLjMD"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# TODO\n",
"# 1. The linear activation function above just returns the input: (0+1*z) = z\n",
"# Before running the code, make a prediction about what the ten panels of the plot will look like.\n",
"# Now run the code below to see if you were right. What family of functions can this represent?\n",
"\n",
"# 2. What happens if you change the parameters (a,b) to different values?\n",
"# Try a=0.5, b=-0.4 (don't forget to run the cell above again to update the function)\n",
"\n",
"\n",
"theta_10 = 0.3 ; theta_11 = -1.0\n",
"theta_20 = -1.0 ; theta_21 = 2.0\n",
"theta_30 = -0.5 ; theta_31 = 0.65\n",
"phi_0 = 0.3; phi_1 = 0.5; phi_2 = -1.0; phi_3 = 0.9\n",
"\n",
"# Define a range of input values\n",
"x = np.arange(0,1,0.01)\n",
"\n",
"# We run the neural network for each of these input values\n",
"y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3 = \\\n",
" shallow_1_1_3(x, lin, phi_0,phi_1,phi_2,phi_3, theta_10, theta_11, theta_20, theta_21, theta_30, theta_31)\n",
"# And then plot it\n",
"plot_neural(x, y, pre_1, pre_2, pre_3, act_1, act_2, act_3, w_act_1, w_act_2, w_act_3, plot_all=True)"
],
"metadata": {
"id": "SauRG8r7TkvP"
},
"execution_count": null,
"outputs": []
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"source": [ "source": [