{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "s5zzKSOusPOB"
},
"source": [
"\n",
"# **Notebook 1.1 -- Background Mathematics**\n",
"\n",
"The purpose of this Python notebook is to make sure you can use CoLab and to familiarize yourself with some of the background mathematical concepts that you are going to need to understand deep learning.
It's not meant to be difficult and it may be that you know some or all of this information already.
Math is *NOT* a spectator sport. You won't learn it by just listening to lectures or reading books. It really helps to interact with it and explore yourself.
Work through the cells below, running each cell in turn. In various places you will see the words **\"TODO\"**. Follow the instructions at these places and write code to complete the functions. There are also questions interspersed in the text.\n",
"\n",
"Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "aUAjBbqzivMY"
},
"outputs": [],
"source": [
"# Imports math library\n",
"import numpy as np\n",
"# Imports plotting library\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WV2Dl6owme2d"
},
"source": [
"**Linear functions**
We will be using the term *linear equation* to mean a weighted sum of inputs plus an offset. If there is just one input $x$, then this is a straight line:\n",
"\n",
"\\begin{equation}y=\\beta+\\omega x,\\end{equation}\n",
"\n",
"where $\\beta$ is the y-intercept of the linear and $\\omega$ is the slope of the line. When there are two inputs $x_{1}$ and $x_{2}$, then this becomes:\n",
"\n",
"\\begin{equation}y=\\beta+\\omega_1 x_1 + \\omega_2 x_2.\\end{equation}\n",
"\n",
"Any other functions are by definition **non-linear**.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WeFK4AvTotd8"
},
"outputs": [],
"source": [
"# Define a linear function with just one input, x\n",
"def linear_function_1D(x,beta,omega):\n",
" # TODO -- replace the code line below with formula for 1D linear equation\n",
" y = x\n",
"\n",
" return y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "eimhJ8_jpmEp"
},
"outputs": [],
"source": [
"# Plot the 1D linear function\n",
"\n",
"# Define an array of x values from 0 to 10 with increments of 0.01\n",
"# https://numpy.org/doc/stable/reference/generated/numpy.arange.html\n",
"x = np.arange(0.0,10.0, 0.01)\n",
"# Compute y using the function you filled in above\n",
"beta = 0.0; omega = 1.0\n",
"\n",
"y = linear_function_1D(x,beta,omega)\n",
"\n",
"# Plot this function\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x,y,'r-')\n",
"ax.set_ylim([0,10]);ax.set_xlim([0,10])\n",
"ax.set_xlabel('x'); ax.set_ylabel('y')\n",
"plt.show()\n",
"\n",
"# TODO -- experiment with changing the values of beta and omega\n",
"# to understand what they do. Try to make a line\n",
"# that crosses the y-axis at y=10 and the x-axis at x=5"
]
},
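{
"cell_type": "markdown",
"metadata": {},
"source": [
"A side note on the grid used in the plotting cell above, offered as a minimal sketch rather than as part of the original exercise: `np.arange(start, stop, step)` excludes the `stop` value, so that array ends just short of 10. The tiny arrays below (`coarse`, `coarse_closed`) are made-up illustrative names and values. `np.linspace` is one alternative when the endpoint should be included."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# np.arange excludes the stop value: four points, the last one is 0.75\n",
"coarse = np.arange(0.0, 1.0, 0.25)\n",
"print(coarse)\n",
"\n",
"# np.linspace includes both endpoints by default: five points, the last one is 1.0\n",
"coarse_closed = np.linspace(0.0, 1.0, 5)\n",
"print(coarse_closed)"
]
},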
{
"cell_type": "markdown",
"metadata": {
"id": "AedfvD9dxShZ"
},
"source": [
"Now let's investigate a 2D linear function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "57Gvkk-Ir_7b"
},
"outputs": [],
"source": [
"# Code to draw 2D function -- read it so you know what is going on, but you don't have to change it\n",
"def draw_2D_function(x1_mesh, x2_mesh, y):\n",
" fig, ax = plt.subplots()\n",
" fig.set_size_inches(7,7)\n",
" pos = ax.contourf(x1_mesh, x2_mesh, y, levels=256 ,cmap = 'hot', vmin=-10,vmax=10.0)\n",
" fig.colorbar(pos, ax=ax)\n",
" ax.set_xlabel('x1');ax.set_ylabel('x2')\n",
" levels = np.arange(-10,10,1.0)\n",
" ax.contour(x1_mesh, x2_mesh, y, levels, cmap='winter')\n",
" plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YxeNhrXMzkZR"
},
"outputs": [],
"source": [
"# Define a linear function with two inputs, x1 and x2\n",
"def linear_function_2D(x1,x2,beta,omega1,omega2):\n",
" # TODO -- replace the code line below with formula for 2D linear equation\n",
" y = x1\n",
"\n",
" return y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rn_UBRDBysmR"
},
"outputs": [],
"source": [
"# Plot the 2D function\n",
"\n",
"# Make 2D array of x and y points\n",
"x1 = np.arange(0.0, 10.0, 0.1)\n",
"x2 = np.arange(0.0, 10.0, 0.1)\n",
"x1,x2 = np.meshgrid(x1,x2) # https://www.geeksforgeeks.org/numpy-meshgrid-function/\n",
"\n",
"# Compute the 2D function for given values of omega1, omega2\n",
"beta = 0.0; omega1 = 1.0; omega2 = -0.5\n",
"y = linear_function_2D(x1,x2,beta, omega1, omega2)\n",
"\n",
"# Draw the function.\n",
"# Color represents y value (brighter = higher value)\n",
"# Black = -10 or less, White = +10 or more\n",
"# 0 = mid orange\n",
"# Lines are contours where value is equal\n",
"draw_2D_function(x1,x2,y)\n",
"\n",
"# TODO\n",
"# Predict what this plot will look like if you set omega_1 to zero\n",
"# Change the code and see if you are right.\n",
"\n",
"# TODO\n",
"# Predict what this plot will look like if you set omega_2 to zero\n",
"# Change the code and see if you are right.\n",
"\n",
"# TODO\n",
"# Predict what this plot will look like if you set beta to -5\n",
"# Change the code and see if you are correct\n"
]
},
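{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plotting cell above relies on `np.meshgrid` to turn two 1D arrays into two 2D arrays that together list every grid point. The sketch below uses tiny made-up arrays (`a` and `b` are illustrative names, not part of the notebook) so the shapes and contents are easy to inspect. Any elementwise expression in the two mesh arrays then gives one value per grid point, which is the layout that `draw_2D_function` expects."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"a = np.array([0.0, 1.0, 2.0])   # hypothetical x1 values (3 points)\n",
"b = np.array([10.0, 20.0])      # hypothetical x2 values (2 points)\n",
"a_mesh, b_mesh = np.meshgrid(a, b)\n",
"\n",
"# With the default indexing, both outputs have shape (len(b), len(a))\n",
"print(a_mesh.shape, b_mesh.shape)\n",
"# Each row of a_mesh is a copy of a; each column of b_mesh is a copy of b\n",
"print(a_mesh)\n",
"print(b_mesh)"
]
},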
{
"cell_type": "markdown",
"metadata": {
"id": "i8tLwpls476R"
},
"source": [
"Often we will want to compute many linear functions at the same time. For example, we might have three inputs, $x_1$, $x_2$, and $x_3$ and want to compute two linear functions giving $y_1$ and $y_2$. Of course, we could do this by just running each equation separately,
\n",
"\n",
"\\begin{align}y_1 &=& \\beta_1 + \\omega_{11} x_1 + \\omega_{12} x_2 + \\omega_{13} x_3\\\\\n",
"y_2 &=& \\beta_2 + \\omega_{21} x_1 + \\omega_{22} x_2 + \\omega_{23} x_3.\n",
"\\end{align}\n",
"\n",
"However, we can write it more compactly with vectors and matrices:\n",
"\n",
"\\begin{equation}\n",
"\\begin{bmatrix} y_1\\\\ y_2 \\end{bmatrix} = \\begin{bmatrix}\\beta_{1}\\\\\\beta_{2}\\end{bmatrix}+ \\begin{bmatrix}\\omega_{11}&\\omega_{12}&\\omega_{13}\\\\\\omega_{21}&\\omega_{22}&\\omega_{23}\\end{bmatrix}\\begin{bmatrix}x_{1}\\\\x_{2}\\\\x_{3}\\end{bmatrix},\n",
"\\end{equation}\n",
"or\n",
"\n",
"\\begin{equation}\n",
"\\mathbf{y} = \\boldsymbol\\beta +\\boldsymbol\\Omega\\mathbf{x}.\n",
"\\end{equation}\n",
"\n",
"for short. Here, lowercase bold symbols are used for vectors. Upper case bold symbols are used for matrices.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "MjHXMavh9IUz"
},
"outputs": [],
"source": [
"# Define a linear function with three inputs, x1, x2, and x_3\n",
"def linear_function_3D(x1,x2,x3,beta,omega1,omega2,omega3):\n",
" # TODO -- replace the code below with formula for a single 3D linear equation\n",
" y = x1\n",
"\n",
" return y"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fGzVJQ6N-mHJ"
},
"source": [
"Let's compute two linear equations, using both the individual equations and the vector / matrix form and check they give the same result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Swd_bFIE9p2n"
},
"outputs": [],
"source": [
"# Define the parameters\n",
"beta1 = 0.5; beta2 = 0.2\n",
"omega11 = -1.0 ; omega12 = 0.4; omega13 = -0.3\n",
"omega21 = 0.1 ; omega22 = 0.1; omega23 = 1.2\n",
"\n",
"# Define the inputs\n",
"x1 = 4 ; x2 =-1; x3 = 2\n",
"\n",
"# Compute using the individual equations\n",
"y1 = linear_function_3D(x1,x2,x3,beta1,omega11,omega12,omega13)\n",
"y2 = linear_function_3D(x1,x2,x3,beta2,omega21,omega22,omega23)\n",
"print(\"Individual equations\")\n",
"print('y1 = %3.3f\\ny2 = %3.3f'%((y1,y2)))\n",
"\n",
"# Define vectors and matrices\n",
"beta_vec = np.array([[beta1],[beta2]])\n",
"omega_mat = np.array([[omega11,omega12,omega13],[omega21,omega22,omega23]])\n",
"x_vec = np.array([[x1], [x2], [x3]])\n",
"\n",
"# Compute with vector/matrix form\n",
"y_vec = beta_vec+np.matmul(omega_mat, x_vec)\n",
"print(\"Matrix/vector form\")\n",
"print('y1= %3.3f\\ny2 = %3.3f'%((y_vec[0][0],y_vec[1][0])))\n"
]
},
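{
"cell_type": "markdown",
"metadata": {},
"source": [
"One reason the matrix form is convenient is that it extends to several input vectors at once. The following is a minimal sketch under stated assumptions: the parameters are copied from the cell above, the first column of `x_batch` is the input used there, and the other two columns are made-up values. The names `x_batch` and `y_batch` are illustrative and do not appear elsewhere in the notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Parameters copied from the cell above\n",
"beta_vec = np.array([[0.5], [0.2]])\n",
"omega_mat = np.array([[-1.0, 0.4, -0.3],\n",
"                      [0.1, 0.1, 1.2]])\n",
"\n",
"# Three input vectors stacked as columns; the first is the input used above,\n",
"# the other two are made-up values for illustration\n",
"x_batch = np.array([[4.0, 1.0, -2.0],\n",
"                    [-1.0, 2.0, 0.5],\n",
"                    [2.0, 3.0, 1.0]])\n",
"\n",
"# beta_vec has shape (2,1), so it broadcasts across the columns of the (2,3) product\n",
"y_batch = beta_vec + np.matmul(omega_mat, x_batch)\n",
"\n",
"# Column j of y_batch holds (y1, y2) for column j of x_batch\n",
"print(y_batch)"
]
},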
{
"cell_type": "markdown",
"metadata": {
"id": "3LGRoTMLU8ZU"
},
"source": [
"# Questions\n",
"\n",
"1. A single linear equation with three inputs (i.e. **linear_function_3D()**) associates a value y with each point in a 3D space ($x_1$,$x_2$,$x_3$). Is it possible to visualize this? What value is at position (0,0,0)?\n",
"\n",
"2. Write code to compute three linear equations with two inputs ($x_1$, $x_2$) using both the individual equations and the matrix form (you can make up any values for the inputs $\\beta_{i}$ and the slopes $\\omega_{ij}$."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7Y5zdKtKZAB2"
},
"source": [
"# Special functions\n",
"\n",
"Throughout the book, we'll be using some special functions (see Appendix B.1.3). The most important of these are the logarithm and exponential functions. Let's investigate their properties.\n",
"\n",
"We'll start with the exponential function $y=\\exp[x]=e^x$ which maps the real line $(-\\infty,+\\infty)$ to positive numbers $(0,+\\infty)$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c_GkjiY9IWCu"
},
"outputs": [],
"source": [
"# Draw the exponential function\n",
"\n",
"# Define an array of x values from -5 to 5 with increments of 0.01\n",
"x = np.arange(-5.0,5.0, 0.01)\n",
"y = np.exp(x) ;\n",
"\n",
"# Plot this function\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x,y,'r-')\n",
"ax.set_ylim([0,100]);ax.set_xlim([-5,5])\n",
"ax.set_xlabel('x'); ax.set_ylabel('exp[x]')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XyrT8257IWCu"
},
"source": [
"# Questions\n",
"\n",
"1. What is $\\exp[0]$? \n",
"2. What is $\\exp[1]$?\n",
"3. What is $\\exp[-\\infty]$?\n",
"4. What is $\\exp[+\\infty]$?\n",
"5. A function is convex if we can draw a straight line between any two points on the function, and the line lies above the function everywhere between these two points. Similarly, a function is concave if a straight line between any two points lies below the function everywhere between these two points. Is the exponential function convex or concave or neither?\n"
]
},
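{
"cell_type": "markdown",
"metadata": {},
"source": [
"The chord definition in the last question can be probed numerically. The sketch below is an illustration only: the helper `chord_minus_function` and the two test functions are hypothetical names introduced here, and the check uses a single pair of end points, so it can reveal a violation but cannot prove convexity or concavity. It is demonstrated on $x^2$ and $-x^2$ so that the question about the exponential function is left for you."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def chord_minus_function(f, a, b, n=101):\n",
"    # Sample n points between a and b (inclusive)\n",
"    t = np.linspace(0.0, 1.0, n)\n",
"    x = (1.0 - t) * a + t * b\n",
"    # Height of the straight line joining (a, f(a)) and (b, f(b)) at each x\n",
"    chord = (1.0 - t) * f(a) + t * f(b)\n",
"    return chord - f(x)\n",
"\n",
"def square(x):\n",
"    return x ** 2\n",
"\n",
"def negative_square(x):\n",
"    return -(x ** 2)\n",
"\n",
"gap_square = chord_minus_function(square, -1.0, 3.0)\n",
"gap_negative = chord_minus_function(negative_square, -1.0, 3.0)\n",
"\n",
"# Non-negative gap: the chord lies on or above the curve for this pair of points\n",
"print(np.all(gap_square >= -1e-12))\n",
"# Non-positive gap: the chord lies on or below the curve for this pair of points\n",
"print(np.all(gap_negative <= 1e-12))"
]
},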
{
"cell_type": "markdown",
"metadata": {
"id": "R6A4e5IxIWCu"
},
"source": [
"Now let's consider the logarithm function $y=\\log[x]$. Throughout the book we always use natural (base $e$) logarithms. The log function maps non-negative numbers $[0,\\infty]$ to real numbers $[-\\infty,\\infty]$. It is the inverse of the exponential function. So when we compute $\\log[x]$ we are really asking \"What is the number $y$ so that $e^y=x$?\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fOR7v2iXIWCu"
},
"outputs": [],
"source": [
"# Draw the logarithm function\n",
"\n",
"# Define an array of x values from -5 to 5 with increments of 0.01\n",
"x = np.arange(0.01,5.0, 0.01)\n",
"y = np.log(x) ;\n",
"\n",
"# Plot this function\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x,y,'r-')\n",
"ax.set_ylim([-5,5]);ax.set_xlim([0,5])\n",
"ax.set_xlabel('x'); ax.set_ylabel('$\\log[x]$')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yYWrL5AXIWCv"
},
"source": [
"# Questions\n",
"\n",
"1. What is $\\log[0]$? \n",
"2. What is $\\log[1]$?\n",
"3. What is $\\log[e]$?\n",
"4. What is $\\log[\\exp[3]]$?\n",
"5. What is $\\exp[\\log[4]]$?\n",
"6. What is $\\log[-1]$?\n",
"7. Is the logarithm function concave or convex?\n"
]
}
],
"metadata": {
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}