From 8411fdd1d23a0714407d3dfe180dc1a3eb2ce63f Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Fri, 24 Nov 2023 11:18:20 +0100
Subject: [PATCH 01/18] Update 21_1_Bias_Mitigation.ipynb

---
 Notebooks/Chap21/21_1_Bias_Mitigation.ipynb | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Notebooks/Chap21/21_1_Bias_Mitigation.ipynb b/Notebooks/Chap21/21_1_Bias_Mitigation.ipynb
index 9962b3d..ad197ef 100644
--- a/Notebooks/Chap21/21_1_Bias_Mitigation.ipynb
+++ b/Notebooks/Chap21/21_1_Bias_Mitigation.ipynb
@@ -31,7 +31,7 @@
       "source": [
         "# **Notebook 21.1: Bias mitigation**\n",
         "\n",
-        "This notebook investigates a post-processing method for bias mitigation (see figure 21.2 in the book). It based on this [blog](https://www.borealisai.com/research-blogs/tutorial1-bias-and-fairness-ai/) that I wrote for Borealis AI in 2019, which itself was derirved from [this blog](https://research.google.com/bigpicture/attacking-discrimination-in-ml/) by Wattenberg, Viégas, and Hardt.\n",
+        "This notebook investigates a post-processing method for bias mitigation (see figure 21.2 in the book). It is based on this [blog](https://www.borealisai.com/research-blogs/tutorial1-bias-and-fairness-ai/) that I wrote for Borealis AI in 2019, which itself was derived from [this blog](https://research.google.com/bigpicture/attacking-discrimination-in-ml/) by Wattenberg, Viégas, and Hardt.\n",
         "\n",
         "Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and make predictions about what is going to happen or write code to complete the functions.\n",
         "\n",
@@ -172,7 +172,7 @@
       "source": [
         "# Blindness to protected attribute\n",
         "\n",
-        "We'll first do the simplest possible thing.  We'll choose the same threshold for both blue and yellow populations so that $\\tau_0$ = $\\tau_1$.  Basically, we'll ingore what we know about the group membership.  Let's see what the ramifications of that."
+        "We'll first do the simplest possible thing.  We'll choose the same threshold for both blue and yellow populations so that $\\tau_0$ = $\\tau_1$.  Basically, we'll ignore what we know about the group membership.  Let's see what the ramifications of that are."
       ],
       "metadata": {
         "id": "bE7yPyuWoSUy"
@@ -195,7 +195,7 @@
       "source": [
         "def compute_probability_get_loan(credit_scores, frequencies, threshold):\n",
         "  # TODO - Write this function\n",
-        "  # Return the probability that somemone from this group loan based on the frequencies of each\n",
+        "  # Return the probability that someone from this group gets a loan based on the frequencies of each\n",
         "  # credit score for this group\n",
         "  # Replace this line:\n",
         "  prob = 0.5\n",
@@ -297,7 +297,7 @@
         "\n",
         "This criterion is clearly not great.  The blue and yellow groups get given loans at different rates overall, and (for this threshold), the false alarms and true positives are also different, so it's not even fair when we consider whether the loans really were paid back.  \n",
         "\n",
-        "TODO -- investigate setting a different threshols $\\tau_{0}=\\tau_{1}$.  Is it possible to make the overall rates that loans are given the same?  Is it possible to make the false alarm rates the same?  Is it possible to make the true positive rates the same?"
+        "TODO -- investigate setting a different threshold $\\tau_{0}=\\tau_{1}$.  Is it possible to make the overall rates that loans are given the same?  Is it possible to make the false alarm rates the same?  Is it possible to make the true positive rates the same?"
       ],
       "metadata": {
         "id": "UCObTsa57uuC"

From da3a5ad2e9dcf8ec5461e7e61f538bc1c4dbb6ab Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Fri, 24 Nov 2023 11:18:22 +0100
Subject: [PATCH 02/18] Update 21_2_Explainability.ipynb

---
 Notebooks/Chap21/21_2_Explainability.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Notebooks/Chap21/21_2_Explainability.ipynb b/Notebooks/Chap21/21_2_Explainability.ipynb
index d0b28eb..e64cc41 100644
--- a/Notebooks/Chap21/21_2_Explainability.ipynb
+++ b/Notebooks/Chap21/21_2_Explainability.ipynb
@@ -400,7 +400,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "This model is easilly intepretable.  The k'th coeffeicient tells us the how much (and in which direction) changing the value of the k'th input will change the output.  This is only valid in the vicinity of the input $x$.\n",
+        "This model is easily interpretable.  The k'th coefficient tells us how much (and in which direction) changing the value of the k'th input will change the output.  This is only valid in the vicinity of the input $x$.\n",
         "\n",
         "Note that a more sophisticated version of LIME would weight the training points according to how close they are to the original data point of interest."
       ],

From ffe7ffc823bc1e53468c5815282fc787e5efad9a Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Fri, 24 Nov 2023 11:21:47 +0100
Subject: [PATCH 03/18] Update 12_1_Self_Attention.ipynb

---
 Notebooks/Chap12/12_1_Self_Attention.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Notebooks/Chap12/12_1_Self_Attention.ipynb b/Notebooks/Chap12/12_1_Self_Attention.ipynb
index 3a26f94..eb584b8 100644
--- a/Notebooks/Chap12/12_1_Self_Attention.ipynb
+++ b/Notebooks/Chap12/12_1_Self_Attention.ipynb
@@ -31,7 +31,7 @@
       "source": [
         "# **Notebook 12.1: Self Attention**\n",
         "\n",
-        "This notebook builds a self-attnetion mechanism from scratch, as discussed in section 12.2 of the book.\n",
+        "This notebook builds a self-attention mechanism from scratch, as discussed in section 12.2 of the book.\n",
         "\n",
         "Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and make predictions about what is going to happen or write code to complete the functions.\n",
         "\n",

From a7af9f559ef0bcffddcc3bb1a6707de17dafde03 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Sun, 26 Nov 2023 11:18:22 +0100
Subject: [PATCH 04/18] Update 1_1_BackgroundMathematics.ipynb

---
 Notebooks/Chap01/1_1_BackgroundMathematics.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Notebooks/Chap01/1_1_BackgroundMathematics.ipynb b/Notebooks/Chap01/1_1_BackgroundMathematics.ipynb
index c13cf11..433732d 100644
--- a/Notebooks/Chap01/1_1_BackgroundMathematics.ipynb
+++ b/Notebooks/Chap01/1_1_BackgroundMathematics.ipynb
@@ -171,7 +171,7 @@
         "# Color represents y value (brighter = higher value)\n",
         "# Black = -10 or less, White = +10 or more\n",
         "# 0 = mid orange\n",
-        "# Lines are conoturs where value is equal\n",
+        "# Lines are contours where value is equal\n",
         "draw_2D_function(x1,x2,y)\n",
         "\n",
         "# TODO\n",

From e03fad482b3379186a29e1f8a4e6ffbddd36e20d Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:33:04 +0100
Subject: [PATCH 05/18] Update CM20315_Convolution_II.ipynb

---
 CM20315/CM20315_Convolution_II.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CM20315/CM20315_Convolution_II.ipynb b/CM20315/CM20315_Convolution_II.ipynb
index 6ddf658..4d87123 100644
--- a/CM20315/CM20315_Convolution_II.ipynb
+++ b/CM20315/CM20315_Convolution_II.ipynb
@@ -105,7 +105,7 @@
       "cell_type": "code",
       "source": [
         "\n",
-        "# TODO Create a model with the folowing layers\n",
+        "# TODO Create a model with the following layers\n",
         "# 1. Convolutional layer, (input=length 40 and 1 channel, kernel size 3x3, stride 2, padding=\"valid\", 15 output channels ) \n",
         "# 2. ReLU\n",
         "# 3. Convolutional layer, (input=length 19 and 15 channels, kernel size 3x3, stride 2, padding=\"valid\", 15 output channels )\n",
@@ -120,7 +120,7 @@
         "# https://pytorch.org/docs/1.13/generated/torch.nn.Linear.html?highlight=linear#torch.nn.Linear\n",
         "\n",
         "# Replace the following function which just runs a standard fully connected network\n",
-        "# The flatten at the beginning is becuase we are passing in the data in a slightly different format.\n",
+        "# The flatten at the beginning is because we are passing in the data in a slightly different format.\n",
         "model = nn.Sequential(\n",
         "nn.Flatten(),\n",
         "nn.Linear(40, 100),\n",

From ef28d848dfaef37681d3ddc720b42c22d7759ace Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:33:59 +0100
Subject: [PATCH 06/18] Update CM20315_Convolution_III.ipynb

---
 CM20315/CM20315_Convolution_III.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CM20315/CM20315_Convolution_III.ipynb b/CM20315/CM20315_Convolution_III.ipynb
index 44748d6..ce0039f 100644
--- a/CM20315/CM20315_Convolution_III.ipynb
+++ b/CM20315/CM20315_Convolution_III.ipynb
@@ -148,7 +148,7 @@
         "# 8. A flattening operation\n",
         "# 9. A fully connected layer mapping from (whatever dimensions we are at-- find out using .shape) to 50 \n",
         "# 10. A ReLU\n",
-        "# 11. A fully connected layer mappiing from 50 to 10 dimensions\n",
+        "# 11. A fully connected layer mapping from 50 to 10 dimensions\n",
         "# 12. A softmax function.\n",
         "\n",
         "# Replace this class which implements a minimal network (which still does okay)\n",

From 6b2f25101e19aae53d49b1d05cf0b061156820a5 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:36:35 +0100
Subject: [PATCH 07/18] Update CM20315_Gradients_II.ipynb

---
 CM20315/CM20315_Gradients_II.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/CM20315/CM20315_Gradients_II.ipynb b/CM20315/CM20315_Gradients_II.ipynb
index ebefa76..d98b315 100644
--- a/CM20315/CM20315_Gradients_II.ipynb
+++ b/CM20315/CM20315_Gradients_II.ipynb
@@ -32,7 +32,7 @@
       "source": [
         "# Gradients II: Backpropagation algorithm\n",
         "\n",
-        "In this practical, we'll investigate the backpropagation algoritithm.  This computes the gradients of the loss with respect to all of the parameters (weights and biases) in the network.  We'll use these gradients when we run stochastic gradient descent."
+        "In this practical, we'll investigate the backpropagation algorithm.  This computes the gradients of the loss with respect to all of the parameters (weights and biases) in the network.  We'll use these gradients when we run stochastic gradient descent."
       ],
       "metadata": {
         "id": "L6chybAVFJW2"
@@ -53,7 +53,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "First let's define a neural network.  We'll just choose the weights and biaes randomly for now"
+        "First let's define a neural network.  We'll just choose the weights and biases randomly for now"
       ],
       "metadata": {
         "id": "nnUoI0m6GyjC"
@@ -178,7 +178,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Now let's define a loss function.  We'll just use the least squaures loss function. We'll also write a function to compute dloss_doutpu"
+        "Now let's define a loss function.  We'll just use the least squares loss function. We'll also write a function to compute dloss_doutput"
       ],
       "metadata": {
         "id": "SxVTKp3IcoBF"

From 79578aa4a109bf3af013e20d96d4e33a8731db37 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:37:12 +0100
Subject: [PATCH 08/18] Update CM20315_Gradients_III.ipynb

---
 CM20315/CM20315_Gradients_III.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CM20315/CM20315_Gradients_III.ipynb b/CM20315/CM20315_Gradients_III.ipynb
index 052ba8e..e691d04 100644
--- a/CM20315/CM20315_Gradients_III.ipynb
+++ b/CM20315/CM20315_Gradients_III.ipynb
@@ -53,7 +53,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "First let's define a neural network.  We'll just choose the weights and biaes randomly for now"
+        "First let's define a neural network.  We'll just choose the weights and biases randomly for now"
       ],
       "metadata": {
         "id": "nnUoI0m6GyjC"
@@ -204,7 +204,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Now let's define a loss function.  We'll just use the least squaures loss function. We'll also write a function to compute dloss_doutput\n"
+        "Now let's define a loss function.  We'll just use the least squares loss function. We'll also write a function to compute dloss_doutput\n"
       ],
       "metadata": {
         "id": "SxVTKp3IcoBF"

From c951720282314864afd8ea25b54601f22bb8d15f Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:37:46 +0100
Subject: [PATCH 09/18] Update CM20315_Intro_Answers.ipynb

---
 CM20315/CM20315_Intro_Answers.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CM20315/CM20315_Intro_Answers.ipynb b/CM20315/CM20315_Intro_Answers.ipynb
index 4efdb8d..1eff4fe 100644
--- a/CM20315/CM20315_Intro_Answers.ipynb
+++ b/CM20315/CM20315_Intro_Answers.ipynb
@@ -215,7 +215,7 @@
         "# Color represents y value (brighter = higher value)\n",
         "# Black = -10 or less, White = +10 or more\n",
         "# 0 = mid orange\n",
-        "# Lines are conoturs where value is equal\n",
+        "# Lines are contours where value is equal\n",
         "draw_2D_function(x1,x2,y)\n",
         "\n",
         "# TODO\n",

From 6c8411ae1c392bd8dab1c99f3a3f5e71d0264297 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:38:17 +0100
Subject: [PATCH 10/18] Update CM20315_Intro.ipynb

---
 CM20315/CM20315_Intro.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CM20315/CM20315_Intro.ipynb b/CM20315/CM20315_Intro.ipynb
index cc4f90a..ba811db 100644
--- a/CM20315/CM20315_Intro.ipynb
+++ b/CM20315/CM20315_Intro.ipynb
@@ -176,7 +176,7 @@
         "# Color represents y value (brighter = higher value)\n",
         "# Black = -10 or less, White = +10 or more\n",
         "# 0 = mid orange\n",
-        "# Lines are conoturs where value is equal\n",
+        "# Lines are contours where value is equal\n",
         "draw_2D_function(x1,x2,y)\n",
         "\n",
         "# TODO\n",

From 428ca727dbc03e8b613008bbc59bbae9e90a8a5e Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:39:42 +0100
Subject: [PATCH 11/18] Update CM20315_Loss_II.ipynb

---
 CM20315/CM20315_Loss_II.ipynb | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/CM20315/CM20315_Loss_II.ipynb b/CM20315/CM20315_Loss_II.ipynb
index 69c1074..9846d61 100644
--- a/CM20315/CM20315_Loss_II.ipynb
+++ b/CM20315/CM20315_Loss_II.ipynb
@@ -34,7 +34,7 @@
         "\n",
         "This practical investigates loss functions.  In part I we investigated univariate regression (where the output data $y$ is continuous.  Our formulation was based on the normal/Gaussian distribution.\n",
         "\n",
-        "In this notebook, we investigate binary classification (where the output data is 0 or 1).  This will be based on the Bernouilli distribution\n",
+        "In this notebook, we investigate binary classification (where the output data is 0 or 1).  This will be based on the Bernoulli distribution\n",
         "\n",
         "In part III we'll investigate multiclass classification (where the outputs data can take multiple values 1,... K.\n",
         "\n",
@@ -199,7 +199,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The left is model output and the right is the model output after the sigmoid has been applied, so it now lies in the range [0,1] and represents the probabiilty, that y=1.  The black dots show the training data.  We'll compute the the likelihood and the negative log likelihood."
+        "The left is model output and the right is the model output after the sigmoid has been applied, so it now lies in the range [0,1] and represents the probability that y=1.  The black dots show the training data.  We'll compute the likelihood and the negative log likelihood."
       ],
       "metadata": {
         "id": "MvVX6tl9AEXF"
@@ -210,7 +210,7 @@
       "source": [
         "# Return probability under Bernoulli distribution for input x\n",
         "def bernoulli_distribution(y, lambda_param):\n",
-        "    # TODO-- write in the equation for the Bernoullid distribution \n",
+        "    # TODO-- write in the equation for the Bernoulli distribution \n",
         "    # Equation 5.17 from the notes (you will need np.power)\n",
         "    # Replace the line below\n",
         "    prob = np.zeros_like(y)\n",
@@ -249,7 +249,7 @@
       "source": [
         "# Return the likelihood of all of the data under the model\n",
         "def compute_likelihood(y_train, lambda_param):\n",
-        "  # TODO -- compute the likelihood of the data -- the product of the Bernoullis probabilities for each data point\n",
+        "  # TODO -- compute the likelihood of the data -- the product of the Bernoulli probabilities for each data point\n",
         "  # Top line of equation 5.3 in the notes\n",
         "  # You will need np.prod() and the bernoulli_distribution function you used above\n",
         "  # Replace the line below\n",
@@ -284,7 +284,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of sveral probabilities, which are all quite small themselves.\n",
+        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of several probabilities, which are all quite small themselves.\n",
         "This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math\n",
         "\n",
         "This is why we use negative log likelihood"
@@ -317,7 +317,7 @@
         "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
         "# Use our neural network to predict the mean of the Gaussian\n",
         "model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
-        "# Set the standard devation to something reasonable\n",
+        "# Set the standard deviation to something reasonable\n",
         "lambda_train = sigmoid(model_out)\n",
         "# Compute the log likelihood\n",
         "nll = compute_negative_log_likelihood(y_train, lambda_train)\n",
@@ -362,7 +362,7 @@
       "source": [
         "# Define a range of values for the parameter\n",
         "beta_1_vals = np.arange(-2,6.0,0.1)\n",
-        "# Create some arrays to store the likelihoods, negative log likehoods\n",
+        "# Create some arrays to store the likelihoods, negative log likelihoods\n",
         "likelihoods = np.zeros_like(beta_1_vals)\n",
         "nlls = np.zeros_like(beta_1_vals)\n",
         "\n",

From a5d98bb379e9386ab0cfb5ef275f79a392b69485 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:40:52 +0100
Subject: [PATCH 12/18] Update CM20315_Loss_III.ipynb

---
 CM20315/CM20315_Loss_III.ipynb | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/CM20315/CM20315_Loss_III.ipynb b/CM20315/CM20315_Loss_III.ipynb
index 7b1c005..ef914c9 100644
--- a/CM20315/CM20315_Loss_III.ipynb
+++ b/CM20315/CM20315_Loss_III.ipynb
@@ -33,7 +33,7 @@
         "# Loss functions part III\n",
         "\n",
         "This practical investigates loss functions.  In part I we investigated univariate regression (where the output data $y$ is continuous.  Our formulation was based on the normal/Gaussian distribution.\n",
-        "In part II we investigated binary classification (where the output data is 0 or 1).  This will be based on the Bernouilli distribution.<br><br>\n",
+        "In part II we investigated binary classification (where the output data is 0 or 1).  This will be based on the Bernoulli distribution.<br><br>\n",
         "\n",
         "Now we'll investigate multiclass classification (where the outputs data can take multiple values 1,... K, which is based on the categorical distribution\n",
         "\n",
@@ -218,7 +218,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The left is model output and the right is the model output after the softmax has been applied, so it now lies in the range [0,1] and represents the probabiilty, that y=0 (red), 1 (green) and 2 (blue)   The dots at the bottom show the training data with the same color scheme.  So we want the red curve to be high where there are red dots, the green curve to be high where there are green dotsmand the blue curve to be high where there are blue dots  We'll compute the the likelihood and the negative log likelihood."
+        "The left is model output and the right is the model output after the softmax has been applied, so it now lies in the range [0,1] and represents the probability that y=0 (red), 1 (green) and 2 (blue).  The dots at the bottom show the training data with the same color scheme.  So we want the red curve to be high where there are red dots, the green curve to be high where there are green dots, and the blue curve to be high where there are blue dots.  We'll compute the likelihood and the negative log likelihood."
       ],
       "metadata": {
         "id": "MvVX6tl9AEXF"
@@ -228,7 +228,7 @@
       "cell_type": "code",
       "source": [
         "# Return probability under Bernoulli distribution for input x\n",
-        "# Complicated code to commpute it but just take value from row k of lambda param where y =k, \n",
+        "# The code looks complicated, but it just takes the value from row k of lambda_param where y = k\n",
         "def categorical_distribution(y, lambda_param):\n",
         "    prob = np.zeros_like(y)\n",
         "    for row_index in range(lambda_param.shape[0]):\n",
@@ -305,7 +305,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of sveral probabilities, which are all quite small themselves.\n",
+        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of several probabilities, which are all quite small themselves.\n",
         "This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math\n",
         "\n",
         "This is why we use negative log likelihood"
@@ -338,7 +338,7 @@
         "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
         "# Use our neural network to predict the mean of the Gaussian\n",
         "model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
-        "# Set the standard devation to something reasonable\n",
+        "# Set the standard deviation to something reasonable\n",
         "lambda_train = softmax(model_out)\n",
         "# Compute the log likelihood\n",
         "nll = compute_negative_log_likelihood(y_train, lambda_train)\n",
@@ -365,7 +365,7 @@
       "source": [
         "# Define a range of values for the parameter\n",
         "beta_1_vals = np.arange(-2,6.0,0.1)\n",
-        "# Create some arrays to store the likelihoods, negative log likehoods\n",
+        "# Create some arrays to store the likelihoods, negative log likelihoods\n",
         "likelihoods = np.zeros_like(beta_1_vals)\n",
         "nlls = np.zeros_like(beta_1_vals)\n",
         "\n",

From 6b76bbc7c32801e03a470358c893b8c3aaf06222 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:42:27 +0100
Subject: [PATCH 13/18] Update CM20315_Loss.ipynb

---
 CM20315/CM20315_Loss.ipynb | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/CM20315/CM20315_Loss.ipynb b/CM20315/CM20315_Loss.ipynb
index 1edf36f..8761997 100644
--- a/CM20315/CM20315_Loss.ipynb
+++ b/CM20315/CM20315_Loss.ipynb
@@ -36,7 +36,7 @@
         "\n",
         "We'll compute loss functions for maximum likelihood, minimum negative log likelihood, and least squares and show that they all imply that we should use the same parameter values\n",
         "\n",
-        "In part II, we'll investigate binary classification (where the output data is 0 or 1).  This will be based on the Bernouilli distribution\n",
+        "In part II, we'll investigate binary classification (where the output data is 0 or 1).  This will be based on the Bernoulli distribution\n",
         "\n",
         "In part III we'll investigate multiclass classification (where the output data is 0,1, or, 2).  This will be based on the categorical distribution."
       ],
@@ -178,7 +178,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The blue line i sthe mean prediction of the model and the gray area represents plus/minus two standardard deviations.  This model fits okay, but could be improved. Let's compute the loss.  We'll compute the  the least squares error, the likelihood, the negative log likelihood."
+        "The blue line is the mean prediction of the model and the gray area represents plus/minus two standard deviations.  This model fits okay, but could be improved. Let's compute the loss.  We'll compute the least squares error, the likelihood, and the negative log likelihood."
       ],
       "metadata": {
         "id": "MvVX6tl9AEXF"
@@ -276,7 +276,7 @@
         "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
         "# Use our neural network to predict the mean of the Gaussian\n",
         "mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
-        "# Set the standard devation to something reasonable\n",
+        "# Set the standard deviation to something reasonable\n",
         "sigma = 0.2\n",
         "# Compute the likelihood\n",
         "likelihood = compute_likelihood(y_train, mu_pred, sigma)\n",
@@ -292,7 +292,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of sveral probabilities, which are all quite small themselves.\n",
+        "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well.  This is because it is the product of several probabilities, which are all quite small themselves.\n",
         "This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math\n",
         "\n",
         "This is why we use negative log likelihood"
@@ -326,7 +326,7 @@
         "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
         "# Use our neural network to predict the mean of the Gaussian\n",
         "mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
-        "# Set the standard devation to something reasonable\n",
+        "# Set the standard deviation to something reasonable\n",
         "sigma = 0.2\n",
         "# Compute the log likelihood\n",
         "nll = compute_negative_log_likelihood(y_train, mu_pred, sigma)\n",
@@ -397,7 +397,7 @@
       "source": [
         "# Define a range of values for the parameter\n",
         "beta_1_vals = np.arange(0,1.0,0.01)\n",
-        "# Create some arrays to store the likelihoods, negative log likehoos and sum of squares\n",
+        "# Create some arrays to store the likelihoods, negative log likelihoods and sum of squares\n",
         "likelihoods = np.zeros_like(beta_1_vals)\n",
         "nlls = np.zeros_like(beta_1_vals)\n",
         "sum_squares = np.zeros_like(beta_1_vals)\n",
@@ -482,7 +482,7 @@
       "source": [
         "# Define a range of values for the parameter\n",
         "sigma_vals = np.arange(0.1,0.5,0.005)\n",
-        "# Create some arrays to store the likelihoods, negative log likehoos and sum of squares\n",
+        "# Create some arrays to store the likelihoods, negative log likelihoods and sum of squares\n",
         "likelihoods = np.zeros_like(sigma_vals)\n",
         "nlls = np.zeros_like(sigma_vals)\n",
         "sum_squares = np.zeros_like(sigma_vals)\n",

From 4429600bcc2a6899283d19ae20d3c01be69ea578 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:43:27 +0100
Subject: [PATCH 14/18] Update CM20315_Shallow.ipynb

---
 CM20315/CM20315_Shallow.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CM20315/CM20315_Shallow.ipynb b/CM20315/CM20315_Shallow.ipynb
index 68dbc8d..f19d29d 100644
--- a/CM20315/CM20315_Shallow.ipynb
+++ b/CM20315/CM20315_Shallow.ipynb
@@ -233,7 +233,7 @@
         "# TODO\n",
         "# 1. Predict what effect changing phi_0 will have on the network.  \n",
         "#   Answer:\n",
-        "# 2. Predict what effect multplying phi_1, phi_2, phi_3 by 0.5 would have.  Check if you are correct\n",
+        "# 2. Predict what effect multiplying phi_1, phi_2, phi_3 by 0.5 would have.  Check if you are correct\n",
         "#   Answer:\n",
         "# 3. Predict what effect multiplying phi_1 by -1 will have.  Check if you are correct.\n",
         "#   Answer:\n",
@@ -500,7 +500,7 @@
         "print(\"Loss = %3.3f\"%(loss))\n",
         "\n",
         "# TODO.  Manipulate the parameters (by hand!) to make the function \n",
-        "# fit the data better and try to reduct the loss to as small a number \n",
+        "# fit the data better and try to reduce the loss to as small a number \n",
         "# as possible.  The best that I could do was 0.181\n",
         "# Tip... start by manipulating phi_0.\n",
         "# It's not that easy, so don't spend too much time on this!"

From 685d910bbc6edb4b36b2210e897bdd9d71a49662 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:44:01 +0100
Subject: [PATCH 15/18] Update CM20315_Training_I.ipynb

---
 CM20315/CM20315_Training_I.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/CM20315/CM20315_Training_I.ipynb b/CM20315/CM20315_Training_I.ipynb
index 4913afe..f1438e8 100644
--- a/CM20315/CM20315_Training_I.ipynb
+++ b/CM20315/CM20315_Training_I.ipynb
@@ -108,7 +108,7 @@
       "source": [
         "def line_search(loss_function, thresh=.0001, max_iter = 10, draw_flag = False):\n",
         "\n",
-        "    # Initialize four points along the rnage we are going to search\n",
+        "    # Initialize four points along the range we are going to search\n",
         "    a = 0\n",
         "    b = 0.33\n",
         "    c = 0.66\n",
@@ -139,7 +139,7 @@
         "        # Rule #2 If point b is less than point c then\n",
         "        #                     then point d becomes point c, and\n",
         "        #                     point b becomes 1/3 between a and new d\n",
-        "        #                     point c beocome 2/3 between a and new d \n",
+        "        #                     point c becomes 2/3 between a and new d \n",
         "        # TODO REPLACE THE BLOCK OF CODE BELOW WITH THIS RULE\n",
         "        if (0):\n",
         "          continue;\n",
@@ -147,7 +147,7 @@
         "        # Rule #3 If point c is less than point b then\n",
         "        #                     then point a becomes point b, and\n",
         "        #                     point b becomes 1/3 between new a and d\n",
-        "        #                     point c beocome 2/3 between new a and d \n",
+        "        #                     point c becomes 2/3 between new a and d \n",
         "        # TODO REPLACE THE BLOCK OF CODE BELOW WITH THIS RULE\n",
         "        if(0):\n",
         "          continue\n",

From 9b13823ca8e3d0818b160408332f58373c8746e8 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:44:42 +0100
Subject: [PATCH 16/18] Update CM20315_Training_II.ipynb

---
 CM20315/CM20315_Training_II.ipynb | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/CM20315/CM20315_Training_II.ipynb b/CM20315/CM20315_Training_II.ipynb
index ede8f18..ac09b14 100644
--- a/CM20315/CM20315_Training_II.ipynb
+++ b/CM20315/CM20315_Training_II.ipynb
@@ -114,7 +114,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# Initialize the parmaeters and draw the model\n",
+        "# Initialize the parameters and draw the model\n",
         "phi = np.zeros((2,1))\n",
         "phi[0] = 0.6      # Intercept\n",
         "phi[1] = -0.2      # Slope\n",
@@ -314,7 +314,7 @@
         "  return compute_loss(data[0,:], data[1,:], model, phi_start+ gradient * dist_prop)\n",
         "\n",
         "def line_search(data, model, phi, gradient, thresh=.00001, max_dist = 0.1, max_iter = 15, verbose=False):\n",
-        "    # Initialize four points along the rnage we are going to search\n",
+        "    # Initialize four points along the range we are going to search\n",
         "    a = 0\n",
         "    b = 0.33 * max_dist\n",
         "    c = 0.66 * max_dist\n",
@@ -345,7 +345,7 @@
         "        # Rule #2 If point b is less than point c then\n",
         "        #                     then point d becomes point c, and\n",
         "        #                     point b becomes 1/3 between a and new d\n",
-        "        #                     point c beocome 2/3 between a and new d \n",
+        "        #                     point c becomes 2/3 between a and new d \n",
         "        if lossb < lossc:\n",
         "          d = c\n",
         "          b = a+ (d-a)/3\n",
@@ -355,7 +355,7 @@
         "        # Rule #2 If point c is less than point b then\n",
         "        #                     then point a becomes point b, and\n",
         "        #                     point b becomes 1/3 between new a and d\n",
-        "        #                     point c beocome 2/3 between new a and d \n",
+        "        #                     point c becomes 2/3 between new a and d \n",
         "        a = b\n",
         "        b = a+ (d-a)/3\n",
         "        c = a+ 2*(d-a)/3\n",

From 193e2329f2b8428bd623e9a344d56b7ca2850a8d Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:45:28 +0100
Subject: [PATCH 17/18] Update CM20315_Training_III.ipynb

---
 CM20315/CM20315_Training_III.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/CM20315/CM20315_Training_III.ipynb b/CM20315/CM20315_Training_III.ipynb
index 4e38276..f91c021 100644
--- a/CM20315/CM20315_Training_III.ipynb
+++ b/CM20315/CM20315_Training_III.ipynb
@@ -340,7 +340,7 @@
         "  return compute_loss(data[0,:], data[1,:], model, phi_start+ gradient * dist_prop)\n",
         "\n",
         "def line_search(data, model, phi, gradient, thresh=.00001, max_dist = 0.1, max_iter = 15, verbose=False):\n",
-        "    # Initialize four points along the rnage we are going to search\n",
+        "    # Initialize four points along the range we are going to search\n",
         "    a = 0\n",
         "    b = 0.33 * max_dist\n",
         "    c = 0.66 * max_dist\n",
@@ -371,7 +371,7 @@
         "        # Rule #2 If point b is less than point c then\n",
         "        #                     then point d becomes point c, and\n",
         "        #                     point b becomes 1/3 between a and new d\n",
-        "        #                     point c beocome 2/3 between a and new d \n",
+        "        #                     point c becomes 2/3 between a and new d \n",
         "        if lossb < lossc:\n",
         "          d = c\n",
         "          b = a+ (d-a)/3\n",
@@ -381,7 +381,7 @@
         "        # Rule #2 If point c is less than point b then\n",
         "        #                     then point a becomes point b, and\n",
         "        #                     point b becomes 1/3 between new a and d\n",
-        "        #                     point c beocome 2/3 between new a and d \n",
+        "        #                     point c becomes 2/3 between new a and d \n",
         "        a = b\n",
         "        b = a+ (d-a)/3\n",
         "        c = a+ 2*(d-a)/3\n",

From fefef63df44af4b021bfbf6a336c2134a6a93029 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Thu, 30 Nov 2023 16:46:38 +0100
Subject: [PATCH 18/18] Update CM20315_Transformers.ipynb

---
 CM20315/CM20315_Transformers.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
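
Note for reviewers: the last hunk corrects the comment on get_kth_most_likely_token, where k = 0 selects the most likely token, k = 1 the next most likely, and so on. A minimal sketch of that ranking convention follows. It works on a plain list of probabilities, which is a hypothetical stand-in for the model output, and it does not call the model or tokenizer; the name kth_most_likely_index is also hypothetical.

```python
def kth_most_likely_index(probabilities, k):
    # Rank the indices from most to least probable
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i],
                    reverse=True)
    # k = 0 gives the most likely index, k = 1 the runner-up, and so on
    return ranked[k]


# Usage with a made-up distribution over three tokens
next_token_probs = [0.1, 0.6, 0.3]
for k in range(3):
    print(k, kth_most_likely_index(next_token_probs, k))
```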

diff --git a/CM20315/CM20315_Transformers.ipynb b/CM20315/CM20315_Transformers.ipynb
index 9679cf7..7f1ecc2 100644
--- a/CM20315/CM20315_Transformers.ipynb
+++ b/CM20315/CM20315_Transformers.ipynb
@@ -175,7 +175,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# TODO Modify the code below by changeing the number of tokens generated and the initial sentence\n",
+        "# TODO Modify the code below by changing the number of tokens generated and the initial sentence\n",
         "# to get a feel for how well this works.  Since I didn't reset the seed, it will give a different\n",
         "# answer every time that you run it.\n",
         "\n",
@@ -253,7 +253,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# TODO Modify the code below by changeing the number of tokens generated and the initial sentence\n",
+        "# TODO Modify the code below by changing the number of tokens generated and the initial sentence\n",
         "# to get a feel for how well this works.  \n",
         "\n",
         "# TODO Experiment with changing this line:\n",
@@ -471,7 +471,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# This routine reutnrs the k'th most likely next token.\n",
+        "# This routine returns the k'th most likely next token.\n",
         "# If k =0 then it returns the most likely token, if k=1 it returns the next most likely and so on\n",
         "# We will need this for beam search\n",
         "def get_kth_most_likely_token(input_tokens, model, tokenizer, k):\n",