Update CM20315_Loss.ipynb

This commit is contained in:
Pietro Monticone
2023-11-30 16:42:27 +01:00
parent a5d98bb379
commit 6b76bbc7c3

View File

@@ -36,7 +36,7 @@
"\n", "\n",
"We'll compute loss functions for maximum likelihood, minimum negative log likelihood, and least squares and show that they all imply that we should use the same parameter values\n", "We'll compute loss functions for maximum likelihood, minimum negative log likelihood, and least squares and show that they all imply that we should use the same parameter values\n",
"\n", "\n",
"In part II, we'll investigate binary classification (where the output data is 0 or 1). This will be based on the Bernouilli distribution\n", "In part II, we'll investigate binary classification (where the output data is 0 or 1). This will be based on the Bernoulli distribution\n",
"\n", "\n",
"In part III we'll investigate multiclass classification (where the output data is 0,1, or, 2). This will be based on the categorical distribution." "In part III we'll investigate multiclass classification (where the output data is 0,1, or, 2). This will be based on the categorical distribution."
], ],
@@ -178,7 +178,7 @@
{ {
"cell_type": "markdown", "cell_type": "markdown",
"source": [ "source": [
"The blue line i sthe mean prediction of the model and the gray area represents plus/minus two standardard deviations. This model fits okay, but could be improved. Let's compute the loss. We'll compute the the least squares error, the likelihood, the negative log likelihood." "The blue line is the mean prediction of the model and the gray area represents plus/minus two standard deviations. This model fits okay, but could be improved. Let's compute the loss. We'll compute the the least squares error, the likelihood, the negative log likelihood."
], ],
"metadata": { "metadata": {
"id": "MvVX6tl9AEXF" "id": "MvVX6tl9AEXF"
@@ -276,7 +276,7 @@
"beta_0, omega_0, beta_1, omega_1 = get_parameters()\n", "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
"# Use our neural network to predict the mean of the Gaussian\n", "# Use our neural network to predict the mean of the Gaussian\n",
"mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n", "mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
"# Set the standard devation to something reasonable\n", "# Set the standard deviation to something reasonable\n",
"sigma = 0.2\n", "sigma = 0.2\n",
"# Compute the likelihood\n", "# Compute the likelihood\n",
"likelihood = compute_likelihood(y_train, mu_pred, sigma)\n", "likelihood = compute_likelihood(y_train, mu_pred, sigma)\n",
@@ -292,7 +292,7 @@
{ {
"cell_type": "markdown", "cell_type": "markdown",
"source": [ "source": [
"You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well. This is because it is the product of sveral probabilities, which are all quite small themselves.\n", "You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well. This is because it is the product of several probabilities, which are all quite small themselves.\n",
"This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math\n", "This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math\n",
"\n", "\n",
"This is why we use negative log likelihood" "This is why we use negative log likelihood"
@@ -326,7 +326,7 @@
"beta_0, omega_0, beta_1, omega_1 = get_parameters()\n", "beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
"# Use our neural network to predict the mean of the Gaussian\n", "# Use our neural network to predict the mean of the Gaussian\n",
"mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n", "mu_pred = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
"# Set the standard devation to something reasonable\n", "# Set the standard deviation to something reasonable\n",
"sigma = 0.2\n", "sigma = 0.2\n",
"# Compute the log likelihood\n", "# Compute the log likelihood\n",
"nll = compute_negative_log_likelihood(y_train, mu_pred, sigma)\n", "nll = compute_negative_log_likelihood(y_train, mu_pred, sigma)\n",
@@ -397,7 +397,7 @@
"source": [ "source": [
"# Define a range of values for the parameter\n", "# Define a range of values for the parameter\n",
"beta_1_vals = np.arange(0,1.0,0.01)\n", "beta_1_vals = np.arange(0,1.0,0.01)\n",
"# Create some arrays to store the likelihoods, negative log likehoos and sum of squares\n", "# Create some arrays to store the likelihoods, negative log likelihoods and sum of squares\n",
"likelihoods = np.zeros_like(beta_1_vals)\n", "likelihoods = np.zeros_like(beta_1_vals)\n",
"nlls = np.zeros_like(beta_1_vals)\n", "nlls = np.zeros_like(beta_1_vals)\n",
"sum_squares = np.zeros_like(beta_1_vals)\n", "sum_squares = np.zeros_like(beta_1_vals)\n",
@@ -482,7 +482,7 @@
"source": [ "source": [
"# Define a range of values for the parameter\n", "# Define a range of values for the parameter\n",
"sigma_vals = np.arange(0.1,0.5,0.005)\n", "sigma_vals = np.arange(0.1,0.5,0.005)\n",
"# Create some arrays to store the likelihoods, negative log likehoos and sum of squares\n", "# Create some arrays to store the likelihoods, negative log likelihoods and sum of squares\n",
"likelihoods = np.zeros_like(sigma_vals)\n", "likelihoods = np.zeros_like(sigma_vals)\n",
"nlls = np.zeros_like(sigma_vals)\n", "nlls = np.zeros_like(sigma_vals)\n",
"sum_squares = np.zeros_like(sigma_vals)\n", "sum_squares = np.zeros_like(sigma_vals)\n",