Update CM20315_Loss_III.ipynb
@@ -33,7 +33,7 @@
|
|||||||
"# Loss functions part III\n",
|
"# Loss functions part III\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This practical investigates loss functions. In part I we investigated univariate regression (where the output data $y$ is continuous). Our formulation was based on the normal/Gaussian distribution.\n",
|
"This practical investigates loss functions. In part I we investigated univariate regression (where the output data $y$ is continuous). Our formulation was based on the normal/Gaussian distribution.\n",
|
||||||
"In part II we investigated binary classification (where the output data is 0 or 1). This will be based on the Bernouilli distribution.<br><br>\n",
|
"In part II we investigated binary classification (where the output data is 0 or 1). This will be based on the Bernoulli distribution.<br><br>\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Now we'll investigate multiclass classification (where the output data can take multiple values 1,..., K), which is based on the categorical distribution.\n",
|
"Now we'll investigate multiclass classification (where the output data can take multiple values 1,..., K), which is based on the categorical distribution.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -218,7 +218,7 @@
|
|||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"source": [
|
"source": [
|
||||||
"The left is model output and the right is the model output after the softmax has been applied, so it now lies in the range [0,1] and represents the probabiilty, that y=0 (red), 1 (green) and 2 (blue) The dots at the bottom show the training data with the same color scheme. So we want the red curve to be high where there are red dots, the green curve to be high where there are green dotsmand the blue curve to be high where there are blue dots We'll compute the the likelihood and the negative log likelihood."
|
"The left is the model output and the right is the model output after the softmax has been applied, so it now lies in the range [0,1] and represents the probability that y=0 (red), 1 (green), and 2 (blue). The dots at the bottom show the training data with the same color scheme. So we want the red curve to be high where there are red dots, the green curve to be high where there are green dots, and the blue curve to be high where there are blue dots. We'll compute the likelihood and the negative log likelihood."
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"id": "MvVX6tl9AEXF"
|
"id": "MvVX6tl9AEXF"
|
||||||
@@ -228,7 +228,7 @@
|
|||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"source": [
|
"source": [
|
||||||
"# Return probability under categorical distribution for input y\n",
|
"# Return probability under categorical distribution for input y\n",
|
||||||
"# Complicated code to commpute it but just take value from row k of lambda param where y =k, \n",
|
"# Complicated code to compute it, but it just takes the value from row k of lambda_param where y = k\n",
|
||||||
"def categorical_distribution(y, lambda_param):\n",
|
"def categorical_distribution(y, lambda_param):\n",
|
||||||
" prob = np.zeros_like(y)\n",
|
" prob = np.zeros_like(y)\n",
|
||||||
" for row_index in range(lambda_param.shape[0]):\n",
|
" for row_index in range(lambda_param.shape[0]):\n",
|
||||||
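The hunk above shows the lookup loop inside `categorical_distribution`; a minimal runnable sketch of that idea, assuming `y` holds integer class labels and `lambda_param` is a K×N matrix whose columns are class probabilities (the example values below are made up):

```python
import numpy as np

def categorical_distribution(y, lambda_param):
    # For each data point n with label y[n] = k, the categorical likelihood
    # is simply entry (k, n) of lambda_param; pick it out with a boolean mask
    prob = np.zeros_like(y, dtype=float)
    for row_index in range(lambda_param.shape[0]):
        prob[y == row_index] = lambda_param[row_index, y == row_index]
    return prob

# Three classes (rows), four data points (columns); columns sum to 1
lambda_param = np.array([[0.7, 0.1, 0.2, 0.5],
                         [0.2, 0.8, 0.3, 0.3],
                         [0.1, 0.1, 0.5, 0.2]])
y = np.array([0, 1, 2, 0])
prob = categorical_distribution(y, lambda_param)
print(prob)  # [0.7 0.8 0.5 0.5]
```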
@@ -305,7 +305,7 @@
|
|||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"source": [
|
"source": [
|
||||||
"You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well. This is because it is the product of sveral probabilities, which are all quite small themselves.\n",
|
"You can see that this gives a very small answer, even for this small 1D dataset, and with the model fitting quite well. This is because it is the product of several probabilities, which are all quite small themselves.\n",
|
||||||
"This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math.\n",
|
"This will get out of hand pretty quickly with real datasets -- the likelihood will get so small that we can't represent it with normal finite-precision math.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This is why we use the negative log likelihood."
|
"This is why we use the negative log likelihood."
|
||||||
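The underflow argument above can be seen directly in a couple of lines; a quick illustration (the probability values are invented, not from the notebook's data):

```python
import numpy as np

# The product of many modest probabilities underflows float64 (min ~1e-308),
# but the equivalent sum of log probabilities is perfectly representable
probs = np.full(1000, 0.1)
product = np.prod(probs)         # 0.1**1000 underflows to exactly 0.0
log_sum = np.sum(np.log(probs))  # 1000 * log(0.1), approx -2302.6
print(product, log_sum)
```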
@@ -338,7 +338,7 @@
|
|||||||
"beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
|
"beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
|
||||||
"# Use our neural network to compute the model outputs (pre-softmax)\n",
|
"# Use our neural network to compute the model outputs (pre-softmax)\n",
|
||||||
"model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
|
"model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
|
||||||
"# Set the standard devation to something reasonable\n",
|
"# Pass the model outputs through the softmax to get class probabilities\n",
|
||||||
"lambda_train = softmax(model_out)\n",
|
"lambda_train = softmax(model_out)\n",
|
||||||
"# Compute the negative log likelihood\n",
|
"# Compute the negative log likelihood\n",
|
||||||
"nll = compute_negative_log_likelihood(y_train, lambda_train)\n",
|
"nll = compute_negative_log_likelihood(y_train, lambda_train)\n",
|
||||||
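The cell above relies on `softmax` and `compute_negative_log_likelihood`, which are defined elsewhere in the notebook; a self-contained sketch of plausible implementations, assuming the model output is a K×N array with one column per data point (the toy numbers are made up):

```python
import numpy as np

def softmax(model_out):
    # Subtract the per-column max for numerical stability, then normalize
    # over classes so each column sums to 1
    exp_out = np.exp(model_out - np.max(model_out, axis=0, keepdims=True))
    return exp_out / np.sum(exp_out, axis=0, keepdims=True)

def compute_negative_log_likelihood(y_train, lambda_train):
    # Sum of -log[probability of the true class] over the training points
    n = y_train.shape[0]
    return -np.sum(np.log(lambda_train[y_train, np.arange(n)]))

model_out = np.array([[2.0, -1.0],
                      [0.5,  3.0],
                      [-1.0, 0.0]])  # K=3 classes, N=2 points
y_train = np.array([0, 1])
lambda_train = softmax(model_out)
nll = compute_negative_log_likelihood(y_train, lambda_train)
print(nll)
```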
@@ -365,7 +365,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"# Define a range of values for the parameter\n",
|
"# Define a range of values for the parameter\n",
|
||||||
"beta_1_vals = np.arange(-2,6.0,0.1)\n",
|
"beta_1_vals = np.arange(-2,6.0,0.1)\n",
|
||||||
"# Create some arrays to store the likelihoods, negative log likehoods\n",
|
"# Create some arrays to store the likelihoods, negative log likelihoods\n",
|
||||||
"likelihoods = np.zeros_like(beta_1_vals)\n",
|
"likelihoods = np.zeros_like(beta_1_vals)\n",
|
||||||
"nlls = np.zeros_like(beta_1_vals)\n",
|
"nlls = np.zeros_like(beta_1_vals)\n",
|
||||||
"\n",
|
"\n",
|
||||||
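The sweep above fills `likelihoods` and `nlls` for each value in `beta_1_vals`; a minimal 1D stand-in for that loop, using a hypothetical Bernoulli model with lambda = sigmoid(beta_1) instead of the notebook's shallow network, just to show that the maximum-likelihood and minimum-NLL parameter values coincide:

```python
import numpy as np

# Toy binary labels (made up for illustration)
y = np.array([1, 1, 0, 1])

# Define a range of values for the parameter, as in the notebook
beta_1_vals = np.arange(-2, 6.0, 0.1)
likelihoods = np.zeros_like(beta_1_vals)
nlls = np.zeros_like(beta_1_vals)

for i, beta_1 in enumerate(beta_1_vals):
    lam = 1.0 / (1.0 + np.exp(-beta_1))      # Bernoulli parameter
    p = np.where(y == 1, lam, 1 - lam)       # per-point probabilities
    likelihoods[i] = np.prod(p)              # likelihood (product)
    nlls[i] = -np.sum(np.log(p))             # negative log likelihood (sum)

# Maximizing the likelihood and minimizing the NLL pick the same parameter
best_beta_1 = beta_1_vals[np.argmin(nlls)]
print(best_beta_1)
```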
|
|||||||