udlbook/Notebooks/Chap17/17_3_Importance_Sampling.ipynb

{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "view-in-github"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/udlbook/udlbook/blob/main/Notebooks/Chap17/17_3_Importance_Sampling.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "t9vk9Elugvmi"
      },
      "source": [
        "# **Notebook 17.3: Importance sampling**\n",
        "\n",
        "This notebook investigates importance sampling as described in section 17.8.1 of the book.\n",
        "\n",
        "Work through the cells below, running each cell in turn. In various places you will see the words \"TODO\". Follow the instructions at these places and make predictions about what is going to happen or write code to complete the functions.\n",
        "\n",
        "Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "OLComQyvCIJ7"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import matplotlib.pyplot as plt"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "f7a6xqKjkmvT"
      },
      "source": [
        "Let's approximate the expectation\n",
        "\n",
        "\\begin{equation}\n",
        "\\mathbb{E}_{y}\\Bigl[\\exp\\bigl[- (y-1)^4\\bigr]\\Bigr] = \\int \\exp\\bigl[- (y-1)^4\\bigr] Pr(y) dy,\n",
        "\\end{equation}\n",
        "\n",
        "where\n",
        "\n",
        "\\begin{equation}\n",
        "Pr(y)=\\text{Norm}_y[0,1]\n",
        "\\end{equation}\n",
        "\n",
        "by drawing $I$ samples $y_i$ and using the formula:\n",
        "\n",
        "\\begin{equation}\n",
        "\\mathbb{E}_{y}\\Bigl[\\exp\\bigl[- (y-1)^4\\bigr]\\Bigr] \\approx \\frac{1}{I} \\sum_{i=1}^I \\exp\\bigl[-(y_i-1)^4 \\bigr]\n",
        "\\end{equation}"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "VjkzRr8o2ksg"
      },
      "outputs": [],
      "source": [
        "def f(y):\n",
        "  return np.exp(-(y-1) *(y-1) *(y-1) * (y-1))\n",
        "\n",
        "\n",
        "def pr_y(y):\n",
        "  return (1/np.sqrt(2*np.pi)) * np.exp(-0.5 * y * y)\n",
        "\n",
        "fig,ax = plt.subplots()\n",
        "y = np.arange(-10,10,0.01)\n",
        "ax.plot(y, f(y), 'r-', label='f$[y]$');\n",
        "ax.plot(y, pr_y(y),'b-',label='$Pr(y)$')\n",
        "ax.set_xlabel(\"$y$\")\n",
        "ax.legend()\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LGAKHjUJnWmy"
      },
      "outputs": [],
      "source": [
        "def compute_expectation(n_samples):\n",
        "  # TODO -- compute this expectation\n",
        "  # 1. Generate samples y_i using np.random.normal\n",
        "  # 2. Approximate the expectation of f[y]\n",
        "  # Replace this line\n",
        "  expectation = 0\n",
        "\n",
        "\n",
        "  return expectation"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "nmvixMqgodIP"
      },
      "outputs": [],
      "source": [
        "# Set the seed so the random numbers are all the same\n",
        "np.random.seed(0)\n",
        "\n",
        "# Compute the expectation  with a very large number of samples (good estimate)\n",
        "n_samples = 100000000\n",
        "expected_f= compute_expectation(n_samples)\n",
        "print(\"Your value: \", expected_f, \", True value:  0.43160702267383166\")"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "Jr4UPcqmnXCS"
      },
      "source": [
        "Let's investigate how the variance of this approximation decreases as we increase the number of samples $N$.\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "yrDp1ILUo08j"
      },
      "outputs": [],
      "source": [
        "def compute_mean_variance(n_sample):\n",
        "  n_estimate = 10000\n",
        "  estimates = np.zeros((n_estimate,1))\n",
        "  for i in range(n_estimate):\n",
        "    estimates[i] = compute_expectation(n_sample.astype(int))\n",
        "  return np.mean(estimates), np.var(estimates)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "BcUVsodtqdey"
      },
      "outputs": [],
      "source": [
        "# Compute the mean and variance for 1,2,... 20 samples\n",
        "n_sample_all = np.array([1.,2,3,4,5,6,7,8,9,10,15,20,25,30,45,50,60,70,80,90,100,150,200,250,300,350,400,450,500])\n",
        "mean_all = np.zeros_like(n_sample_all)\n",
        "variance_all = np.zeros_like(n_sample_all)\n",
        "for i in range(len(n_sample_all)):\n",
        "  print(\"Computing mean and variance for expectation with %d samples\"%(n_sample_all[i]))\n",
        "  mean_all[i],variance_all[i] = compute_mean_variance(n_sample_all[i])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "feXmyk0krpUi"
      },
      "outputs": [],
      "source": [
        "fig,ax = plt.subplots()\n",
        "ax.semilogx(n_sample_all, mean_all,'r-',label='mean estimate')\n",
        "ax.fill_between(n_sample_all, mean_all-2*np.sqrt(variance_all), mean_all+2*np.sqrt(variance_all))\n",
        "ax.set_xlabel(\"Number of samples\")\n",
        "ax.set_ylabel(\"Mean of estimate\")\n",
        "ax.plot([0,500],[0.43160702267383166,0.43160702267383166],'k--',label='true value')\n",
        "ax.legend()\n",
        "plt.show()\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "XTUpxFlSuOl7"
      },
      "source": [
        "As you might expect, the more samples that we use to compute the approximate estimate, the lower the variance of the estimate."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "6hxsl3Pxo1TT"
      },
      "source": [
        " Now consider the function\n",
        " \\begin{equation}\n",
        " \\mbox{f}[y]= 20.446\\exp\\left[-(y-3)^4\\right],\n",
        " \\end{equation}\n",
        "\n",
        "which decreases rapidly as we move away from the position $y=3$."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "znydVPW7sL4P"
      },
      "outputs": [],
      "source": [
        "def f2(y):\n",
        "  return 20.446*np.exp(- (y-3) *(y-3) *(y-3) * (y-3))\n",
        "\n",
        "fig,ax = plt.subplots()\n",
        "y = np.arange(-10,10,0.01)\n",
        "ax.plot(y, f2(y), 'r-', label='f$[y]$');\n",
        "ax.plot(y, pr_y(y),'b-',label='$Pr(y)$')\n",
        "ax.set_xlabel(\"$y$\")\n",
        "ax.legend()\n",
        "plt.show()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "G9Xxo0OJsIqD"
      },
      "source": [
        "Let's again, compute the expectation:\n",
        "\n",
        "\\begin{align}\n",
        "\\mathbb{E}_{y}\\left[\\text{f}[y]\\right] &=& \\int \\text{f}[y] Pr(y) dy\\\\\n",
        "&\\approx& \\frac{1}{I} \\text{f}[y]\n",
        "\\end{align}\n",
        "\n",
        "where $Pr(y)=\\text{Norm}_y[0,1]$ by approximating with samples $y_{i}$.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "l8ZtmkA2vH4y"
      },
      "outputs": [],
      "source": [
        "def compute_expectation2(n_samples):\n",
        "  y = np.random.normal(size=(n_samples,1))\n",
        "  expectation = np.mean(f2(y))\n",
        "\n",
        "  return expectation"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dfUQyJ-svZ6F"
      },
      "outputs": [],
      "source": [
        "# Set the seed so the random numbers are all the same\n",
        "np.random.seed(0)\n",
        "\n",
        "# Compute the expectation with a very large number of samples (good estimate)\n",
        "n_samples = 100000000\n",
        "expected_f2= compute_expectation2(n_samples)\n",
        "print(\"Expected value: \", expected_f2)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "2sVDqP0BvxqM"
      },
      "source": [
        "I deliberately chose this function, because it's expectation is roughly the same as for the previous function.\n",
        "\n",
        "Again, let's look at the mean and the  variance of the estimates"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mHnILRkOv0Ir"
      },
      "outputs": [],
      "source": [
        "def compute_mean_variance2(n_sample):\n",
        "  n_estimate = 10000\n",
        "  estimates = np.zeros((n_estimate,1))\n",
        "  for i in range(n_estimate):\n",
        "    estimates[i] = compute_expectation2(n_sample.astype(int))\n",
        "  return np.mean(estimates), np.var(estimates)\n",
        "\n",
        "# Compute the variance for 1,2,... 20 samples\n",
        "mean_all2 = np.zeros_like(n_sample_all)\n",
        "variance_all2 = np.zeros_like(n_sample_all)\n",
        "for i in range(len(n_sample_all)):\n",
        "  print(\"Computing variance for expectation with %d samples\"%(n_sample_all[i]))\n",
        "  mean_all2[i], variance_all2[i] = compute_mean_variance2(n_sample_all[i])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "FkCX-hxxAnsw"
      },
      "outputs": [],
      "source": [
        "fig,ax1 = plt.subplots()\n",
        "ax1.semilogx(n_sample_all, mean_all,'r-',label='mean estimate')\n",
        "ax1.fill_between(n_sample_all, mean_all-2*np.sqrt(variance_all), mean_all+2*np.sqrt(variance_all))\n",
        "ax1.set_xlabel(\"Number of samples\")\n",
        "ax1.set_ylabel(\"Mean of estimate\")\n",
        "ax1.plot([1,500],[0.43160702267383166,0.43160702267383166],'k--',label='true value')\n",
        "ax1.set_ylim(-5,6)\n",
        "ax1.set_title(\"First function\")\n",
        "ax1.legend()\n",
        "\n",
        "fig2,ax2 = plt.subplots()\n",
        "ax2.semilogx(n_sample_all, mean_all2,'r-',label='mean estimate')\n",
        "ax2.fill_between(n_sample_all, mean_all2-2*np.sqrt(variance_all2), mean_all2+2*np.sqrt(variance_all2))\n",
        "ax2.set_xlabel(\"Number of samples\")\n",
        "ax2.set_ylabel(\"Mean of estimate\")\n",
        "ax2.plot([0,500],[0.43160428638892556,0.43160428638892556],'k--',label='true value')\n",
        "ax2.set_ylim(-5,6)\n",
        "ax2.set_title(\"Second function\")\n",
        "ax2.legend()\n",
        "plt.show()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "EtBP6NeLwZqz"
      },
      "source": [
        "You can see that the variance of the estimate of the second function is considerably worse than the estimate of the variance of estimate of the first function\n",
        "\n",
        "TODO:  Think about why this is."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "_wuF-NoQu1--"
      },
      "source": [
        " Now let's repeat this experiment with the second function, but this time use importance sampling with auxiliary distribution:\n",
        "\n",
        " \\begin{equation}\n",
        "   q(y)=\\text{Norm}_{y}[3,1]\n",
        " \\end{equation}\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "jPm0AVYVIDnn"
      },
      "outputs": [],
      "source": [
        "def q_y(y):\n",
        "  return (1/np.sqrt(2*np.pi)) * np.exp(-0.5 * (y-3) * (y-3))\n",
        "\n",
        "def compute_expectation2b(n_samples):\n",
        "  # TODO -- complete this function\n",
        "  # 1. Draw n_samples from auxiliary distribution\n",
        "  # 2. Compute f2[y] for those samples\n",
        "  # 3. Scale the results by pr_y / q_y\n",
        "  # 4. Compute the mean of these weighted samples\n",
        "  # Replace this line\n",
        "  expectation = 0\n",
        "\n",
        "  return expectation"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "No2ByVvOM2yQ"
      },
      "outputs": [],
      "source": [
        "# Set the seed so the random numbers are all the same\n",
        "np.random.seed(0)\n",
        "\n",
        "# Compute the expectation with a very large number of samples (good estimate)\n",
        "n_samples = 100000000\n",
        "expected_f2= compute_expectation2b(n_samples)\n",
        "print(\"Your value: \", expected_f2,\", True value:  0.43163734204459125 \")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6v8Jc7z4M3Mk"
      },
      "outputs": [],
      "source": [
        "def compute_mean_variance2b(n_sample):\n",
        "  n_estimate = 10000\n",
        "  estimates = np.zeros((n_estimate,1))\n",
        "  for i in range(n_estimate):\n",
        "    estimates[i] = compute_expectation2b(n_sample.astype(int))\n",
        "  return np.mean(estimates), np.var(estimates)\n",
        "\n",
        "# Compute the variance for 1,2,... 20 samples\n",
        "mean_all2b = np.zeros_like(n_sample_all)\n",
        "variance_all2b = np.zeros_like(n_sample_all)\n",
        "for i in range(len(n_sample_all)):\n",
        "  print(\"Computing variance for expectation with %d samples\"%(n_sample_all[i]))\n",
        "  mean_all2b[i], variance_all2b[i] = compute_mean_variance2b(n_sample_all[i])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "C0beD4sNNM3L"
      },
      "outputs": [],
      "source": [
        "fig,ax1 = plt.subplots()\n",
        "ax1.semilogx(n_sample_all, mean_all,'r-',label='mean estimate')\n",
        "ax1.fill_between(n_sample_all, mean_all-2*np.sqrt(variance_all), mean_all+2*np.sqrt(variance_all))\n",
        "ax1.set_xlabel(\"Number of samples\")\n",
        "ax1.set_ylabel(\"Mean of estimate\")\n",
        "ax1.plot([1,500],[0.43160702267383166,0.43160702267383166],'k--',label='true value')\n",
        "ax1.set_ylim(-5,6)\n",
        "ax1.set_title(\"First function\")\n",
        "ax1.legend()\n",
        "\n",
        "fig2,ax2 = plt.subplots()\n",
        "ax2.semilogx(n_sample_all, mean_all2,'r-',label='mean estimate')\n",
        "ax2.fill_between(n_sample_all, mean_all2-2*np.sqrt(variance_all2), mean_all2+2*np.sqrt(variance_all2))\n",
        "ax2.set_xlabel(\"Number of samples\")\n",
        "ax2.set_ylabel(\"Mean of estimate\")\n",
        "ax2.plot([0,500],[0.43160428638892556,0.43160428638892556],'k--',label='true value')\n",
        "ax2.set_ylim(-5,6)\n",
        "ax2.set_title(\"Second function\")\n",
        "ax2.legend()\n",
        "plt.show()\n",
        "\n",
        "fig2,ax2 = plt.subplots()\n",
        "ax2.semilogx(n_sample_all, mean_all2b,'r-',label='mean estimate')\n",
        "ax2.fill_between(n_sample_all, mean_all2b-2*np.sqrt(variance_all2b), mean_all2b+2*np.sqrt(variance_all2b))\n",
        "ax2.set_xlabel(\"Number of samples\")\n",
        "ax2.set_ylabel(\"Mean of estimate\")\n",
        "ax2.plot([0,500],[ 0.43163734204459125, 0.43163734204459125],'k--',label='true value')\n",
        "ax2.set_ylim(-5,6)\n",
        "ax2.set_title(\"Second function with importance sampling\")\n",
        "ax2.legend()\n",
        "plt.show()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "y8rgge9MNiOc"
      },
      "source": [
        "You can see that the importance sampling technique has reduced the amount of variance for any given number of samples."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "authorship_tag": "ABX9TyNecz9/CDOggPSmy1LjT/Dv",
      "include_colab_link": true,
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}