#!/usr/bin/env python
# coding: utf-8

# # Chapter 5: Bayesian Model fitting I

# # Exercises

# ## Exercise 1 (20 points)
# 
# We observe $r$ heads in $n$ tosses of a coin, and we adopt a beta prior with parameters $(\alpha_{prior}, \beta_{prior})$ for the probability of heads, $p$.
# What is the expectation value (the mean) of $p$, and what is its most likely value (the mode), for
# 
# (a) the general case of this beta prior, and
# 
# (b) a uniform prior (a special case of the beta prior, as explained in the script.)
# 
# What form does the posterior PDF have if you have no data, and what is its mean and mode, for
# 
# (c) the general case of this beta prior, and
# 
# (d) the special case of the uniform prior. 
# 
# (e) Discuss briefly whether the mean or the mode is to be preferred as an estimator.
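
# A numerical cross-check can be useful once you have analytic expressions for the mean and the mode. The sketch below is not part of the original exercise: it estimates both quantities for a Beta density evaluated on a grid, using only NumPy. The helper name `beta_mean_and_mode_on_grid` and the example parameters (3, 5) are illustrative assumptions, and the grid approach assumes both parameters exceed 1 so that the density is finite and has a single interior peak.

```python
import numpy as np


def beta_mean_and_mode_on_grid(alpha, beta_param, n_grid=100_001):
    """Numerically estimate the mean and mode of Beta(alpha, beta_param).

    Assumes alpha > 1 and beta_param > 1, so the density is finite
    everywhere on [0, 1] and has a single interior peak.
    """
    p = np.linspace(0.0, 1.0, n_grid)
    dp = p[1] - p[0]
    # Unnormalised Beta density: p^(alpha - 1) * (1 - p)^(beta - 1)
    unnorm = p ** (alpha - 1.0) * (1.0 - p) ** (beta_param - 1.0)
    # Normalise numerically on the grid instead of using the Beta function
    pdf = unnorm / (np.sum(unnorm) * dp)
    mean = np.sum(p * pdf) * dp
    mode = p[np.argmax(pdf)]
    return mean, mode


# Illustrative parameters only (not taken from the exercise)
mean_p, mode_p = beta_mean_and_mode_on_grid(3.0, 5.0)
print(f"mean = {mean_p:.4f}, mode = {mode_p:.4f}")
```

# Comparing the returned values with your analytic expressions for a few parameter choices is a simple way to catch algebra slips.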

# ## Exercise 2 (20 points)

# A production chain of electronic components produces items that have a probability $p$ of being defective. The chain manager does not know $p$ but, from past experience and quality assessments, expects this probability to be equal to $4\%$. In addition, the manager is uncertain about this value and assumes a standard deviation of $2\%$.
# 
# The manager comes to you, an expert in statistics, and after discussion you convince the manager to use a Beta distribution to model the uncertainty about $p$.
# 
# 1) Explain whether you think a Beta prior is reasonable, and why.
# 
# 2) To what values must you set the two parameters of the distribution in order to match the manager's prior beliefs about the expected value and the standard deviation of $p$?
# 
# After choosing the parameters of the Beta distribution so as to represent the prior beliefs about the probability of producing a defective item, the manager now wants to get a better estimate of the posterior PDF of $p$ by observing new data. The manager decides to inspect a production lot of $100$ items, and finds that $3$ of the items in the lot are defective.
# 
# 3) How should the manager change the parameters of the Beta distribution in order to take this new information into account?
# 
# 4) After updating the parameters of the Beta distribution, what are the new expected value and the new standard deviation of the probability of finding a defective item?
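
# One way to sanity-check a candidate pair of Beta parameters, before or after the update, is to draw samples and compare the sample mean and standard deviation with the targets. The sketch below is an illustration rather than a prescribed method: the helper name `sampled_beta_moments`, the sample size, the seed, and the example parameters (2, 2) are all assumptions made for the example.

```python
import numpy as np


def sampled_beta_moments(alpha, beta_param, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the mean and standard deviation of Beta(alpha, beta_param)."""
    rng = np.random.default_rng(seed)
    samples = rng.beta(alpha, beta_param, size=n_samples)
    return samples.mean(), samples.std(ddof=1)


# Illustrative parameters only; replace them with your own candidate values
mean_est, std_est = sampled_beta_moments(2.0, 2.0)
print(f"sample mean = {mean_est:.4f}, sample std = {std_est:.4f}")
```

# The estimates carry Monte Carlo noise, so agreement with the targets should be judged to two or three significant figures at this sample size.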

# ## Exercise 3 (20 points)
# 
# You have a 3D posterior PDF, $p(\omega, \mu, v_r)$, over the parallax, $\omega$, proper motion, $\mu$, and
# radial velocity, $v_r$, of a star. Write down an expression for this posterior in terms of the
# distance, $r$, tangential velocity, $v_t$, and radial velocity, $v_r$, where $r = 1/\omega$ and $v_t = a\,\mu/\omega$, with $a$ a known constant. Check your result by checking the dimensions (probability is dimensionless).
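
# A one-dimensional toy version of a change of variables can be checked numerically before tackling the 3D case. The sketch below assumes, purely for illustration, that $\omega$ is uniform on $(1, 2)$, so that $r = 1/\omega$ lies in $(0.5, 1)$; this toy density, the helper name `density_of_r`, and the threshold 0.75 are not part of the exercise. It compares a probability computed from the transformed density with the same probability estimated from transformed samples.

```python
import numpy as np


def density_of_r(r):
    """Density of r = 1/omega when omega ~ Uniform(1, 2) (toy assumption)."""
    r = np.asarray(r, dtype=float)
    inside = (r >= 0.5) & (r <= 1.0)
    # p_r(r) = p_omega(1/r) * |d omega / d r| = 1 * (1 / r**2) inside the support
    return np.where(inside, 1.0 / r**2, 0.0)


# Monte Carlo: transform samples of omega directly
rng = np.random.default_rng(0)
omega = rng.uniform(1.0, 2.0, size=200_000)
r_samples = 1.0 / omega
mc_prob = np.mean(r_samples < 0.75)

# Same probability from the transformed density, via the trapezoidal rule
r_grid = np.linspace(0.5, 0.75, 2001)
f = density_of_r(r_grid)
analytic_prob = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(r_grid))

print(f"P(r < 0.75): samples = {mc_prob:.4f}, density = {analytic_prob:.4f}")
```

# If the Jacobian factor were omitted, the two numbers would disagree, which makes this a handy test pattern for the full 3D transformation.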

# ## Exercise 4 (40 points)
# 
# 
# A lighthouse is somewhere off a piece of straight coastline at a position $x$ along the shore and a distance $y$ out at sea. It emits short flashes at random times and hence at random azimuths, $\theta$, from its position, i.e., $p(\theta) = \mathrm{constant}$. You see these flashes on the coast while walking along the shore. You record your position $D_k$ along the coastline at the instant you see a flash, but you do not record the direction the flash came from.
# 
# * You will find the data, $\{D_k\}$, in the file `lighthouse.dat`. 
# * All distances are assumed to be in km.
# 
# To limit the space to explore, we assume that the lighthouse lies somewhere in the rectangle $-2 < x < 2$, $0 < y < 2$, with a uniform prior, where $x=0$ coincides with $D=0$. The origin of the distance $y$ is at the coast, and the box of interest is centred along the coastline.
# 
# Below is a schematic to help you understand the situation.

# In[4]:


from IPython.display import Image
Image('lh_schema.png')


# **Schematic**: A lighthouse at position $(x,y)$ emits $n$ flashes, observed at positions $D_k$ on the coast (indicated by star symbols). Our prior restricts the lighthouse to a rectangular area aligned with the coast.

# 1. Work out an analytic expression for the posterior probability distribution of the position of the lighthouse, $(x,y)$. State your assumptions at each of the following guided steps:
#     1. Given the geometry of the problem, write down the probability density of the azimuth angle, $\theta_k$, at which the $k$-th flash is emitted, for flashes that can be seen from the coast, $p(\theta_k|x, y)$.
# 
#     2. Work out the relation of $\theta_k$ with $D_k$ through simple geometric considerations.
# 
#     3. Deduce an analytic expression for the probability of seeing a flash at $D_k$ given the (unknown) position of the lighthouse (i.e. the likelihood).
# 
#     4. Finally, write down the posterior probability distribution of the position $(x,y)$ of the lighthouse given the ensemble of the measurements $\{D_k\}$. Give an expression for your normalization constant (but you don't need to do the integration).
#     
# 2. Inferring the position of the lighthouse from the data involves the estimation of both $x$ and $y$. The full procedure is beyond the scope of this chapter, and the methods will be covered in later chapters. We will therefore assume that the position along the coast is known, $x=1.25$, which reduces the problem to a single-parameter example. State your assumptions at each of the following guided steps to find the distance between the coast and the lighthouse:
# 
#     1. Write down the posterior distribution $p(y | \{D_k\}, x)$, and differentiate it with respect to $y$. This leads to a condition that is not easily solvable analytically. However, there is nothing to stop us tackling the problem numerically.
# 
#     2. The most straightforward method is to use brute force and ignorance: gridding the values. Grid the 1D space of $y$ allowed by your prior and compute the posterior value $p(y |\{D_k\}, x)$ at each point. The step size influences the result, so make a sensible choice. The normalization of the posterior will be computed numerically on that grid. Plot the posterior. Locate and report the position of the maximum of this function, $y_{max}$.
#     
#     3. Numerically compute the expectation value, $E[y]$, and the standard deviation, $\sigma_y$, of the posterior. On top of your posterior distribution of $y$, plot a Gaussian of mean $E[y]$ and standard deviation $\sigma_y$. Does this agree with $y_{max}$, given the width of the distribution?
#     
#     4. Make a quadratic approximation of the posterior distribution $p(y |\{D_k\}, x)$ around the peak value $(x, y_{max})$. Show your working and report the parameters of this approximation. Compare your result with those from the previous question (2.3). Why might they differ?
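
# The numerical steps of part 2 (gridding, normalising, locating the peak, computing moments, and a quadratic approximation) can be sketched on a toy problem before applying them to the lighthouse data. The code below is a minimal sketch under stated assumptions: the helper name `summarize_grid_posterior` is hypothetical, the grid is assumed uniform, and the log-posterior used here is a stand-in Gaussian, not the lighthouse likelihood, which is left for you to derive.

```python
import numpy as np


def summarize_grid_posterior(log_post, y_grid):
    """Normalise an unnormalised log-posterior on a uniform grid and summarise it.

    Returns the peak location, mean, standard deviation, and the width of a
    quadratic (Gaussian) approximation around the peak.
    """
    log_post = np.asarray(log_post, dtype=float)
    y_grid = np.asarray(y_grid, dtype=float)
    dy = y_grid[1] - y_grid[0]

    # Subtract the maximum before exponentiating to avoid numerical underflow
    post = np.exp(log_post - log_post.max())
    post /= post.sum() * dy  # Riemann-sum normalisation on the grid

    i_max = int(np.argmax(post))
    if i_max == 0 or i_max == len(y_grid) - 1:
        raise ValueError("peak lies on the grid edge; widen or refine the grid")

    mean = np.sum(y_grid * post) * dy
    std = np.sqrt(np.sum((y_grid - mean) ** 2 * post) * dy)

    # Central finite difference for the second derivative of the log-posterior
    curvature = (log_post[i_max + 1] - 2.0 * log_post[i_max] + log_post[i_max - 1]) / dy**2
    sigma_quad = 1.0 / np.sqrt(-curvature)

    return {"y_max": y_grid[i_max], "mean": mean, "std": std, "sigma_quad": sigma_quad}


# Toy stand-in for the log-posterior (a Gaussian in y); NOT the lighthouse likelihood
y_grid = np.linspace(0.0, 2.0, 2001)
toy_log_post = -0.5 * ((y_grid - 1.0) / 0.2) ** 2

summary = summarize_grid_posterior(toy_log_post, y_grid)
print(summary)
```

# Working with the log-posterior keeps the product over many flashes numerically stable. For a Gaussian toy case the moment-based and quadratic widths coincide; for an asymmetric posterior they generally do not.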