cd(@__DIR__)
using Pkg
Pkg.activate("hw10")
using Statistics
using Distributions
using LinearAlgebra
using KernelDensity
using Plots
Activating project at `~/Documents/GitHub/STAT638_Applied-Bayes-Methods/hw/hw10`
Source code is shown here: https://stchiu.quarto.pub/stat638__hw10/
Pkg.status()
VERSION
Status `~/Documents/GitHub/STAT638_Applied-Bayes-Methods/hw/hw10/Project.toml`
[31c24e10] Distributions v0.25.79
[5ab0869b] KernelDensity v0.6.5
[91a5bcdd] Plots v1.36.3
[10745b16] Statistics
v"1.8.2"
Assume we have \(4\) observations, \((-1,0,1,10)\), where the last observation can be thought of as an outlier. Assume that conditional on an unknown parameter \(\theta\), the data are i.i.d. from some population distribution. Assume a standard normal prior for \(\theta\).
First, assume that the population distribution is a normal distribution with mean \(\theta\) and variance \(1\). Draw samples from the posterior of \(\theta\) using a Metropolis algorithm and also derive the exact posterior in closed form.
Exact
\[\theta \sim Normal(\mu=0, \tau^{2}=1)\] \[Y \sim Normal(\theta, \sigma^2 = 1)\]
\[\begin{align} P(\theta | y, \sigma^2) &\sim Normal(\mu_n, \tau_{n}^2) \end{align}\]
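Writing out the standard conjugate-normal update (with \(\sigma^2 = \tau^2 = 1\) and prior mean \(0\)), the posterior parameters are
\[\tau_n^2 = \frac{1}{n/\sigma^2 + 1/\tau^2} = \frac{1}{n+1}, \qquad \mu_n = \frac{(n/\sigma^2)\,\bar{y}}{n/\sigma^2 + 1/\tau^2} = \frac{n\bar{y}}{n+1}.\]
With \(n = 4\) and \(\bar{y} = 2.5\), this gives \(\mu_n = 2\) and \(\tau_n^2 = 0.2\), which is what the code below computes.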
ys = [-1., 0., 1., 10.]
θ = Normal()    # standard normal prior
k = 300000      # number of Metropolis iterations
δ² = 1.         # proposal spread

function sampling_ma(like_dist, θ, k, δ²)
    θs = zeros(k)
    th = 0.
    for i in 1:k
        θᵒ = th                 # current state
        J = Normal(θᵒ, δ²)      # symmetric proposal (Normal takes a standard deviation; with δ² = 1 this is moot)
        θʷ = rand(J)            # candidate
        r_log = sum(logpdf.( like_dist(θʷ, 1), ys) .- logpdf.( like_dist(θᵒ, 1), ys)) + logpdf(θ, θʷ) - logpdf(θ, θᵒ)
        # Accept
        u_log = log(rand(Uniform()))
        if r_log > u_log
            θs[i] = θʷ
        else
            θs[i] = θᵒ
        end
        th = θs[i]
    end
    return θs
end
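The log acceptance ratio computed in the loop above is the standard Metropolis ratio; because the proposal is symmetric, no Metropolis–Hastings correction term is needed:
\[\log r = \sum_{i=1}^{n}\left[\log f(y_i \mid \theta^{\ast}) - \log f(y_i \mid \theta^{(s)})\right] + \log p(\theta^{\ast}) - \log p(\theta^{(s)}),\]
where \(\theta^{\ast}\) is the candidate, \(\theta^{(s)}\) the current state, \(f\) the likelihood, and \(p\) the prior density.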
θs_n = sampling_ma(Normal, θ, k, δ²)
# Exact PDF
n = length(ys)
τₙ = 1 / (n/1 + 1/1)^0.5
μₙ = mean(ys) * (n/1) / (n/1 + 1/1) + 0
θ_exact = Normal(μₙ, τₙ)
xs = collect(-3:0.01:4)
ys_exact = pdf.(θ_exact, xs)
# Display
p = histogram(θs_n, normalize=:pdf, xlabel="θ", ylabel="P(θ|y)", label="Metropolis", title="Normal Model")
plot!(p, xs, ys_exact, label="Exact PDF")
Now assume the population distribution is a Cauchy distribution with location parameter \(\theta\) and scale \(1\). (This is equivalent to a nonstandardized t distribution with one degree of freedom and location parameter \(\theta\).) Draw samples from the posterior using the Metropolis algorithm.
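For reference, the \(Cauchy(\theta, 1)\) density that replaces the normal likelihood in the acceptance ratio is
\[f(y \mid \theta) = \frac{1}{\pi\left(1 + (y - \theta)^2\right)},\]
whose heavy tails make a single distant observation far less influential on the posterior than under the normal likelihood.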
θs_c = sampling_ma(Cauchy, θ, k, δ²)
p2 = histogram(θs_c, normalize=:pdf, xlabel="θ", ylabel="P(θ|y)", label="Metropolis", title="Cauchy Model")
display(p2)
Plot the exact posterior density from part \(1\), together with kernel density estimates from the two Metropolis samplers. Describe how the outlier has affected the posteriors.
The Cauchy likelihood is much less sensitive to the outlier than the normal likelihood. Under the normal model the posterior is centered at \(\mu_n = 2\), pulled strongly toward the outlier at \(10\); under the Cauchy model the heavy tails downweight the outlier, so the posterior concentrates near the bulk of the data around \(0\).
Un = kde(θs_n)
Uc = kde(θs_c)
pdf_n = [pdf(Un, x) for x in xs]
pdf_c = [pdf(Uc, x) for x in xs]
p3 = plot(xlabel="θ", ylabel="P(θ|y)", title="Pop. dist. of Cauchy and Normal")
plot!(p3, xs, pdf_n, label="Normal")
plot!(p3, xs, pdf_c, label="Cauchy")
plot!(p3, xs, ys_exact, label="Exact PDF (Normal)")
scatter!(p3, ys, zeros(length(ys)), label="data")
display(p3)