Bayesian Inference and Uncertainty Quantification for Inverse Problems
Matt Moores (@MooresMt) https://uow.edu.au/~mmoores/
Centre for Environmental Informatics (CEI), University of Wollongong, NSW, Australia
TIDE Hub Seminar Series
joint with Matthew Berry, Mark Nelson, Brian Monaghan and Raymond Longbottom
Introduction
Physical model $F(t; \vec{\theta})$
Unknown parameters $\vec{\theta} = \{\theta_1, \ldots, \theta_k\}$
Initial conditions $F_0$ and derivatives $\partial F / \partial t$, etc.
Observed data $y_t$ at times $t = 1, \ldots, T$:
$$y_t = F(t; \vec{\theta}) + \varepsilon_t$$
where $\varepsilon_t$ is random noise.
Example
Growth curve: $\frac{dF}{dt} = -\beta \gamma^t \log \gamma$
Solution: $F(t_i; \vec{\theta}) = \alpha - \beta \gamma^{t_i}$ at time points $t_1, \ldots, t_n$, with initial condition $F_0 = \alpha - \beta$.
Noisy observations: $y_i = F(t_i; \alpha, \beta, \gamma) + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$ is additive Gaussian noise.
Unknown parameters $\vec{\theta} = \{\alpha, \beta, \gamma, \sigma^2\}$
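As a rough illustration, the R sketch below simulates noisy observations from this growth curve. The parameter values and observation times are made up for the example, not estimates from any data set.

## Simulate noisy observations from the growth curve F(t) = alpha - beta * gamma^t.
## Parameter values and observation times are illustrative only.
set.seed(42)
alpha <- 2.7; beta <- 1.0; gamma <- 0.87; sigma <- 0.1
t <- 1:30                                  # observation times
F_t <- alpha - beta * gamma^t              # solution of the ODE at times t
y <- F_t + rnorm(length(t), 0, sigma)      # additive Gaussian noise
plot(t, y); lines(t, F_t)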
Observed Data for n = 27 dugongs (figure)
Nonlinear Least Squares

nlm <- nls(y ~ alpha - beta * gamm^t, data=dat,
           start=list(alpha=1, beta=1, gamm=0.9))
summary(nlm)
##
## Formula: y ~ alpha - beta * gamm^t
##
## Parameters:
##       Estimate Std. Error t value Pr(>|t|)
## alpha  2.65807    0.06151   43.21  < 2e-16 ***
## beta   0.96352    0.06968   13.83  6.3e-13 ***
## gamm   0.87146    0.02460   35.42  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09525 on 24 degrees of freedom
Drawbacks
No guarantee of convexity: will only find the nearest local optimum, so the results are completely dependent on the initialization.
Standard errors are underestimated: 95% profile CIs for α and β only achieve ~83% coverage.
Inverse Problem
Find parameters $\vec{\theta}$ consistent with $\vec{y}$.
The physical model $F: \Theta \to \mathcal{Y}$ doesn't need to be invertible:
although lack of identifiability for $\vec{\theta}$ can create problems
we don't need $F$ in closed form; a numeric solver for the DE can be used
The likelihood $L(\vec{y} \mid \vec{\theta})$ is based on $E[Y_i \mid \vec{\theta}] = F(t_i; \vec{\theta})$, as well as the distribution of the random noise (not necessarily Gaussian).
Bayesian Inference
We assign prior distributions to the parameters $\vec{\theta}$:
$$\pi(\log\{\alpha\}) \sim N(0, 100) \qquad \pi(\log\{\beta\}) \sim N(0, 100) \qquad \pi(\mathrm{logit}\{\gamma\}) \sim N(0, 100) \qquad \pi\!\left(\tfrac{1}{\sigma^2}\right) \sim \mathrm{Ga}\!\left(\tfrac{\nu_0}{2}, \tfrac{2}{\nu_0 s_0^2}\right)$$
Then the joint posterior distribution is:
$$\pi(\vec{\theta} \mid \vec{y}) = \frac{L(\vec{y} \mid \vec{\theta})\,\pi(\vec{\theta})}{\pi(\vec{y})}$$
where the normalising constant
$$\pi(\vec{y}) = \int_0^\infty\!\int_0^\infty\!\int_0^\infty\!\int_0^1 L(\vec{y} \mid \alpha, \beta, \gamma, \sigma)\, d\pi(\gamma)\, d\pi(\beta)\, d\pi(\alpha)\, d\pi(\sigma)$$
is intractable.
Astfalck, Cripps, Gosling, Hodkiewicz & Milne (2018) Ocean Engineering 161: 268–276.
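As a minimal sketch of how this posterior can be evaluated (up to the intractable constant), the R function below computes the unnormalised log-posterior on the transformed parameter scale. The hyperparameter values nu0 and s0sq are placeholders, not values from the talk.

## Unnormalised log-posterior for the growth-curve model, evaluated on the
## transformed scale: theta = (log alpha, log beta, logit gamma), tau = 1/sigma^2.
## The hyperparameters nu0 and s0sq are illustrative placeholders.
log_posterior <- function(theta, tau, t, y, nu0 = 1, s0sq = 1) {
  alpha <- exp(theta[1]); beta <- exp(theta[2]); gamma <- plogis(theta[3])
  mu <- alpha - beta * gamma^t                     # forward model F(t; theta)
  loglik <- sum(dnorm(y, mean = mu, sd = 1/sqrt(tau), log = TRUE))
  logprior <- sum(dnorm(theta, mean = 0, sd = sqrt(100), log = TRUE)) +
    dgamma(tau, shape = nu0/2, scale = 2/(nu0 * s0sq), log = TRUE)
  loglik + logprior
}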
Markov Chain Monte Carlo
Algorithm 1: Random-walk Metropolis sampler for $\pi(\vec{\theta} \mid \vec{y})$
1: Initialize $\alpha_0 \sim \pi(\log\{\alpha\})$, $\beta_0 \sim \pi(\log\{\beta\})$, $\gamma_0 \sim \pi(\mathrm{logit}\{\gamma\})$, $\sigma_0 \sim \pi(1/\sigma^2)$
2: Solve $\vec{\mu}_0 = \vec{F}(\vec{t}; \alpha_0, \beta_0, \gamma_0)$ and calculate $\ell_0 = \log L(\vec{y} \mid \vec{\mu}_0, \sigma_0^2)$
3: for iterations $j = 1, \ldots, J$ do
4:   Propose $\log\{\alpha^*\} \sim N(\log\{\alpha_{j-1}\}, \sigma_\alpha^2)$, $\log\{\beta^*\} \sim N(\log\{\beta_{j-1}\}, \sigma_\beta^2)$, $\mathrm{logit}\{\gamma^*\} \sim N(\mathrm{logit}\{\gamma_{j-1}\}, \sigma_\gamma^2)$
5:   Solve $\vec{\mu}^* = \vec{F}(\vec{t}; \alpha^*, \beta^*, \gamma^*)$
6:   Propose $\sigma^* \sim \pi(1/\sigma^2 \mid \vec{\mu}^*, \vec{y})$ and calculate $\ell^* = \log L(\vec{y} \mid \vec{\mu}^*, \sigma_*^2)$
7:   Calculate $\rho_t = \dfrac{\exp\{\ell^*\}\,\pi(\vec{\theta}^*)\,q(\vec{\theta}_{j-1} \mid \vec{\theta}^*)}{\exp\{\ell_{j-1}\}\,\pi(\vec{\theta}_{j-1})\,q(\vec{\theta}^* \mid \vec{\theta}_{j-1})}$
8:   Accept $\vec{\theta}_j = \vec{\theta}^*$ with probability $\min\{\rho_t, 1\}$, else $\vec{\theta}_j = \vec{\theta}_{j-1}$
9: end for
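The R sketch below implements a simplified variant of Algorithm 1: random-walk Metropolis proposals for (log α, log β, logit γ) combined with a Gibbs update of the precision 1/σ² from its conjugate full conditional, rather than the conditional proposal for σ used on the slide. It reuses the log_posterior() function sketched above; the step sizes, starting values and hyperparameters are illustrative tuning choices.

## Simplified Metropolis-within-Gibbs sampler for the growth-curve model.
## Random-walk proposals on (log alpha, log beta, logit gamma); the precision
## tau = 1/sigma^2 is drawn from its conjugate full conditional.
rw_metropolis <- function(t, y, J = 10000, step = c(0.02, 0.05, 0.1),
                          nu0 = 1, s0sq = 1) {
  n <- length(y)
  theta <- c(log(1), log(1), qlogis(0.9))        # cf. the nls() starting values
  tau <- 1 / var(y)
  draws <- matrix(NA_real_, J, 4,
                  dimnames = list(NULL, c("alpha", "beta", "gamma", "sigma2")))
  lp <- log_posterior(theta, tau, t, y, nu0, s0sq)
  for (j in 1:J) {
    ## symmetric random-walk proposal on the transformed scale, so q() cancels
    theta_star <- rnorm(3, mean = theta, sd = step)
    lp_star <- log_posterior(theta_star, tau, t, y, nu0, s0sq)
    if (log(runif(1)) < lp_star - lp) {
      theta <- theta_star
    }
    ## Gibbs update of tau = 1/sigma^2 given the current fitted curve
    mu <- exp(theta[1]) - exp(theta[2]) * plogis(theta[3])^t
    tau <- rgamma(1, shape = (nu0 + n) / 2,
                  rate = (nu0 * s0sq + sum((y - mu)^2)) / 2)
    lp <- log_posterior(theta, tau, t, y, nu0, s0sq)
    draws[j, ] <- c(exp(theta[1]), exp(theta[2]), plogis(theta[3]), 1 / tau)
  }
  draws
}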
Markov Chains (figure)
Posterior Distributions (figure)
95% Predictive Interval (figure)
Thermogravimetric Analysis
The TGA model for a single reaction involves the Arrhenius equations:
$$\frac{dM}{dt} = -M A \exp\left(-\frac{E}{RT}\right) \qquad \frac{dT}{dt} = \alpha$$
where $M$ is the mass fraction, $T$ is the temperature, and $R$ is the ideal gas constant. The initial mass $M_0$, the initial temperature $T_0$, and the heating rate $\alpha$ are experimentally controlled.
The unknown parameters $\vec{\theta}$ are:
$A$, the pre-exponential factor
$E$, the activation energy
$\sigma^2$, the variance of the additive Gaussian noise
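A minimal sketch of the forward model: solving these two equations numerically in R with the deSolve package. The parameter values, initial conditions and time grid below are illustrative placeholders, not the values used in the experiments.

## Solve the single-reaction TGA equations numerically (forward model).
## Requires the deSolve package; all numeric values are illustrative only.
library(deSolve)

tga_rhs <- function(t, state, parms) {
  with(as.list(c(state, parms)), {
    dM <- -M * A * exp(-E / (R * Temp))   # Arrhenius mass-loss rate
    dTemp <- heat_rate                    # constant heating rate
    list(c(dM, dTemp))
  })
}

parms <- c(A = 3e6, E = 85000, R = 8.314, heat_rate = 10 / 60)  # illustrative
state <- c(M = 1, Temp = 300)            # initial mass fraction and temperature (K)
times <- seq(0, 3000, by = 10)           # time grid in seconds
out <- ode(y = state, times = times, func = tga_rhs, parms = parms, method = "rk4")
plot(out)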
Reparameterisation
We have good prior information for $E$ (physically interpretable) and $\sigma^2$ (measurement noise), but not for $A$. The distributions of $A$ and $E$ are also very highly correlated, which slows the mixing of the MCMC.
Instead, we reparameterise the model in terms of the temperature $T_m$ at which the rate $\frac{dM}{dt}$ is maximised:
$$A \exp\left(-\frac{E}{R T_m}\right) = \frac{E}{R T_m^2}\,\alpha$$
In our MCMC algorithm, we first propose $q(E^* \mid E_{j-1})$, then propose $q(T_m^* \mid T_{m,j-1})$. The value of $A^*$ can then be obtained from the equation above. We solve $\vec{F}(\vec{t}; E^*, A^*)$ numerically using a Runge–Kutta method.
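Rearranging that identity gives $A = \frac{E\,\alpha}{R T_m^2} \exp\{E/(R T_m)\}$, which is straightforward to code; a small R helper is sketched below, with illustrative argument values.

## Recover the pre-exponential factor A from proposed values of E and Tm,
## by rearranging A exp(-E/(R*Tm)) = (E/(R*Tm^2)) * alpha.
## heat_rate is the controlled heating rate alpha; example values are illustrative.
A_from_E_Tm <- function(E, Tm, heat_rate, R = 8.314) {
  (E * heat_rate / (R * Tm^2)) * exp(E / (R * Tm))
}
A_star <- A_from_E_Tm(E = 85000, Tm = 635, heat_rate = 10 / 60)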
Posterior Distributions (figure: marginal histograms of Tm, E and A)
Posterior for Functions of $\vec{\theta}$
A function of a random variable is also a random variable. Now that we have random samples from our posterior $\pi(A, E, T_m, \sigma^2 \mid \vec{y})$, we can use our model to obtain predictions for any measurable function of the parameters, $g(A, E, T_m)$.
For example, the critical length of a stockpile:
$$L_{cr} = g(A, E, T_m) = K \sqrt{\frac{\exp\left\{\frac{E}{R T_m^2}\right\}}{A} \cdot \frac{R T_m^2}{E}}$$
$$E[g(A, E, T_m) \mid \vec{y}] = \int_0^{1000}\!\!\int_0^\infty\!\!\int_0^\infty g(A, E, T_m)\, d\pi(A, E, T_m \mid \vec{y}) \approx \frac{1}{J}\sum_{j=1}^{J} g(A_j, E_j, T_j)$$
where $K$ is a constant and $\{A_j, E_j, T_j\}_{j=1}^{J}$ are the MCMC samples (after discarding burn-in).
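A sketch of that Monte Carlo estimate in R: here `samples` is assumed to be a data frame of post-burn-in MCMC draws with columns A, E and Tm, and the constant K is left as a placeholder.

## Posterior for the critical length L_cr, obtained by pushing the MCMC draws
## through g(). `samples` (data frame with columns A, E, Tm) and the value of K
## are assumptions for this sketch.
g_Lcr <- function(A, E, Tm, K = 1, R = 8.314) {
  K * sqrt(exp(E / (R * Tm^2)) / A * (R * Tm^2 / E))
}
Lcr <- with(samples, g_Lcr(A, E, Tm))
mean(Lcr)                        # Monte Carlo estimate of E[g(A, E, Tm) | y]
quantile(Lcr, c(0.025, 0.975))   # 95% credible interval
hist(Lcr)                        # posterior distribution of L_cr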
Posterior for $L_{cr}$ (figure)
Advanced Methods
Parallel and distributed computation
Sequential updating of $\pi(\vec{\theta} \mid \vec{y})$ and streaming inference
Spatio-temporal modelling of $F(x, y, z, t; \vec{\theta})$
Emulation of the forward map $\Theta \to \mathcal{Y}$ (e.g. using artificial neural networks)
Approximating the likelihood $L$ (e.g. ABC)
Accounting for multi-modality (e.g. ill-posed problems)
Accounting for error in the numerical solver (probabilistic numerics)
Accounting for model misspecification (discrepancy function)
Multi-level Monte Carlo methods
Hamiltonian Monte Carlo
Further Reading
R. J. Longbottom, B. J. Monaghan, D. J. Pinson, N. A. S. Webster & S. J. Chew (2020) In situ Phase Analysis during Self-sintering of BOS Filter Cake for Improved Recycling. ISIJ International 60(11): 2436–2445.
A. M. Stuart (2010) Inverse problems: A Bayesian perspective. Acta Numerica 19: 451–559.
L. C. Astfalck, E. J. Cripps, J. P. Gosling, M. R. Hodkiewicz & I. A. Milne (2018) Expert elicitation of directional metocean parameters. Ocean Engineering 161: 268–276.
B. P. Carlin & A. E. Gelfand (1991) An iterative Monte Carlo method for nonconjugate Bayesian analysis. Statistics and Computing 1(2): 119–128.
