Minimum mean square error estimation and approximation of the Bayesian update

A. Litvinenko^1, H. G. Matthies^2, E. Zander^2
^1 CEMSE Division, KAUST; ^2 TU Braunschweig, Germany
alexander.litvinenko@kaust.edu.sa

Abstract

Given: a physical system modeled by a PDE or ODE with an uncertain coefficient $q(\omega)$, and a measurement operator $Y(u(q), q)$, where $u(q, \omega)$ is the uncertain solution.

Aim: to identify $q(\omega)$. The mapping from parameters to observations is usually not invertible, hence this inverse identification problem is generally ill-posed. To identify $q(\omega)$ we derive a non-linear Bayesian update from the variational problem associated with the conditional expectation. To reduce the cost of the Bayesian update we employ a functional approximation, e.g. a polynomial chaos expansion (PCE).

New: we derive linear, quadratic, etc. approximations of the full Bayesian update.

1. Bayesian updating and conditional expectation

Let the measurement operator be
\[ Y : \mathcal{Q} \times \mathcal{U} \to \mathcal{Y}, \qquad (q, u_k) \mapsto y_k = Y(q; u_k) \in \mathcal{Y}, \]
and the observation $z(\omega) = \hat{y} + \varepsilon(\omega)$, with random measurement error $\varepsilon$ and 'truth' $\hat{y}$. Bayes's theorem may be formulated as (Tarantola 2004, Ch. 1.5)
\[ \pi_q(q \mid z) = \frac{p(z \mid q)}{Z_s}\, p_q(q), \tag{1} \]
where $p_q$ is the pdf of $q$ and $p(z \mid q)$ is the likelihood. The Bayesian update expressed as a conditional expectation is
\[ \mathbb{E}(q \mid \mathcal{S}) := P_{\mathcal{Q}_\infty}(q) := \arg\min_{\tilde{q} \in \mathcal{Q}_\infty} \| q - \tilde{q} \|_{\mathcal{Q}}^2, \tag{2} \]
where $\mathcal{Q} := Q \otimes S$ and $\mathcal{S}$ is a sub-$\sigma$-algebra.

Proposition 1. There is a unique minimiser of the problem in (2), denoted by $\mathbb{E}(q \mid \mathcal{S}) = P_{\mathcal{Q}_\infty}(q) \in \mathcal{Q}_\infty$; it is characterised by the orthogonality condition
\[ \forall \tilde{q} \in \mathcal{Q}_\infty : \quad \langle q - \mathbb{E}(q \mid \mathcal{S}),\, \tilde{q} \rangle_{\mathcal{Q}} = 0. \tag{3} \]

Proposition 2. The subspace $\mathcal{Q}_\infty = Q \otimes S_\infty$ is given by
\[ \mathcal{Q}_\infty = \operatorname{span}\{\varphi \mid \varphi(\phi, q) := \phi(Y(q) + \varepsilon);\ \phi \in L_0(\mathcal{Y}; \mathcal{Q}) \text{ s.t. } \varphi \in \mathcal{Q}\}. \]

Finding the conditional expectation may then be seen as rephrasing (2):
\[ q_a := \mathbb{E}(q \mid \sigma(Y)) := P_{\mathcal{Q}_\infty}(q) = \arg\min_{\phi \in L_0(\mathcal{Y}; \mathcal{Q})} \| q - \varphi(\phi, q) \|_{\mathcal{Q}}^2. \]

Approximation of the conditional expectation. Assume that $L_0(\mathcal{Y}; \mathcal{Q})$ in (2) is approximated by subspaces $L_{0,n} \subset L_0(\mathcal{Y}; \mathcal{Q})$, where $n \in \mathbb{N}$ is a parameter describing the level of approximation and $L_{0,n} \subset L_{0,m}$ for $n < m$, such that the subspaces
\[ \mathcal{Q}_n = \operatorname{span}\{\varphi(\phi, q) \mid \phi \in L_{0,n} \subset L_0(\mathcal{Y}; \mathcal{Q}) \text{ s.t. } \varphi \in \mathcal{Q}\} \subset \mathcal{Q}_\infty \]
are closed and their union is dense: $\overline{\bigcup_n \mathcal{Q}_n} = \mathcal{Q}_\infty$.

Proposition 3. Define $P_{\mathcal{Q}_n}(q) := \arg\min_{\phi \in L_{0,n}} \| q - \varphi(\phi, q) \|_{\mathcal{Q}}^2$. Then the sequence $q_{a,n} := P_{\mathcal{Q}_n}(q)$ converges to $q_a := P_{\mathcal{Q}_\infty}(q)$:
\[ \lim_{n \to \infty} \| q_a - q_{a,n} \|_{\mathcal{Q}}^2 = 0. \tag{4} \]

Theorem 4. Writing $q_{a,n}(z) = \sum_{k=0}^{n} H_k\, z^{\vee k}$ with coefficient tensors $H_k$, the condition (3) becomes, for any $n \in \mathbb{N}_0$,
\[ \forall \ell = 0, \dots, n : \quad \delta_{H_\ell} \| q - q_{a,n}(H_0, \dots, H_n) \|_{\mathcal{Q}}^2 = 0, \]
which determines the $H_k$ and may be written as the moment system (with $\langle \cdot \rangle$ denoting expectation)
\[
\begin{array}{ccccccc}
H_0 & + \cdots + & H_k \langle z^{\vee k} \rangle & + \cdots + & H_n \langle z^{\vee n} \rangle & = & \langle q \rangle, \\
H_0 \langle z \rangle & + \cdots + & H_k \langle z^{\vee (1+k)} \rangle & + \cdots + & H_n \langle z^{\vee (1+n)} \rangle & = & \langle q \otimes z \rangle, \\
\vdots & & \vdots & & \vdots & & \vdots \\
H_0 \langle z^{\vee n} \rangle & + \cdots + & H_k \langle z^{\vee (n+k)} \rangle & + \cdots + & H_n \langle z^{\vee 2n} \rangle & = & \langle q \otimes z^{\vee n} \rangle.
\end{array}
\]

Example (n = 1): linear Bayesian update.
\[ H_0 + H_1 \langle z \rangle = \langle q \rangle, \qquad H_0 \langle z \rangle + H_1 \langle z \otimes z \rangle = \langle q \otimes z \rangle. \]
Solving for $H_1$ and $H_0$ gives
\[ H_1 = [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1} =: K, \qquad H_0 = \langle q \rangle - K \langle z \rangle, \]
and the linear BU is
\[ q_a = H_0 + H_1 z = \langle q \rangle + K(z - \langle z \rangle). \]

Example (n = 2): quadratic Bayesian update.
\[
\begin{aligned}
H_0 + H_1 \langle z \rangle + H_2 \langle z^{\otimes 2} \rangle &= \langle q \rangle, \\
H_0 \langle z \rangle + H_1 \langle z^{\otimes 2} \rangle + H_2 \langle z^{\otimes 3} \rangle &= \langle q \otimes z \rangle, \\
H_0 \langle z^{\otimes 2} \rangle + H_1 \langle z^{\otimes 3} \rangle + H_2 \langle z^{\otimes 4} \rangle &= \langle q \otimes z^{\otimes 2} \rangle;
\end{aligned}
\]
solve for $H_0$, $H_1$, $H_2$ and set $q_a = H_0 + H_1 z + H_2(z, z)$.
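These moment systems can be assembled directly from samples. As a minimal numerical sketch of the $n = 1$ (Kalman-type) case, the gain $K = \operatorname{cov}_{qz}\operatorname{cov}_{zz}^{-1}$ is estimated from prior samples of $q$ and of the observation forecast $z = Y(q) + \varepsilon$. This is not the authors' implementation (their computations use the MATLAB library sglib [2]); all names are illustrative:

```python
import numpy as np

def linear_bayesian_update(q_samples, z_samples, z_measured):
    """Linear Bayesian update q_a = <q> + K (z - <z>), with the gain
    K = cov_qz cov_zz^{-1} estimated from prior (forecast) samples.

    q_samples:  (N, dim_q) prior samples of the parameter q
    z_samples:  (N, dim_z) corresponding forecasts z = Y(q) + eps
    z_measured: (dim_z,)   the actual noisy measurement
    """
    q_mean = q_samples.mean(axis=0)
    z_mean = z_samples.mean(axis=0)
    dq = q_samples - q_mean
    dz = z_samples - z_mean
    cov_qz = dq.T @ dz / (len(q_samples) - 1)
    cov_zz = dz.T @ dz / (len(z_samples) - 1)
    # K = cov_qz cov_zz^{-1}, computed via a solve rather than an inverse
    K = np.linalg.solve(cov_zz.T, cov_qz.T).T
    return q_mean + K @ (z_measured - z_mean)
```

The quadratic update ($n = 2$) is assembled the same way from the higher moments $\langle z^{\otimes 2} \rangle, \dots, \langle z^{\otimes 4} \rangle$ and adds the bilinear correction term $H_2(z, z)$.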
2. Minimum mean square error estimation (MMSE)

Let $X : \Omega \to \mathcal{X} (=: \mathbb{R}^n)$ be the (a priori) stochastic model of some unknown QoI (uncertain parameter) and let $Y : \Omega \to \mathcal{Y} (=: \mathbb{R}^n)$ be the stochastic model of the measurement (e.g. the measurement forecast). An estimator $\varphi : \mathcal{Y} \to \mathcal{X}$ is any (measurable) function from the space of measurements $\mathcal{Y}$ to the space of unknowns $\mathcal{X}$; its mean square error $e_{\mathrm{MSE}}$ is defined by
\[ e_{\mathrm{MSE}}^2 = \mathbb{E}\big[ \| X(\cdot) - \varphi(Y(\cdot)) \|_2^2 \big]. \tag{5} \]
The minimum mean square error estimator $\hat{\varphi}$ is the one that minimises $e_{\mathrm{MSE}}$; furthermore,
\[ \hat{\varphi}(Y) = \mathbb{E}[X \mid Y]. \tag{6} \]

• Minimising over the whole space of measurable functions is numerically not possible.
• We therefore restrict to a finite-dimensional function space $V_\varphi$ with basis functions $\Psi_\gamma$, indexed by multi-indices $\gamma \in \mathcal{J}$.
• The $\Psi_\gamma$ can be e.g. a multivariate polynomial set, with the $\gamma$ the corresponding multi-indices.
• Other function systems are also possible (e.g. tensor products of sines and cosines).
• $\varphi$ then has the representation
\[ \varphi : y \mapsto \sum_{\gamma \in \mathcal{J}} \varphi_\gamma \Psi_\gamma(y). \tag{7} \]
• Minimising (5) for each component $X_i$ with coefficients $\varphi_i$ gives
\[ \frac{\partial}{\partial \varphi_{i,\delta}}\, \mathbb{E}\Big[ \Big( X_i - \sum_{\gamma \in \mathcal{J}} \varphi_{i,\gamma} \Psi_\gamma(Y) \Big)^{\!2} \Big] = 0 \tag{8} \]
for all $\delta \in \mathcal{J}$.
• Using linearity, this leads to
\[ \sum_\gamma \varphi_{i,\gamma}\, \mathbb{E}[\Psi_\gamma(Y) \Psi_\delta(Y)] = \mathbb{E}[X_i \Psi_\delta(Y)]. \tag{9} \]
• This can be written as a linear system $A \varphi_i = b_i$ with $[A]_{\gamma\delta} = \mathbb{E}[\Psi_\gamma(Y) \Psi_\delta(Y)]$, $[b_i]_\delta = \mathbb{E}[X_i \Psi_\delta(Y)]$, and the coefficients $\varphi_{i,\gamma}$ collected in the vector $\varphi_i$. The expectations are approximated numerically, e.g. by quadrature or sampling:
\[
A_{\gamma\delta} = \mathbb{E}[\Psi_\gamma(Y) \Psi_\delta(Y)] \approx \sum_{i=1}^{N_A} w_i^A\, \Psi_\gamma(Y(\xi_i))\, \Psi_\delta(Y(\xi_i)),
\qquad
b_\delta = \mathbb{E}[X \Psi_\delta(Y)] \approx \sum_{i=1}^{N_b} w_i^b\, X(\xi_i)\, \Psi_\delta(Y(\xi_i)).
\]

After all components of the MMSE mapping $\hat{\varphi}$ have been computed, the new model for the parameter is given by
\[ q_{\mathrm{new}} := q_{\mathrm{apost}} := \hat{\varphi}(y + \varepsilon(\xi)), \tag{10} \]
where $q_{\mathrm{new}} = X_{\mathrm{new}}$ and $y + \varepsilon(\xi)$ is an actual noisy measurement of $Y(\xi)$. In [1,3,4] we demonstrated that if $\varphi$ is linear, then we obtain the well-known Kalman filter.

Hypothesis: if the mapping $\varphi$ is a polynomial of order $n$, then as $n \to \infty$ it converges to the full Bayesian update.
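As a concrete illustration of (7)-(9), the sketch below builds the Gram system $A\varphi_i = b_i$ with plain monomials standing in for the $\Psi_\gamma$ and Monte Carlo averages ($w_i = 1/N$) standing in for the quadrature; orthogonal polynomials adapted to the distribution of $Y$, as in gPC, would be better conditioned. Function and variable names are illustrative only:

```python
import numpy as np
from itertools import product

def mmse_estimator(X_samples, Y_samples, degree=2):
    """Fit phi(y) = sum_gamma phi_gamma Psi_gamma(y) in the MMSE sense
    by solving A phi_i = b_i, cf. (9), with Monte Carlo expectations.

    X_samples: (N, n) samples of the unknown X
    Y_samples: (N, m) corresponding samples of the measurement model Y
    Returns the fitted estimator phi_hat: y -> approx E[X | Y = y].
    """
    N, m = Y_samples.shape
    # total-degree multi-index set J = {gamma : |gamma| <= degree}
    J = [g for g in product(range(degree + 1), repeat=m) if sum(g) <= degree]

    def basis(y):
        # monomial stand-in for Psi_gamma(y) = prod_k y_k^{gamma_k}
        return np.array([np.prod(y ** np.array(g)) for g in J])

    Psi = np.array([basis(y) for y in Y_samples])   # (N, |J|) design matrix
    A = Psi.T @ Psi / N        # [A]_{gamma,delta} ~ E[Psi_gamma Psi_delta]
    B = Psi.T @ X_samples / N  # column i is b_i, [b_i]_delta ~ E[X_i Psi_delta]
    coeffs = np.linalg.solve(A, B)                  # (|J|, n) coefficients
    return lambda y: basis(np.asarray(y)) @ coeffs

# Update step (10): evaluate the estimator at the actual noisy measurement,
# with hypothetical arrays X, Y, y_meas and a noise sample eps:
#   phi_hat = mmse_estimator(X, Y, degree=2)
#   q_new = phi_hat(y_meas + eps)
```

With `degree=1` this reduces to the linear (Kalman-type) update of Section 1; raising the degree adds the quadratic, cubic, etc. terms of the hypothesis above.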

3. 1D elliptic PDE with uncertain coefficient

\[ -\nabla \cdot \big( q(x, \xi)\, \nabla u(x, \xi) \big) = f(x, \xi), \qquad x \in [0, 1]. \]

Measurements are taken at $x_1 = 0.2$ and $x_2 = 0.8$. The means are $y(x_1) = 10$, $y(x_2) = 5$ and the variances are 0.5 and 1.5, respectively [1].

Figure 1: A priori (left) and a posteriori, i.e. updated (right), realizations of the solution $u$: mean value plus/minus 1, 2, 3 standard deviations. Uncertainty decreases in/near the measurement points $x = \{0.2, 0.8\}$.

Figure 2: A priori and a posteriori realizations of the parameter $q(x)$. Uncertainty decreases in/near the measurement points $x = \{0.2, 0.8\}$.
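The poster does not spell out the discretization or the prior on $q$. The sketch below is a minimal stand-in forward model, assuming homogeneous Dirichlet conditions, a central finite-difference scheme, and an ad hoc log-normal-type coefficient field, so that prior samples of $q$ and of the forecasts at $x = 0.2, 0.8$ can be generated and fed to the update routines above:

```python
import numpy as np

def solve_1d_elliptic(q_nodes, f=1.0):
    """Solve -(q u')' = f on [0, 1] with u(0) = u(1) = 0 by central
    finite differences; q_nodes holds the coefficient at the grid nodes."""
    n = len(q_nodes)
    h = 1.0 / (n - 1)
    q_half = 0.5 * (q_nodes[:-1] + q_nodes[1:])    # q at cell midpoints
    main = (q_half[:-1] + q_half[1:]) / h**2       # interior diagonal
    off = -q_half[1:-1] / h**2                     # off-diagonals
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    u = np.zeros(n)
    u[1:-1] = np.linalg.solve(A, np.full(n - 2, f))
    return u

# Prior realizations of q(x, xi) and the induced measurement forecasts.
x = np.linspace(0.0, 1.0, 101)
rng = np.random.default_rng(0)
q_prior, z_forecast = [], []
for _ in range(1000):
    xi = rng.standard_normal(3)
    # hypothetical positive random field, NOT the prior used on the poster
    q = np.exp(0.3 * (xi[0] + xi[1] * np.sin(np.pi * x)
                      + xi[2] * np.cos(np.pi * x)))
    u = solve_1d_elliptic(q)
    q_prior.append(q)
    z_forecast.append(u[[20, 80]])   # observe u at x = 0.2 and x = 0.8
q_prior, z_forecast = np.array(q_prior), np.array(z_forecast)
```

Adding a measurement-noise sample $\varepsilon$ to each forecast then yields the $z$ samples expected by `linear_bayesian_update` or `mmse_estimator`.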
4. Conclusion and future plans

1. Introduced a way to derive the MMSE estimator $\varphi$ (as a linear, quadratic, cubic, etc. approximation), i.e. to compute the conditional expectation of $q$ given the measurement $Y$.
2. Applied $\varphi$ to identify the parameter, i.e. to compute $\mathbb{E}(q \mid Y = y + \varepsilon)$.
3. All ingredients can be given as gPC expansions.
4. Applied this approximate BU to solve inverse problems (ODEs and PDEs).

Future plans:
1. Compute the posterior via MCMC and compare with the results of the (non-)linear BU.
2. Compare the linear, quadratic and cubic Bayesian updates (convergence).
3. Compute the Kullback-Leibler divergence (KLD) between the linear, non-linear and MCMC posteriors.

Acknowledgements: A. Litvinenko is a member of the KAUST SRI Center for Uncertainty Quantification in Computational Science and Engineering.

References

1. A. Litvinenko and H. G. Matthies, Inverse problems and uncertainty quantification, http://arxiv.org/abs/1312.5048, 2013.
2. E. Zander, sglib: a stochastic Galerkin library, https://github.com/ezander/sglib
3. B. V. Rosic, A. Kucerova, J. Sykora, O. Pajonk, A. Litvinenko, H. G. Matthies, Parameter identification in a probabilistic setting, Engineering Structures, 2013.
4. O. Pajonk, B. V. Rosic, A. Litvinenko, H. G. Matthies, A deterministic filter for non-Gaussian Bayesian estimation: applications to dynamical system estimation with noisy measurements, Physica D: Nonlinear Phenomena 241(7), pp. 775-788, 2012.