Questions tagged [bayesian-probability]
The bayesian-probability tag has no summary.
 89 questions 
   0  votes 
   0  answers 
   84  views 
    Bound on covariance of Gaussian-reweighted probability measure
 The problem Let $\rho$ be a probability measure on $\mathbb R^d$ and define the probability measure $\nu_x$ via $$ \mathrm d\nu_x(y) \propto \exp(-(y-x)^TK^{-1}(y-x))\mathrm d\rho(y), $$ where $K$ is ... 
    0  votes 
   1  answer 
   92  views 
   Rate of convergence for the mean square error $\mathbb{E}[||\hat{\theta}_n - \theta^* ||^2 ]$ of MAP estimators
 I study the statistical properties of the Maximum A Posteriori (MAP) estimator under favorable conditions. Given $n$ samples, the MAP estimator is defined by $$ \hat{\theta}_n = \arg\max_{\theta} \... 
    1  vote 
    1  answer 
   120  views 
    On the MSE of MAP estimators and model mismatch
 We consider an estimation problem where the parameter $\theta$ is assigned the prior $g_\alpha$ depending on some parameter $\alpha$ (e.g. the variance of a Gaussian prior) and the observation $... 
    4  votes 
   0  answers 
   178  views 
    Is there a structural way to show that Ridge regression is equal to a weighted average of lower dimensional models?
 Let $X$ be an $n \times k$ matrix. An interesting result of Leamer and Chamberlain (1976) establishes that the Ridge estimator satisfies the following identity \begin{equation}\label{eq:1} \hat{\beta}... 
    1  vote 
   0  answers 
   98  views 
  Convergence of iterated average Bayesian posterior to high entropy distribution
 Setup Assume $p_Y \in \Delta^n$ is a probability vector obtained by $p_Y=L_{Y|X}p_X$, where $L_{Y|X} \in \mathbb{R}^{n \times m}$ is an arbitrary likelihood (i.e, a column stochastic matrix) and $p_X \... 
    0  votes 
   0  answers 
   175  views 
   Which proposal distribution should be used in this particular case of the Metropolis-Hastings algorithm?
 As part of my research, I would like to apply the Metropolis-Hastings algorithm in order to sample from some posterior distribution. More precisely, the data comes from a multivariate normal distribution in the ... 
    1  vote 
    1  answer 
   165  views 
    Bayes classifiers with cost of misclassification
 A minimum ECM classifier classifies the features $\underline{x}$ as belonging to class $t$ ($\delta(\underline{x}) = t$) if $\forall j \ne t$: $$\sum_{k\ne t} c(t|k) f_k(\underline{x})p_k \le \sum_{k\ne ... 
    1  vote 
   0  answers 
   93  views 
    Gibbs Priors form a Martingale
 I am working on adapting variational inference to the recently developed Martingale posterior distributions. The first case, which reduces the VI framework to Gibbs priors, is proving hard to show as ... 
    2  votes 
   1  answer 
   378  views 
     Sum of arrival times of Chinese Restaurant Process (CRP)
 Suppose that a random sample $X_1, X_2, \ldots$ is drawn from a continuous spectrum of colors, or species, following a Chinese Restaurant Process distribution with parameter $|\alpha|$ (or ... 
    0  votes 
   1  answer 
   174  views 
     Existence and uniqueness of a posterior distribution
 I am wondering about the existence and uniqueness of a posterior distribution. While Bayes' theorem gives the form of the posterior, perhaps there are pathological cases (over some weird probability ... 
    10  votes 
   1  answer 
   400  views 
    Who introduced the term hyperparameter?
 I am trying to find the earliest use of the term hyperparameter. Currently, it is used in machine learning but it must have had earlier uses in statistics or optimization theory. Even the multivolume ... 
    0  votes 
   0  answers 
   84  views 
    Canonical information geometry for probability distributions on different parameter spaces
 I am interested in a canonical information geometry on spaces of probability distributions containing distributions with different parameter spaces. Let me give some context and practical motivation ... 
    3  votes 
   0  answers 
   104  views 
   Confusion with implementation of PDE-constrained Bayesian inverse problem
 Consider a PDE, $$\partial_t u -a \nabla u - ru (1-u) = 0$$ at a given snapshot in time. The inverse problem is to find the diffusion coefficient $a \in L^{\infty}$ from a noisy measurement $$Y = \Phi(... 
    0  votes 
   0  answers 
   100  views 
    Probability distribution for a Bayesian Update
 I am struggling with a process like this: $$X_t=\begin{cases} \frac{\alpha\omega_t}{\alpha\omega_t+\beta(1-\omega_t)} & \text{with prob } p\\ \frac{(1-\alpha)\omega_t}{(1-\alpha)\omega_t+(1-\beta)(... 
    0  votes 
    1  answer 
   138  views 
     How does this Bayesian updating work $z_i=f+a_i+\epsilon_i$
 $z_i=f+a_i+\epsilon_i$, where $f\sim N(\bar{f},\sigma_{f}^2)$; $a_i\sim N(\bar{a_{i}},\sigma_{a}^2)$; $\epsilon_i\sim N(0,\sigma_{\epsilon}^2)$. We can see the signals $\{z_i\}$ where $i\subseteq {1,... 
    1  vote 
   0  answers 
   168  views 
    Curvature of randomly generated B-spline curve
 I am working on Bayesian statistical estimation of the parameters (control points) of a closed B-spline curve bounding an object on an image. The problem is that I require those curves to not be much ... 
    2  votes 
    1  answer 
   270  views 
   Derive equation for regularized logistic regression with batch updates
 I am trying to understand this paper by Chapelle and Li "An Empirical Evaluation of Thompson Sampling" (2011). In particular, I am failing to derive the equations in algorithm 3 (page 6). ... 
    2  votes 
   0  answers 
   84  views 
  Concentration of posterior probability around a tiny fraction of the prior volume
 In the context of approximating the evidence $Z$ in a Bayesian inference setting $$ Z = \int d\theta \mathcal L (\theta)\pi (\theta) $$ with $\mathcal L$ the likelihood, $\pi$ the prior, John Skilling'... 
    1  vote 
   1  answer 
   167  views 
     Bayesian inverse problems on non-separable Banach spaces
 I am now studying Bayesian inverse problems. In the note of Dashti and Stuart https://arxiv.org/abs/1302.6989, they mentioned that "... when considering a non-separable Banach space $B$, it is ... 
    1  vote 
    1  answer 
   212  views 
    Conditional Gaussians in infinite dimensions
 I asked this over on Cross Validated, but thought it might also get an answer here: The law of the conditional Gaussian distribution (the mean and covariance) is frequently mentioned to extend to the ... 
    4  votes 
   2  answers 
   253  views 
     Do these distributions have a name already?
 In playing with some math finance stuff I ran into the following distribution and I was curious if someone had a name for it or has studied it or worked with it already. To start, let $\Delta^n$ be ... 
    3  votes 
   1  answer 
   313  views 
     A quantity associated to a probability measure space
 Let $(S,P)$ be a (finite) probability space. We associate to $(S,P)$ a quantity $n(S,P)$ as follows: The probability of two randomly chosen events $A,B\subset S$ being independent is denoted by $n(S,P)... 
    4  votes 
   1  answer 
   665  views 
     Gaussian process kernel parameter tuning
 I am reading about Gaussian processes and there are multiple resources that say how the parameters of the prior (kernel, mean) can be fitted based on data, specifically by choosing those that maximize the ... 
    1  vote 
   0  answers 
   66  views 
  Estimation of probability matrix from samples at different time intervals
 I am given a discrete-time Markov chain that evolves on a finite subset $\{1,\dots,n\}$. This Markov chain is time-homogeneous and has a transition matrix $P$ that I want to estimate. Let $X_t$ be the ... 
    0  votes 
    1  answer 
   229  views 
     CLT for random variables with positive support (e.g. exponential)
 I have a bunch of iid $\{X_i\}$ with $X_i \sim \exp(\lambda)$ - let's say $\lambda = 1$. Now, the classic version of the CLT tells me: \begin{equation} \sqrt{n}\left(1-\bar{X}_n\right) \rightarrow \mathcal{N}\... 
    1  vote 
   0  answers 
   95  views 
    2d interpolation minimizing the integral of the norm of the Hessian
 It is well known that cubic interpolation is the solution of the interpolation problem that minimizes the integral of the square of the second derivative: $$ \min_{f \text{ s.t. } f(x_i)=y_i} \int (f''(... 
    0  votes 
    1  answer 
   226  views 
   Lower bound for reduced variance after conditioning
 Let $X$ be a random variable with variance $\tau^2$ and $Y$ be another random variable such that $Y-X$ is independent of $X$ and has mean zero and variance $\sigma^2$. (One can think of $Y$ as a noisy ... 
    -1  votes 
   1  answer 
   88  views 
   Linear operator over a simplex space in a multinomial distribution parameter estimation problem
 This is actually a variant of a well-known problem of how the parameters of a multinomial distribution can be estimated by maximum likelihood, and this arises from a final year project I undertook ... 
    1  vote 
    1  answer 
   412  views 
     Posterior expected value for squared Fourier coefficients of random Boolean function
 Let $f : \{0, 1\}^{n} \rightarrow \{-1, 1\}$ be a Boolean function. Let the Fourier coefficients of this function be given by $$ \hat f(z) = \frac{1}{2^{n}} \sum_{x \in \{0, 1\}^{n}} f(x)(-1)^{x \cdot ... 
    1  vote 
   0  answers 
   96  views 
   Bayesian inference of stochastically evolving model parameters
 I have a question related to self-calibration in radio interferometry, but I will try to phrase it as generically as possible. I have a set of data points, $D = \{ d_{0, t_0}, d_{1, t_0}, ..., d_{M, t_0}, ... 
    1  vote 
   1  answer 
   2k  views 
    Convolution of two Gaussian mixture models
 Suppose I have two independent random variables $X$, $Y$, each modeled by the Gaussian mixture model (GMM). That is, $$ f(x)=\sum _{k=1}^K \pi _k \mathcal{N}\left(x|\mu _k,\sigma _k\right) $$ $$ g(y)=\... 
    3  votes 
   0  answers 
   234  views 
  Minimizing an f-divergence and Jeffrey's Rule
 My question is about f-divergences and Richard Jeffrey's (1965) rule for updating probabilities in the light of partial information. The set-up: Let $p: \mathcal{F} \rightarrow [0,1]$ be a ... 
    1  vote 
   0  answers 
   86  views 
   Quantitative bounds on convergence of Bayesian posterior
 Let $Y$ be a random variable in $[0,1]$, and let $X_1, X_2, \ldots$ be a sequence of random variables in $[0,1]$. Suppose that the $X_i$'s are conditionally i.i.d given $Y$ ; in other words, I'd like ... 
    2  votes 
   0  answers 
   123  views 
    Is there any good reference on the Bayesian view that can be helpful for reading papers on number theory using heuristic arguments?
 Nowadays there are many papers on number theory using heuristics. I have read some of them. But I have no clear understanding of Bayesian probability (subjective probability). The concept of ... 
    4  votes 
   0  answers 
   256  views 
     Convergence of the expectation of a random variable when conditioned on its sum with another, independent but not identically distributed
 Suppose that for all $n \in \mathbf{N}$, $X_n$ and $Y_n$ are independent random variables with $$X_n \sim \mathtt{Binomial}(n,1-q),$$ and $$Y_n \sim \mathtt{Poisson}(n(q+\epsilon_n)),$$ where $q \in (... 
    -1  votes 
   1  answer 
   441  views 
   Proving the existence of a symmetric Bayesian Nash equilibrium
 I am currently faced with the following question: Consider the public goods game. Suppose that there are $I > 2$ players and that the public good is supplied (with benefit of 1 for all players) ... 
    0  votes 
   0  answers 
   71  views 
   Restriction of a formula with matrix inverse multiplied by a vector
 I'm trying to reproduce a proof from this paper but I'm stuck at one point (Lemma 6). The general subject is a Bayesian model for the multi-armed bandit problem solved with Thompson sampling. I think I ... 
    1  vote 
   0  answers 
   78  views 
   Bayesian posterior consistency when prior distribution is induced by a diffusion
 Let $\Pi_{b,\sigma}$ be a prior distribution on $\{z_t\}_{t<T}\in C_0[0,T]$ induced by the following diffusion: \begin{align} d\tilde z_t&=b(\tilde z_t,t)dt+\sigma(\tilde z_t,t) dW_t, ~... 
    2  votes 
   0  answers 
   150  views 
   Convergence of Bayesian posterior
 Let $\Delta [0,1]$ denote the set of all probability distributions on the unit interval. Let $\mu \in \Delta [0,1]$ denote an arbitrary prior. Importantly, $\mu$ does not necessarily admit a density ... 
    1  vote 
    1  answer 
   172  views 
    Conditional density for random effects prediction in GLMM
 I am currently working on generalized linear mixed models (GLMM) and need some help concerning the prediction of the random effects. More specifically, I don't understand the given representation of ... 
    0  votes 
   1  answer 
   531  views 
   Optimal solution to cross entropy loss in the continuous case
 This could be a simple question but I don't have a satisfying answer. Setup. Suppose that we have $K$ different classes, and consider cross entropy loss which maps a probability vector in the ... 
    9  votes 
   3  answers 
   564  views 
    What does the KL being symmetric tell us about the distributions?
 Suppose two probability density functions, $p$ and $q$, such that $\text{KL}(q||p) = \text{KL}(p||q) \neq 0$. Intuitively, does that tell us anything interesting about the nature of these densities? 
    3  votes 
   0  answers 
   96  views 
    Have stick-breaking priors with non-iid atoms been considered, and if not, why not?
 Roughly speaking, a stick-breaking prior is a random discrete probability measure $P$ on a measurable space $\mathcal X$ of the form $$P=\sum_{j\ge1}w_j\delta_{\theta_j}$$ where $(w_j)_{j\ge1}$ is a ... 
    3  votes 
   1  answer 
   2k  views 
     Bayesian Inference with Student-t likelihood
 Suppose I've observed $x$ from a Student-t distribution with unknown $\mu$, and I'd now like to infer $\mu$. Since the t-distribution isn't in the exponential family, there's no conjugate prior available, ... 
    5  votes 
   1  answer 
   429  views 
     Bounding the sensitivity of a posterior mean to changes in a single data point
 There is a real-valued random variable $R$. Define a finite set of random variables ("data points") $$X_i = R + Z_i \; \text{for } i\in\{1,\ldots,n\},$$ where $Z_i$ are identically and independently ... 
    4  votes 
   0  answers 
   756  views 
  Bayesian Networks and Polytree
 I am a bit puzzled by the use of polytrees to infer a posterior in a Bayesian Network (BN). BNs are defined as directed acyclic graphs. A polytree is a DAG whose underlying undirected graph is a tree. ... 
    3  votes 
    2  answers 
   766  views 
    Parametrising a sparse orthogonal matrix
 I need to find a way to parametrise a matrix that is both sparse (to some degree) and orthogonal, i.e., I am looking for a parametrisation that describes $A \in \mathbb{R}^{n\times m}$ such that $AA^T... 
    1  vote 
   0  answers 
   495  views 
    Gaussian Integrals over Spheres
 I'm after a reference for an integral. In particular, I am looking for a way to approximate or calculate the following: $$ \int \limits_{\| \theta \|_2 = 1} e^{(-(\theta - \mu)^T \Sigma (\theta - \mu))} ... 
    1  vote 
    1  answer 
   164  views 
     The expectation of binary logistic regression with respect to Gaussian distribution
 I am trying to compute the expectation of $g(s,x)=s \ln \sigma(x)+(1-s)\ln(1-\sigma(x))$ with respect to the normal distribution $\mathcal{N}(x;m,v)$, where we have $\sigma(x)=\frac{1}{1+e^{-x}}$. If ... 
    3  votes 
   1  answer 
   829  views 
     Bayesian methods in online setting
 Imagine the following (very concrete) model: We have a series of random variables $x_k$ with values in $\lbrace 0, 1\rbrace$. We assume $x_k \mid p_k \sim \operatorname{Alt}(p_k),$ where $p_0 \sim R(0,...