3
$\begingroup$

Suppose I have $n$ independent 0-1 random variables $X_1, \cdots, X_n$ and I want to show a concentration of $X = \sum_i X_i$.

I can use either the Chernoff bound or the Hoeffding bound.

Suppose $E[X] = O(1)$. Then, I should use the Chernoff bound which will give me Poissonian tails. On the other hand, if $E[X] = O(n)$ then using the Hoeffding bound will give me a better result -- namely Gaussian tails.

However, what if, say, $E[X] = \sqrt{n}$. It is hard to believe that Chernoff is the best I can do. Is there some better inequality for this case? And what if $E[X] = n/\log n$ or anything else between $O(1)$ and $O(n)$?


Edit in response to Kodlu's answer: Using Chernoff, one gets that there is subgasussian concentration at an interval around the mean that gets bigger with increasing $\mu$. However, outside of this interval ($\delta > 1$), this still gives us only Poissonian tails. Now, my conjecture is that one can actually get better bounds.

$\endgroup$
1
  • $\begingroup$ If you are interested in the subject, I recommend Louis Chen's paper or papers on applying Stein's method to poisson approximation, though large deviation type results are probably not to be found there. $\endgroup$ Commented Jun 2, 2020 at 12:19

2 Answers 2

3
$\begingroup$

$\newcommand\ep{\varepsilon}$ $\newcommand\si{\sigma}$ $\newcommand\Ga{\Gamma}$ $\newcommand\tPi{\tilde\Pi}$ It follows from Theorem 2.1 of this paper or of its better version that for a large class, say $\mathcal F$, of nondecreasing functions $f$, containing the class of all increasing exponential functions, we have $$Ef(X-EX)\le Ef(Y),\tag{1}$$ where $$Y:=\Ga_{(1-\ep)\si^2}+y\tPi_{\ep\si^2/y^2},$$ $\Ga_{a^2}\sim N(0,a^2)$, $\tPi_\theta$ has the centered Poisson distribution with parameter $\theta$, $\Ga_{(1-\ep)\si^2}$ and $\tPi_{\ep\si^2}$ are independent, $$y:=\max_i q_i,\quad \si^2:=\sum_i p_i q_i,\quad\ep:=\sum_i p_i q_i^3/(\si^2 y)\in[0,1],$$ $p_i:=P(X_i=1)$, and $q_i:=1-p_i$. We see that $\ep\in[0,1]$ "interpolates" between the Gaussian and (re-scaled centered) Poisson random variables (r.v.'s) $\Ga_{\si^2}$ and $y\tPi_{\si^2/y^2}$.

From here, one can immediately get exponential bounds on the tails of $X$, or one can get better bounds such as $$P(X-EX\ge x)\le\frac{2e^3}9\,P^{LC}(Y\ge x)$$ for all real $x$, where $P^{LC}(Y\ge\cdot)$ denotes the least log-concave majorant of the tail function $P^{LC}(Y\ge\cdot)$; see Corollary 2.2 and Corollary 2.7 in the linked papers, respectively.

To see better how this works, consider the iid case, with $p_i=p$ and hence $q_i=q=1-p$ for all $i$. Then $y=q=\ep$ and

(i) if $p$ is small then $\ep=q$ is close to $1$ (and $y=q$ is also close to $1$) and hence $Y$ is close to the centered Poisson r.v. $\tPi_{\si^2}$;

(ii) if $q$ is small then $\ep=q$ is small and hence $Y$ is close to the Gaussian r.v. $\Ga_{\si^2}$;

(iii) if neither $p$ nor $q$ is small but $n$ is large then $\ep=q$ is not small and $\si^2$ is large, and hence $\tPi_{\ep\si^2/y^2}$ is close to $\Ga_{\ep\si^2/y^2}$ in distribution, so that $Y$ is close in distribution to the Gaussian r.v. $\Ga_{\si^2}$, just as in Case (ii).


For $f(x)\equiv e^{tx}$ with real $t\ge0$, (1) becomes $$E\exp\{t(X-EX)\} \le\exp\Big\{\frac{t^2}2\si^2(1-\ep)+\frac{e^{ty}-1-ty}{y^2}\,\si^2\ep\Big\}\tag{2};$$ cf. e.g. formula (1.5) in the better version of the linked paper, which implies $$P(X-EX\ge x) \le\inf_{t\ge0}\exp\Big\{-tx+\frac{t^2}2\si^2(1-\ep)+\frac{e^{ty}-1-ty}{y^2}\,\si^2\ep\Big\}$$ for real $x\ge0$. The latter $\inf$ can be explicitly expressed in terms of Lambert’s product-log function -- see the expression in formula (3.2) in the same paper; another useful expression for the same $\inf$ is given by formula (A.3) in this other paper.

$\endgroup$
2
  • $\begingroup$ Thank you for your answer. It is more complex than I expected and I don't have the time to delve into the paper now. I will do it as soon as I can and I will accept the answer then, for now only giving an upvote. $\endgroup$ Commented Jun 3, 2020 at 16:15
  • $\begingroup$ I have provided details on the exponential bound on the tail. $\endgroup$ Commented Jun 3, 2020 at 23:45
1
$\begingroup$

I am not an expert, bu I know that Chernoff can be optimized. Let $\mathbb{E}[X]=\mu,$ then or any positive $\Delta,$ we have $$ \mathbb{P}[X\geq E[X]+\Delta]\leq e^\Delta\left(\frac{\mu}{\mu+\Delta}\right)^{\mu+\Delta}, $$ and $$ \mathbb{P}[X\leq E[X]-\Delta]\leq e^{-\Delta}\left(\frac{\mu}{\mu\Delta}\right)^{\mu-\Delta}.\quad $$ Similarly for any positive $\delta,$ we have $$ \mathbb{P}[X\geq E[X](1+\delta)]\leq \left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu, $$ and $$ \mathbb{P}[X\leq E[X](1-\delta)]\leq \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^\mu. $$ For simplicity, I will consider a standard simplification (and weakening) of the multiplicative bounds, which is a bit weaker. For any $\delta \in (0,1),$ we have $$ \mathbb{P}[X\geq E[X](1+\delta)]\leq \exp[-\delta^2 \mu/3] $$ and $$ \mathbb{P}[X\leq E[X](1-\delta)]\leq \exp[-\delta^2 \mu/2] $$ As an example, if $\mu=\mathbb{E}[X]=\sqrt{n},$ then we obtain (by taking the weaker lower tail) $$ \mathbb{P}[|X- \mu|\geq \delta \mu]\leq 2\exp[-\delta^2 \mu/3], $$ or equivalently letting $x=\delta \mu,$ $$ \mathbb{P}[|X- \mu|\geq x]\leq 2\exp[-(x^2/\mu^2) \mu/3]=2\exp[-x^2/3\mu]= 2\exp\left[\frac{-x^2}{3\sqrt{n}}\right]. $$ So how tight the bound is depends on the exact value of $\mu$ and the value of $x,$ the distance from the expectation. If $x=c \sqrt{n},$ so you look at bounding departures of the order of the mean you get an upper bound of the form $\exp[-c \sqrt{n}].$

$\endgroup$
2
  • $\begingroup$ Thank you for your answer. I have edited the question in response. Also, I think that weaker of the two bounds is the one with "divided by 3" in the exponent and there should be 2 in front of the $\exp$ when having bound on the absolute value. $\endgroup$ Commented Jun 2, 2020 at 13:08
  • $\begingroup$ you're right. I'll fix it. $\endgroup$ Commented Jun 2, 2020 at 14:10

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.