
I am reading this paper, which gives the following coupling result:

Throughout this, I'll assume the dimension $k$ is clear from context. Let $e_i$ be the $i$-th vector of the standard basis of $\mathbb{R}^k$.

A $k$-dimensional random variable $X$ is a multinomial Bernoulli random variable if it is parameterized by $p \in [0,1]^k$ with $\sum_i p_i \leq 1$, such that $X \in \{{\bf 0}, e_1, \ldots, e_k\}$ with $\Pr[X=e_i]=p_i$ and $\Pr[X={\bf 0}]=1-\sum_i p_i$.

Let $X_1, \ldots, X_n$ be independent multinomial Bernoulli random variables parameterized by $p_1, \ldots, p_n$. Let $Y_1, \ldots, Y_n$ be $k$-dimensional random vectors whose coordinates are independent Poissons: coordinate $j$ of $Y_i$ is distributed as $\text{Poisson}(p_{i,j})$.
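For concreteness, here is how one such pair $(X_i, Y_i)$ can be sampled (a minimal sketch of my own, not from the paper; sample_x, sample_y, and the parameter vector are just illustrative names), using numpy:

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_x(p):
        """Multinomial Bernoulli draw: the zero vector or a basis vector e_j."""
        k = len(p)
        # index k encodes the zero-vector outcome, with probability 1 - sum(p)
        j = rng.choice(k + 1, p=np.append(p, 1.0 - np.sum(p)))
        x = np.zeros(k)
        if j < k:
            x[j] = 1.0
        return x

    def sample_y(p):
        """Coordinate j of Y_i is an independent Poisson(p_{i,j}) draw."""
        return rng.poisson(p)

    p_i = np.array([0.05, 0.10, 0.02])   # illustrative parameter vector
    print(sample_x(p_i), sample_y(p_i))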

Let $S_n = \sum_{i=1}^n X_i$ and $T_n = \sum_{j=1}^n Y_j$. Then the total variation distance $d(S_n, T_n)$ has the bound

$$d(S_n, T_n) = \sup_A |\Pr(S_n \in A)-\Pr(T_n \in A)| \leq \sum_{i=1}^n \left( \sum_{j=1}^k p_{i,j} \right)^2$$
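To convince myself the bound is plausible, here is a quick numerical check in the one-dimensional case $k=1$ (my own sketch, not from the paper), where $S_n$ is a Poisson-binomial sum of Bernoulli$(p_i)$'s and $T_n \sim \text{Poisson}(\sum_i p_i)$:

    import math
    import numpy as np

    p = np.array([0.05, 0.10, 0.02, 0.08])   # illustrative p_1, ..., p_n for k = 1

    # Exact PMF of S_n on {0, ..., n}: convolve the n Bernoulli(p_i) PMFs.
    pmf_s = np.array([1.0])
    for pi in p:
        pmf_s = np.convolve(pmf_s, [1.0 - pi, pi])

    # PMF of T_n ~ Poisson(sum_i p_i) on the same support; the Poisson mass
    # beyond n (where S_n has none) is added back as part of the distance.
    lam = p.sum()
    support = np.arange(len(pmf_s))
    pmf_t = np.exp(-lam) * lam**support / np.array([math.factorial(m) for m in support])

    tv = 0.5 * (np.abs(pmf_s - pmf_t).sum() + (1.0 - pmf_t.sum()))
    print(f"TV ~= {tv:.6f} <= bound {np.sum(p**2):.6f}")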


Question:

The proof is fairly short. First, they use the fact that $d(X, Y)\leq \Pr[X\neq Y]$ for any coupling of $X$ and $Y$. So we have

$$d(S_n, T_n) \leq \Pr[S_n \neq T_n] \leq \sum_{i=1}^n \Pr[X_i \neq Y_i],$$

where the last inequality is the union bound (Boole's inequality): if $X_i = Y_i$ for every $i$, then $S_n = T_n$. By using the maximal coupling of $(X_i, Y_i)$, we get $\Pr[X_i \neq Y_i] \leq \Pr[X_i^\ast \neq Y_i^\ast]=d(X_i, Y_i)$. Finally, they bound $d(X_i, Y_i) \leq \left( \sum_{j=1}^k p_{i,j} \right)^2$.
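For reference, the maximal coupling invoked here can be realized explicitly for discrete distributions; the following is the standard construction (my own sketch, not from the paper), which achieves $\Pr[X^\ast \neq Y^\ast] = d(X, Y)$ exactly:

    import numpy as np

    rng = np.random.default_rng(1)

    def maximal_coupling(p, q):
        """One draw (X*, Y*) with X* ~ p, Y* ~ q and Pr[X* != Y*] = d(p, q)."""
        p, q = np.asarray(p, float), np.asarray(q, float)
        overlap = np.minimum(p, q)     # common part of the two PMFs
        d = 1.0 - overlap.sum()        # equals the total variation distance
        if rng.random() < 1.0 - d:
            z = rng.choice(len(p), p=overlap / overlap.sum())
            return z, z                # the coupled draws agree
        # residual branch: the two residual PMFs have disjoint supports,
        # so the draws disagree almost surely
        x = rng.choice(len(p), p=(p - overlap) / d)
        y = rng.choice(len(q), p=(q - overlap) / d)
        return x, y

    p, q = [0.9, 0.1, 0.0], [0.8, 0.1, 0.1]
    draws = [maximal_coupling(p, q) for _ in range(100_000)]
    print(np.mean([x != y for x, y in draws]))   # ~ d(p, q) = 0.1

Each branch mixes back to the marginals $p$ and $q$, and the residual PMFs have disjoint supports, so the draws disagree exactly when the residual branch fires, i.e. with probability $d(p, q)$.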

My question is: where in this proof is the independence of $X_1, \ldots, X_n$ used? In particular, the union bound does not require independence, does it?



1 Answer


This proof is of course incorrect. In particular, the inequality $\Pr[X_i \neq Y_i] \leq \Pr[X_i^\ast \neq Y_i^\ast]$ actually goes in the opposite direction (this inequality does not seem to be in the paper linked in your post).

A correct proof is as follows. For each $i$, let $(X_i^*,Y_i^*)$ be the closest coupling of $(X_i,Y_i)$, so that $d(X_i,Y_i)=P(X_i^*\ne Y_i^*)$. Suppose also that the random pairs $(X_i^*,Y_i^*)$ are independent. Let $S_n^*:=\sum_{i=1}^n X_i^*$ and $T_n^*:=\sum_{i=1}^n Y_i^*$. Then $S_n^*$ equals $S_n$ in distribution and $T_n^*$ equals $T_n$ in distribution (because the $X_i$'s are independent and the $Y_i$'s are independent). So, \begin{equation} d(S_n,T_n)=d(S_n^*,T_n^*)\le P(S_n^*\ne T_n^*)\le\sum_{i=1}^n P(X_i^*\ne Y_i^*) =\sum_{i=1}^n d(X_i,Y_i). \end{equation}
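For completeness, here is where the square in the final single-variable bound comes from in the simplest case $k=1$ (the classical Le Cam computation; I am only sketching this one-dimensional case, not the paper's multivariate bound). With $X\sim\text{Bernoulli}(p)$ and $Y\sim\text{Poisson}(p)$, we have $e^{-p}\ge 1-p$, so the only point where the PMF of $X$ exceeds that of $Y$ is $1$, and hence \begin{equation} d(X,Y)=p-pe^{-p}=p(1-e^{-p})\le p^2, \end{equation} using $1-e^{-p}\le p$.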

  • Thanks for the answer. I'm still a bit confused how independence implies $S_n^\ast$ equals $S_n$ in distribution? (I just learned about coupling recently, so this might be an obvious question.) – Commented Sep 12, 2023 at 1:54
  • @AspiringMat : The $X_i$'s are independent, the $X_i^*$'s are independent, and each $X_i^*$ equals $X_i$ in distribution. So $(X_1^*,\dots,X_n^*)$ equals $(X_1,\dots,X_n)$ in distribution, and hence $S_n^*=X_1^*+\dots+X_n^*$ equals $S_n=X_1+\dots+X_n$ in distribution. – Commented Sep 12, 2023 at 2:00
  • I see, I misunderstood their proof. They were actually doing what you're doing (but it's not as clear as your writing). Thank you! – Commented Sep 12, 2023 at 3:34
  • Thanks, I accepted it (I previously upvoted it). – Commented Sep 12, 2023 at 14:40
