3
$\begingroup$

I am working in statistics and during my research, I encountered the need to establish concentration inequalities. McDiarmid's inequality seems very relevant and closely related to my problem, but unfortunately I haven't studied it before.

I searched around but couldn't find an exact version that applies to my setting, so I hope to get some advice and references here. Let me start by briefly recalling the classical McDiarmid's setting.

  1. The bounded differences fucntion Consider an arbitrary set $\mathcal{Z}$. The function $f$: $\mathcal{Z}^{n} \to \mathbb{R}$ has bounded differences if there exist nonnegative constants $b_1, ...b_n \in \mathbb{R}$ such that for $i=1,...n$

$$\sup_{x_1,...x_n, x_i'}|f(x_1,\cdots,x_n)-f(x_1,\cdots,x_{i-1},x_i',x_{i+1},\cdots,x_n)| \le b_i$$

  1. McDiarmid's inequality Let the function $f: \mathcal{Z}^{n} \to R$

have bound differences with constants $b_1, \cdots b_n$. Consider a sample of independent variables $\{X_1,\dots, X_n\}$ from the set $Z$. Then for any $\epsilon \ge 0$, we have

$$\mathbb{P}\{f(X_1,\dots,X_n) - \mathbb{E}[f(X_1,\dots,X_n)] \ge \epsilon \} \le exp\left(\frac{-2\epsilon^2}{\displaystyle \sum_{i=1}^{n} b_i^2}\right)$$

My function is defined on the symmetric group $S_{n}$. so the components are not independent, and the classical McDiarmid inequality does not apply directly.

My situation: I have a function $f$ definied on $S_n$. Besides, I know that

$$|f(\pi) - f(\pi \circ (i,i+1))| \leq c_i$$

where $(i,i+1)$ denotes the adjacent transposition of positions $i$ and $i+1$. I am wondering if it is true that the following concentration inequality holds:

$$\mathbb{P}(f(\pi) - \mathbb{E}(f(\pi)) \ge \epsilon) \le exp\left(\frac{-2\epsilon^2}{\sum_{i=1}^{n-1}c_i}\right)$$

when $f$ changes by at most $c_i$ under adjacent swap $(i,i+1)$.

My question:

  1. Is there a known theorem or reference that establishes this type of concentration inequality for functions on $S_n$ with bounded differences with respect to adjacent transpositions?

  2. If this is true, is the proof based on Markov chains, martingale methods, or some other approach?

  3. Any pointers to papers or textbooks would be greatly appreciated.

Thank you very much for your time and help! I look forward to discussing this further.

$\endgroup$
2
  • $\begingroup$ (i) I think you wanted $\sum_{i=1}^{n-1}c_i^2$ instead of $\sum_{i=1}^{n-1}c_i$. (ii) Your conjectured bound seems unlikely, as the local, adjacent-neighbors control seems much limited globally. (iii) According to MathOverflow guidelines, users should avoid responding to posts that "request answers to multiple questions". I think you can pretty easily condense your three questions into one. $\endgroup$ Commented Apr 15 at 15:56
  • $\begingroup$ (i): yes you are right. I need $\sum_{i=1}^{n-1} c_i^2$ $\endgroup$ Commented Apr 15 at 16:09

1 Answer 1

3
$\begingroup$

The inequality as stated is surely not true. Let $f(X_1, \ldots X_n) \in [n]$ be defined as the $j$ s.t. $X_j = 1$. Clearly, for a uniformly random permutation $f(X)$ is just a uniform number in $\{1, \ldots, n\}$, with expectation around $n/2$, but it is not concentrated: $\Pr(f(X) < n/4) \approx 1/4$; but in this example, clearly swapping two neighboring elements can change value of $f$ by at most $1$, so all $c_i = 1$, and you are asking for $f(X) = n/2 \pm O(\sqrt{n})$ with high probability.

On the other hand, using Azuma-Hoeffding inequality (which can also be used to deduce the McDiarmid inequality), we can get a similar in spirit statement. Namely

Lemma Let $f : S_n \to \mathbb{R}$ be a function s.t. for every permutation $\pi$ and every transposition $(k, \ell)$ with $\ell > k$, we have $$ |f(\pi) - f(\pi \circ (k, \ell))| \leq c_k. $$ Let $Y$ be a uniformly random permutation in $S_n$. Then $$ \Pr(|f(Y_1, \ldots Y_n) - \mathbb{E}[f(Y_1, \ldots Y_n)]| > \lambda) \leq \exp\left(-\Omega\left(\frac{\lambda^2}{\sum_i c_i^2}\right)\right). $$ Proof. Indeed, consider a random permutation $Y_1, \ldots Y_n$, (i.e. each $Y_i \in [n]$ and they are distinct). For a function $f : S_n \to \mathbb{R}$ we can define a martingale, $X_i$ for $i \in \{0, \ldots n\}$, as $$ X_i = \mathbb{E}[f(Y_1, \ldots Y_n) | Y_1, \ldots Y_i]. $$ In particular $X_n = f(Y_1, \ldots Y_n)$ and $X_0 = \mathbb{E}[f(Y_1, \ldots, Y_n)]$.

To apply the Azuma inequality need to control martingale differences $$ |X_k - X_{k+1}| \leq c_{k+1}. \tag{1} $$ To do this, it is enough to show that for all valid $y_1, \ldots y_k$, and all $y_{k+1}, y_{k+1}'$ we have $$ |\mathbb{E}_Y[f(Y_1, \ldots Y_n) | Y_1 = y_1, \ldots Y_k = y_k, Y_{k+1} = y_{k+1}] \\- \mathbb{E}_Y[f(Y_1, \ldots Y_n) | Y_1 = y_1, \ldots Y_k = y_k, Y_{k+1} = y_{k+1}']| \leq c_{k+1}. \tag{2} $$ Indeed, we can find a bijection between all permuations that start with $(y_1, \ldots y_k, y_{k+1})$ and all permutations that start with $(y_1, \ldots y_k, y_{k+1}')$, s.t. each pair of the permutations differ by a single transposition of form $(k+1, \ell)$ for some $\ell > k+1$ (concretely $\ell$ will be a poisition of $y_{k+1}'$ in the first permutation). This bijection yields to a coupling between permutation $(Y_1, \ldots Y_n)$ conditioned on $Y_1 = y_1, \ldots Y_{k+1} = y_{k+1}$ and a permutation $Y_1', \ldots Y_n'$ conditioned on $Y_1' = y_1, \ldots Y_k' = y_k, Y_{k+1}' = y_{k+1}'$, s.t. with probability $1$ we have $$|f(Y) - f(Y')| \leq c_{k+1},$$ leading to a desired bound (2), and subsequently (1).

Applying Azuma inequality, in this case we can indeed deduce $$ \Pr(|f(Y_1, \ldots Y_n) - \mathbb{E}[f(Y_1, \ldots Y_n)]| > \lambda) \leq \exp\left(-\Omega\left(\frac{\lambda^2}{\sum_i c_i^2}\right)\right),$$ as desired.

$\endgroup$
6
  • $\begingroup$ Hello, Thank you for your reply. Could you please give me the statement that you can deduce? $\endgroup$ Commented Apr 15 at 16:18
  • $\begingroup$ Sure, updated now. $\endgroup$ Commented Apr 15 at 16:32
  • $\begingroup$ I see no reason for (2). As noted in my previous comment, "the local, adjacent-neighbors control seems much limited globally". For (2), you need global control/exchangeability. $\endgroup$ Commented Apr 15 at 17:48
  • $\begingroup$ I'm not sure I understand what you mean by "local control" vs "global control". Under conditions of the OP (deviation of $f$ is bounded under transposition of any two consecutive elements), as I showed in the first paragraph, the desired concentration doesn't hold. Assuming a stronger condition that $|f(\pi) - f(\pi \circ (k, \ell))| \leq O(1)$ for any transposition $(k, \ell)$, I provide a sketch why condition (2) holds, and how this implies the desired concentration. $\endgroup$ Commented Apr 15 at 18:47
  • 1
    $\begingroup$ @JarosławBłasiok : I did not notice that you assumed the stronger, global $(k,\ell)$ condition for all $k<\ell$ instead of the local $(i,i+1)$ condition. $\endgroup$ Commented Apr 15 at 21:07

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.