$\begingroup$

I have a question related to this one. Let $X_1, \dots, X_n$ be $n$ iid random variables with CDF $\mathbf{1}_{[0,+\infty)}(x)\,\Phi(x)$, i.e. each $X_i$ is a mixture, with weights $1/2$ each, of a folded standard Gaussian and a point mass at $0$.

I have a vector $a$ with $\|a\|_2 = 1$, and I want to find a function $F : \mathbb{R} \rightarrow [0,1]$ such that for all $a$ with $\|a\|_2 = 1$ and all $t \in \mathbb{R}$, $p\left(\sum_{i=1}^n a_i X_i \leq t\right) \geq F(t)$; that is, $F$ is the CDF of an upper bound on $\sum_{i=1}^n a_i X_i$ in the stochastic order.

For example, let $Y_i$ be iid random variables distributed as a folded standard Gaussian. Using the fact that $p(X_i \leq t) \geq p(Y_i \leq t)$ and Theorem 2 of Yu (2011), conditioning on the number of zeros among the $(X_i)$ gives

$$p\left(\sum_{i=1}^n a_i X_i \leq t\right) \geq \left( \frac{1}{2} \right)^n + \sum_{k=1}^n \left( \frac{1}{2} \right)^n \binom{n}{k}\, p\left(\sum_{i=1}^k \frac{1}{\sqrt{k}} Y_i \leq t\right)$$

Can we find a tighter bound, especially in the tail of the distribution of $\sum_{i=1}^n a_i X_i$?
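For reference, the bound above can be sanity-checked numerically. Below is a quick Monte Carlo sketch (the choices of $n$, $a$, $t$, and the sample size are arbitrary, and both sides are only estimated up to simulation error):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
n, m, t = 8, 200_000, 1.5
a = rng.random(n)
a /= np.linalg.norm(a)  # an arbitrary unit vector

# Left-hand side: P(sum_i a_i X_i <= t), with X_i = Bernoulli(1/2) * |N(0,1)|
X = rng.integers(0, 2, size=(m, n)) * np.abs(rng.standard_normal((m, n)))
lhs = np.mean(X @ a <= t)

# Right-hand side: the conditional bound; P(sum_{i<=k} Y_i / sqrt(k) <= t)
# is itself estimated by Monte Carlo for each k
rhs = 0.5**n  # k = 0 term (all X_i are zero)
for k in range(1, n + 1):
    Y = np.abs(rng.standard_normal((m, k)))
    rhs += 0.5**n * comb(n, k) * np.mean(Y.sum(axis=1) / np.sqrt(k) <= t)

print(f"lhs = {lhs:.4f}, bound = {rhs:.4f}")
```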

Yu, Y. (2011). Some stochastic inequalities for weighted sums. Bernoulli, 17(3). https://arxiv.org/abs/0910.0544

$\endgroup$
  • $\begingroup$ When conditioning, you lose the condition $\|a\|_2=1$. So, I think you should re-examine your lower bound. Also, you don't seem to be actually using the stochastic domination of $X_i$ by $Y_i$. $\endgroup$ Commented Feb 22, 2024 at 21:42
  • $\begingroup$ Thanks @IosifPinelis for your comment. You are right, I do not use the stochastic domination, only the fact that $X_i$ has the same distribution as $Y_i$ when conditioned on being $>0$. It was for another inequality that I did not write here. However, I do not understand your first comment. I see that when conditioning I lose the condition $\|a\|_2 = 1$, but I still have $\|a\|_2 \le 1$ (understood as the $a_i$ restricted to the indices where $X_i$ is non-zero), and so the inequality holds. Do you think that the inequality is not correct, or do you mean that I can use a tighter bound than $\|a\|_2 \le 1$ for the inequality? $\endgroup$ Commented Feb 23, 2024 at 14:11
  • $\begingroup$ $\|a\|_2\le1$ is not good enough: Think e.g. of the case $a=0$. $\endgroup$ Commented Feb 23, 2024 at 15:00
  • $\begingroup$ I am not sure what you mean by "in the tail of the distribution of $\sum a_i X_i$". Do you mean $t$ small? Your inequality seems to me correct (due to monotonicity of the event in $a$), but there is room for improvement in some asymptotics, precisely by considering the norm of $a$ after the conditioning. For that, one would need to know the asymptotics you care about. $\endgroup$ Commented Feb 26, 2024 at 6:24
  • $\begingroup$ @oferzeitouni , what I mean is that I am interested in $p(\sum_{i=1}^n a_i X_i \le t)$ close to $1$ (in practice I consider $t$ around 5 to 7). I don't understand how to write a better bound for $a$ after conditioning, since I am looking for a bound valid for all $a$. I thought about ordering the $a_i$'s, and then $\|a\| \le \sqrt{(n-l)/n}$ (where $l$ is the number of consecutive $X_i = 0$ among the largest $a_i$'s), but I don't get a significant numerical improvement. Do you see something better? $\endgroup$ Commented Feb 26, 2024 at 9:06

1 Answer

$\begingroup$

This is a partial answer, in the regime where the probability in question is close to $1$. I normalize so that $EY_i=1$. The example $a_i=1/\sqrt{n}$ shows that you need to take $t\geq \sqrt{n}/2+O(1)$ in order to have the probability near $1$. I hope that I got the constants right.

Note that the threshold obtained by the OP's solution (normalized so that $EY_i=1$) is $\sqrt{n/2}\,(1+o(1))$, which is a loss of a factor of $\sqrt{2}$ compared to the above example.
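To illustrate the $a_i = 1/\sqrt{n}$ example numerically, here is a minimal simulation (with the normalization $EY_i = 1$, i.e. $Y_i = |Z_i|\sqrt{\pi/2}$ for $Z_i$ standard Gaussian; $n$ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2_500, 2_000
# Normalize the folded Gaussian so that E Y_i = 1 (since E|Z| = sqrt(2/pi))
scale = np.sqrt(np.pi / 2)
X = rng.integers(0, 2, size=(m, n)) * np.abs(rng.standard_normal((m, n))) * scale
S = X.sum(axis=1) / np.sqrt(n)  # sum_i a_i X_i with a_i = 1/sqrt(n)

# S concentrates around sqrt(n) * E[X_i] = sqrt(n)/2, so P(S <= t) is
# near 1 only once t exceeds sqrt(n)/2 + O(1)
print(S.mean() / np.sqrt(n))  # ~ 0.5
```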

Let us first consider the case where all $a_i<\epsilon$. Without loss of generality, we can assume the $a_i$'s are nonnegative. Let $B=\{i:X_i>0\}$. Then the expectation of $W:=\sum_{i\in B} a_i^2$ is $1/2$, and its variance is $\sum_{i=1}^n a_i^4/4 \leq \epsilon^2/4$. In particular, the norm of $a$ restricted to the surviving indices is, with high probability, about $(1+o_\epsilon(1))/\sqrt{2}$. So the threshold for probability near $1$ improves by a factor of $\sqrt{2}$, i.e. we obtain the right constant.
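This concentration of $W$ can be checked with a short simulation (a sketch; the choices $n = 400$, $a_i = 1/\sqrt{n}$, i.e. $\epsilon = 1/\sqrt{n}$, and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 400, 20_000
a = np.full(n, 1 / np.sqrt(n))  # all a_i = 1/sqrt(n) < epsilon

# W = sum_{i in B} a_i^2 with B = {i : X_i > 0}; each index survives w.p. 1/2
survive = rng.integers(0, 2, size=(m, n))
W = survive @ (a**2)

# E W = 1/2 and sd(W) = sqrt(sum_i a_i^4)/2 = 1/(2 sqrt(n)), so the
# surviving norm sqrt(W) concentrates near 1/sqrt(2)
print(W.mean(), W.std(), np.sqrt(W).mean())
```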

Now, for the case where some $a_i > \epsilon$, the situation is even better: just consider the residual norm and get an even better lower bound. I hope this is clear enough.

$\endgroup$
  • $\begingroup$ Thank you very much for your answer. From what I understand, I can make some statistics on $\lVert a \rVert_B$ (the norm of $a$ restricted to the surviving indices) in order to refine the bound $\lVert a \rVert_B < 1$ in my formula. If the cardinality of $B$ is $k$, then I have mean($\lVert a \rVert_B$) $= \sqrt{k/n}$ and var($\lVert a \rVert_B$) $\le k(n-k)/n^2$, if I am not mistaken. But I need some bound on the proportion of $\lVert a \rVert_B$ lower than, say, $\epsilon$. How do I get that? $\endgroup$ Commented Feb 29, 2024 at 12:53
  • $\begingroup$ You don't need that, since the large $a_i$ entries contribute order $1$, while the $t$ you care about is proportional to $\sqrt{n}$; also, since you care only about probabilities close to $1$, you can a priori consider only $k = n/2 + O(\sqrt{n})$. $\endgroup$ Commented Feb 29, 2024 at 19:13
