Suppose that for all $n \in \mathbf{N}$, $X_n$ and $Y_n$ are independent random variables with $$X_n \sim \mathtt{Binomial}(n,1-q),$$ and $$Y_n \sim \mathtt{Poisson}(n(q+\epsilon_n)),$$ where $q \in (0,1)$, and $(\epsilon_n)$ is a deterministic sequence such that $\epsilon_n \to 0$ as $n \to \infty$.
Aim:
I am looking for a way to solve the following "signal extraction/estimation" problem:
For a sequence $s_n \geq 0$ with $n s_n \in \mathbf{N}$ and $s_n \to 1$ as $n\to\infty$, show that as $n \to \infty$,
$$\frac{\mathbf{E} [ X_n \mid X_n + Y_n = n s_n ]}{n} = 1 - q + O(|s_n - 1|) + O(\epsilon_n).$$
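For what it's worth, here is a quick numerical sanity check of the claim (not a proof, of course), evaluating the conditional expectation exactly from the two pmfs for moderate $n$; the particular choices of $q$, $\epsilon_n$ and $s_n$ below are purely illustrative.

```python
# Sanity check: compute E[X_n | X_n + Y_n = n s_n] / n exactly from the pmfs
# and compare with 1 - q. Parameter choices are illustrative only.
import numpy as np
from scipy.stats import binom, poisson

def cond_mean(n, q, eps, s):
    """E[X_n | X_n + Y_n = n*s] / n, computed from the exact pmfs."""
    m = int(round(n * s))                 # n*s_n, assumed to be an integer
    k = np.arange(0, min(n, m) + 1)       # X_n lies in {0, ..., min(n, m)}
    w = binom.pmf(k, n, 1 - q) * poisson.pmf(m - k, n * (q + eps))
    return (k * w).sum() / (w.sum() * n)

q = 0.3
for n in [100, 1_000, 10_000]:
    eps = 1 / np.sqrt(n)                  # some eps_n -> 0
    s = 1 + 2 / n                         # some s_n -> 1 with n*s_n an integer
    print(n, cond_mean(n, q, eps, s), 1 - q)   # should approach 1 - q = 0.7
```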
Heuristic:
Here is why I believe it to be true. We know that $n^{-1} X_n$ and $n^{-1} Y_n$ are both approximately Gaussian, and furthermore, if $Z_1, Z_2$ are independent Gaussians with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then the conditional law of $Z_1$ given $Z_1 + Z_2 = s$ is also Gaussian with $$\mathbf{E}[Z_1 \mid Z_1 + Z_2 = s] = \mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2} ( s - \mu_1 - \mu_2),$$ i.e., one apportions the difference between the observed value of the sum and its expectation in proportion to the variances.
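(As a quick plausibility check of this standard bivariate-normal conditioning formula, one can simulate and condition approximately on a small window around $s$; all parameter values below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2, sig1, sig2, s = 0.3, 0.7, 0.5, 1.2, 1.3     # arbitrary illustrative values
z1 = rng.normal(mu1, sig1, 5_000_000)
z2 = rng.normal(mu2, sig2, 5_000_000)
near = np.abs(z1 + z2 - s) < 0.01                     # crude stand-in for {Z1 + Z2 = s}
print(z1[near].mean())                                # Monte Carlo estimate
print(mu1 + sig1**2 / (sig1**2 + sig2**2) * (s - mu1 - mu2))   # closed form
# the two printed values should agree to roughly two decimal places
```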
If one naively assumes that this property carries over to the Gaussian approximations of $n^{-1} X_n$ and $n^{-1} Y_n$, then one is led to believe that
\begin{align} \frac{\mathbf{E} [ X_n \mid X_n + Y_n = n s_n ]}{n} &= \mathbf{E} [ n^{-1} X_n \mid n^{-1}X_n + n^{-1}Y_n = s_n ] \\ &\approx 1 - q + \frac{q(1-q)}{q(1-q)+q+\epsilon_n} \big(s_n - (1-q) - (q + \epsilon_n)\big) \\ &= 1 - q + O(|s_n - 1|) + O(\epsilon_n), \end{align} where the last line uses $s_n - (1-q) - (q+\epsilon_n) = (s_n - 1) - \epsilon_n$ together with the fact that the variance ratio is bounded.
Attempt(s):
Exploit local limit theorem: my main attempt has been a brute-force approach, trying to prove this directly by approximating the probability mass functions of $X_n$ and $Y_n$ by Gaussian densities using the local limit theorem, i.e., we can write \begin{equation} \frac{\mathbf{E}[ X_n \mid X_n + Y_n = n s_n ]}{n} = \frac{1}{n} \sum_{k=0}^n k \frac{\mathbf{P}[X_n = k] \mathbf{P}[Y_n = ns_n - k]}{\mathbf{P}[X_n + Y_n = ns_n]}. \end{equation} Each of the probabilities within the sum can be approximated by a Gaussian density, with an error term that is $O(n^{-1/2})$ uniformly in $k$. Carrying this through is extremely messy, however, and one has to be very careful when replacing the resulting Riemann sums by their corresponding integrals.
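For intuition (and nothing more), one can check numerically how this route behaves by replacing each pmf in the sum above by a Gaussian density with matching mean and variance; the sequences $\epsilon_n$ and $s_n$ below are the same illustrative choices as before, and this of course says nothing about the uniform error analysis a proof would require.

```python
# Local-CLT surrogate: replace each pmf in the sum by a Gaussian density
# with the same mean and variance, and compare the result with 1 - q.
import numpy as np
from scipy.stats import norm

def cond_mean_gauss(n, q, eps, s):
    """The ratio of sums above with each pmf replaced by its Gaussian surrogate."""
    m = int(round(n * s))
    k = np.arange(0, min(n, m) + 1)
    w = (norm.pdf(k, n * (1 - q), np.sqrt(n * q * (1 - q)))
         * norm.pdf(m - k, n * (q + eps), np.sqrt(n * (q + eps))))
    return (k * w).sum() / (w.sum() * n)

q = 0.3
for n in [100, 1_000, 10_000]:
    eps, s = 1 / np.sqrt(n), 1 + 2 / n    # illustrative eps_n -> 0 and s_n -> 1
    print(n, cond_mean_gauss(n, q, eps, s), 1 - q)
```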
Try to find relevant tricks/results under the theme of "signal extraction/estimation": essentially, the problem here is one of estimating/extracting a signal from an observation corrupted by additive independent (and approximately Gaussian) noise. It seems to me that this should be a well-studied problem, but searches on permutations of the above question yield only standard undergraduate results for sums of i.i.d. random variables.
Specific questions:
- Is it possible that there's a clever way to use the approximate Gaussian behaviour of $X_n$ and $Y_n$ to prove this result without the brute-force local limit theorem approach?
- Are there some key-words that may lead me to similar results in the signal extraction/estimation literature?