F - Socks 4 Editorial by en_translator
Add \(1\) to \(A_C\) so that \(A_i\) represents the number of socks with color \(i\), including the one he initially holds. We may reassign the indices of colors so that \(A_1 \leq A_2 \leq \ldots \leq A_N\).
First, let us consider which sock to put back to the drawer. If he has socks with colors \(x\) and \(y\), it is better to maintain the sock with the color such the number of socks with that color are larger. Thus, if \(x < y\), we may assume that he put back the sock of color \(x\) to the drawer and maintain that of color \(y\). (Note that even if there is the same number of socks of color \(x\) and of color \(y\), we apply a tie-break by the indices.)
This observation allows us to perform a DP (Dynamic Programming).
Let \(dp_i\) be the expected number of times to take out a sock when he holds a sock of color \(i\) outside the drawer. Let \(S \coloneqq \sum A_i - 1\), the number of socks in the drawer. We derive an equation based on the color of the next sock to draw. If he takes out a sock of color \(j\):
- if \(i > j\): He maintains the sock of color \(i\). The contribution to the expected value is \(\frac {A_jdp_i}{S}\).
- if \(i = j\): The operation terminates, and the contribution is \(0\).
- if \(i < j\): he maintains the sock of color \(j\). The contribution is \(\frac {A_jdp_j}{S}\).
Therefore, we have \(dp_i = 1 + \displaystyle\sum_{j = 1}^{i-1} \frac{A_jdp_i}{S} + \sum_{j = i + 1}^{N} \frac{A_jdp_j}{S}\).
By transposing terms, we obtain \((1 - \displaystyle\sum_{j = 1}^{i-1} \frac{A_j}{S}) dp_i = 1 + \sum_{j = i + 1}^{N} \frac{A_jdp_j}{S}\), which can be evaluated in descending order of \(i\) in a total of \(O(N^2)\) time; however, this computation process is not fast enough.
However, the bottleneck of this computation is \(\displaystyle\sum_{j = 1}^{i-1} \frac{A_j}{S}\) and \(\displaystyle\sum_{j = i + 1}^{N} \frac{A_jdp_j}{S}\), which can be successively computed by differential updates. Specifically, if those values evaluate to \(X\) and \(Y\) when computing \(Y\), we may subtract \(\frac{A_{i - 1}}{S}\) from \(X\) and add \(\frac{A_idp_i}{S}\) to \(Y\) before computing \(dp_{i-1}\).
Overall, the DP can be evaluated in \(O(N)\) time. With the dependence on the modulus \(P = 998244353\), it costs \(O(N \log P)\) or \(O(N + \log P)\) time.
In the solution above, the constraint \(A_i \leq 3000\) is only used so that \(S\) cannot be a multiple of \(998244353\), but this smallness of \(A_i\) can also be used for optimization too.
If you define \(dp_i\) as the expected number of operations to terminate the procedure if he holds a sock of color such that there are \(i\) socks with that color, then the number of states of the DP becomes \(O(\max A_i)\). In this case, even without the differential updates like above, the computational cost is \(O((\max A_i)^2)\), which is fast enough.
posted:
last update: