Here is a revised answer that might be clearer:
You define white clusters, but I'll look only at black clusters since that is what your data records, and the other interpretation (counts of monochrome clusters) follows from it.
Strictly speaking, your claim is not quite accurate. For $p=1-\frac1{10^6}$ and a $1000 \times 1000$ grid one would expect on average one white cell and $999,999$ black ones. The probabilities of seeing $0,1,2$ or $3$ white cells are about $36.8\%$, $36.8\%$, $18.4\%$ and $6.1\%.$ So the largest black cluster has size $1000000$ or $999999$ about $74\%$ of the time. However, if you make the grid $10^8 \times 10^8$ with that same $p$, I think you would see your effect.
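The percentages above can be checked directly. Here is a minimal sketch that treats the number of white cells as binomial with $10^6$ trials and success probability $10^{-6}$; the function name `white_cell_probs` is just my own label.

```python
from math import comb

def white_cell_probs(n_cells: int, q: float, k_max: int) -> list[float]:
    """Binomial probabilities of seeing exactly 0, 1, ..., k_max white cells
    when each of n_cells cells is white independently with probability q."""
    p = 1.0 - q
    return [comb(n_cells, k) * q**k * p**(n_cells - k) for k in range(k_max + 1)]

probs = white_cell_probs(10**6, 1e-6, 3)
for k, pr in enumerate(probs):
    print(f"{k} white cells: {pr:.4f}")

# Chance that at most one cell is white, i.e. that the largest black
# cluster has 1000000 or 999999 cells.
print(f"0 or 1 white cells: {probs[0] + probs[1]:.4f}")
```

The values are essentially the Poisson probabilities with mean $1$, namely $e^{-1}/k!$.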
The main effect I want to describe is clear already for a $1 \times n$ rectangle, so let me describe that first:
Suppose you repeatedly flip a biased coin that comes up heads with probability $p$ and tails with probability $q=1-p.$ You ignore the tails, but when you get a head you record how long the cluster of heads is. Let $P_k$ be the probability that the next head you get is the start of a cluster of length exactly $k.$ It is easy to see that $P_k=p^{k-1}q$, so $P_{j+1}=pP_j \lt P_j.$ That is really just a $1 \times \infty$ rectangle. Note: for a finite $1 \times N$ rectangle the chance that all the cells end up black is $P_N=p^N$, so it is possible that $P_N \gt P_1 \gt P_2 \gt \cdots.$
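If you want to see the geometric decay of the run lengths numerically, here is a small simulation sketch. The helper `run_length_counts`, the seed and the number of flips are my own choices for illustration, and a run that is still open when the flips stop is simply dropped.

```python
import random

def run_length_counts(p: float, n_flips: int, seed: int = 0) -> dict[int, int]:
    """Flip a coin with P(heads) = p and count completed runs of heads by length."""
    rng = random.Random(seed)
    counts: dict[int, int] = {}
    run = 0
    for _ in range(n_flips):
        if rng.random() < p:          # heads: extend the current run
            run += 1
        elif run > 0:                 # tails: close the run, if there was one
            counts[run] = counts.get(run, 0) + 1
            run = 0
    return counts

p = 0.5
counts = run_length_counts(p, 200_000)
total_runs = sum(counts.values())
for k in range(1, 6):
    observed = counts.get(k, 0) / total_runs
    predicted = p ** (k - 1) * (1 - p)   # P_k = p^(k-1) q
    print(f"k={k}: observed {observed:.4f}, predicted {predicted:.4f}")
```

The observed frequencies should track $p^{k-1}q$ up to sampling noise, with each length less common than the one before.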
Turn now to an $n \times n$ board. I will assume $n$ is quite large and ignore effects at the corners and sides.
One comment is that for $p$ large enough (and I think $p \gt 0.5$ is enough) there is usually one huge cluster and an assortment of smaller ones.
Already at $p=0.6$ (on average $600,000$ black cells) your data seems to indicate that the largest cluster (over $100$ trials) was always at least $586630$ (so over $97 \%$) and that the second largest was at most $307.$
At $p=0.54$, if I read it correctly, out of an expected $540,000$ black cells you never saw fewer than $495,000$ (so over $90\%$) in the largest cluster.
In a quick look at $p=0.5$ I did not see the phenomenon but for $p=0.51$ there is a jump from $77,311$ to $284,083.$
So I'll speculate that the larger a partial cluster is (say at least up to $250,000$), the more likely it is to grow a bit more. This tends to spread out the larger sizes leaving no one occurring too often.
Here is a small case: Consider a cell not too near the edges, and write $q=1-p.$ The probability that it is black and in a cluster of size $1$ is $P_1=pq^8$, since all $8$ of its neighbors must be white. There are $8$ ways it could be in a cluster of size $2.$ Four of them (shared side) require $10$ other squares to be white. The other four (shared corner) require $12$ white squares. So the probability of being in a cluster of size $2$ is $P_2=p^2(4q^{10}+4q^{12}).$
So $P_2=4p(q^2+q^4)P_1.$ Maximizing the ratio $P_2/P_1=4p(q^2+q^4)$ over $p$, the maximum is about $0.88$ and occurs at about $p=0.27$, so $P_2<0.9P_1$ for every $p.$
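Both the counts of white squares and the bound on the ratio are easy to check by brute force. This is only a sketch: `white_border` and the grid search over $p$ are my own illustrative choices, with cells counted as adjacent when they share a side or a corner.

```python
from collections import Counter

def white_border(cluster):
    """Cells touching the cluster by a side or a corner, excluding the cluster itself."""
    cells = set(cluster)
    touched = set()
    for r, c in cells:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                touched.add((r + dr, c + dc))
    return touched - cells

# A single black cell, then the 8 ways to attach a second black cell to it.
single_border = len(white_border([(0, 0)]))
offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
pair_borders = Counter(len(white_border([(0, 0), off])) for off in offsets)
print(single_border, dict(pair_borders))

def ratio(p: float) -> float:
    """P_2 / P_1 = 4 p (q^2 + q^4) with q = 1 - p."""
    q = 1.0 - p
    return 4.0 * p * (q**2 + q**4)

# Crude grid search for the maximizing p on [0, 1].
grid = [i / 10000 for i in range(10001)]
p_star = max(grid, key=ratio)
print(f"max ratio {ratio(p_star):.4f} at p = {p_star:.4f}")
```

The border sizes come out as $8$ for a single cell, and $10$ (four placements) or $12$ (four placements) for a pair, matching the exponents of $q$ used above.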
Here is another point of view. Randomly assign the distinct weights $1,2,3,\cdots,n^2$ to the squares (so up to $1000000$ for a $1000 \times 1000$ grid) and then turn them from white to black in that order. So we are gradually raising $p.$ Do this $100$ or $1000$ times. At first there will usually be a few isolated one-cell clusters far from each other. Eventually the first multi-cell cluster will occur, probably of size $2$ but maybe $3$ or even $4$. At that stage there are still many single-cell clusters. Eventually there will be more cells in multi-cell clusters than in single-cell ones, but the split might be something like $40\%$ in single-cell, $30\%$ in double-cell and $30\%$ in triple-cell clusters. That distribution would have the numbers of clusters of sizes $1,2,3$ in the ratio $40:15:10$, so single-cell clusters are still the most common. I will stop there.
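The "turn the cells black one at a time" experiment can be sketched with a union-find structure. Everything here (the function `cluster_history`, the board size, the seed) is my own illustrative setup and not how your data was generated. Cells count as adjacent when they share a side or a corner.

```python
import random
from collections import Counter

def cluster_history(n: int, seed: int = 0):
    """Blacken the n*n cells in a random order. After each step record
    (black cells so far, number of one-cell clusters, largest cluster size)."""
    rng = random.Random(seed)
    order = list(range(n * n))
    rng.shuffle(order)                 # random order plays the role of the weights
    parent = list(range(n * n))
    size = [1] * (n * n)               # cluster size, meaningful at roots only
    black = [False] * (n * n)
    by_size = Counter()                # cluster size -> number of such clusters
    largest = 0
    history = []

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for step, cell in enumerate(order, start=1):
        r, c = divmod(cell, n)
        black[cell] = True
        by_size[1] += 1
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < n and 0 <= cc < n and black[rr * n + cc]:
                    a, b = find(cell), find(rr * n + cc)
                    if a == b:
                        continue
                    by_size[size[a]] -= 1
                    by_size[size[b]] -= 1
                    if size[a] < size[b]:
                        a, b = b, a
                    parent[b] = a           # merge the smaller cluster into the larger
                    size[a] += size[b]
                    by_size[size[a]] += 1
        # Only the new cell's cluster changed, so this keeps the running maximum.
        largest = max(largest, size[find(cell)])
        history.append((step, by_size[1], largest))
    return history

history = cluster_history(30, seed=1)
for step, singles, largest in history[::100]:
    print(f"black={step:4d}  one-cell clusters={singles:3d}  largest={largest:4d}")
```

Running it over many seeds shows how long the one-cell clusters stay the most common size as the board fills up.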
`DiagonalBSDxyz` files correspond to the data for $p=x.yz$. For example, `DiagonalBSD025.jpg` is the plot for $p=0.25$ and `DiagonalBSD025.txt` is the data file for $p=0.25$.