Recently I ran into a situation where I wanted to calculate the mean of a set of unknown size. My first naive idea was to average the current mean with each new value:
```python
xs = [3, 7, 6]  # Assuming we don't know the length
mean = xs[0]
n = 1
while n < len(xs):
    mean = (mean + xs[n]) / 2
    n += 1
```

Which quickly turns out to be simply wrong:
```
(3 + 7 + 6)/3 = 5.3333
((3/2 + 7/2)/2 + 6/2) = 5.5
```

To find the actual relation between the current mean and the i-th value, I started by comparing the mean of 2 values with the mean of 3 values:
```
(3 + 7)/2     = 3/2 + 7/2
(3 + 7 + 6)/3 = (3 + 7)/3 + 6/3
```

From here, it's possible to rewrite the mean of 3 values in terms of the mean of 2 values:
```
(3 + 7)/3 + 6/3 =
= (3 + 7)/2 * 2/3 + 6/3 =
= (3/2 + 7/2) * 2/3 + 6/3
```

Here I noticed the pattern:
```
mean(i) = mean(i-1) * (i-1)/i + x(i-1)/i
```

In other words, mean(i) = ((i-1) * mean(i-1) + x(i-1)) / i: the previous mean is rescaled from a denominator of i-1 to i before the i-th value's contribution is added. This gives us the correct algorithm for iterative calculation of the mean:
```python
xs = [3, 7, 6]  # Assuming we don't know the length
mean = xs[0]
n = 1
while n < len(xs):
    n += 1
    mean = mean * (n - 1) / n + xs[n - 1] / n
```

While I know that iterative calculation is unnecessary in this example, I found this really handy for implementing a segment-growing algorithm, where I decide which pixels to add to the segment based on the current segment's mean value.
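For a stream of genuinely unknown length, the same recurrence can be wrapped in a small helper. Below is a minimal sketch under that assumption; the `RunningMean` name and its `update` method are mine, not from the original post:

```python
class RunningMean:
    """Incrementally tracks the mean of the values seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        # Algebraically the same as mean*(n-1)/n + x/n,
        # rewritten as mean + (x - mean)/n to reduce rounding error.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean


rm = RunningMean()
for x in [3, 7, 6]:
    print(rm.update(x))  # 3.0, 5.0, 5.333...
```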
Top comments (1)
The normal way is to keep a count of the numbers so far, along with their sum. The sum divided by the count is the mean at any point.
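A minimal sketch of that approach (variable names are mine):

```python
total = 0
count = 0
for x in [3, 7, 6]:
    total += x
    count += 1
    mean = total / count  # mean of everything seen so far
print(mean)  # 5.333...
```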
Further statistics, such as standard deviations, can be calculated in a similar way.
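For a running standard deviation, the usual single-pass technique is Welford's online algorithm; here's a sketch:

```python
import math

count = 0
mean = 0.0
m2 = 0.0  # running sum of squared deviations from the current mean

for x in [3, 7, 6]:
    count += 1
    delta = x - mean
    mean += delta / count
    m2 += delta * (x - mean)  # second factor uses the updated mean

variance = m2 / count  # population variance; use (count - 1) for a sample
print(mean, math.sqrt(variance))  # 5.333..., 1.699...
```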