The discrete N×N box average is the convolution of the input with the kernel K(i, j) = 1/N² for all (i, j) in [−⌊N/2⌋, ⌊N/2⌋]² and 0 elsewhere. Written out for a 3×3 case at output position (x, y):
O(x, y) = (1/9) · [I(x−1,y−1) + I(x,y−1) + I(x+1,y−1) + I(x−1,y) + I(x,y) + I(x+1,y) + I(x−1,y+1) + I(x,y+1) + I(x+1,y+1)]
Two computational notes worth knowing. First, the box average is separable: the 2D N×N box equals the outer product of two 1D boxes of length N. This means running rows first, then columns, costs O(WHN) instead of O(WHN²). Second, the box average admits a constant-time-per-pixel implementation regardless of N: maintain a running sum that adds the entering column and subtracts the exiting one as the window slides. This brings the cost to O(WH) — independent of window size. Most production implementations use this trick.