Start with the decomposition. For any image I and any low-pass operator L, you can write I = L(I) + (I − L(I)): the image equals its low-frequency part plus its high-frequency residual. This is an identity for any L; the choice of L determines what "low frequency" means. For Unsharp, L is convolution with the Gaussian G_σ(x, y) = (1/(2πσ²)) · exp(−(x² + y²)/(2σ²)).
The Unsharp output is:
O = I + α · (I − G_σ * I) = (1 + α) · I − α · (G_σ * I)
This is a linear filter — specifically a convolution with the kernel K = (1 + α)δ − α · G_σ, where δ is the unit impulse. The kernel sum equals (1 + α) · 1 − α · 1 = 1 (using ∑G_σ = 1 since the Gaussian is normalized). Brightness is preserved.
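The equivalence between the two-step form and the single kernel K can be checked numerically. The following is a minimal sketch, assuming a 1D signal for brevity (the 2D Gaussian is separable, so the same argument applies per axis), a kernel truncated at about 3σ and renormalized, and edge replication at the borders. The function names and the test signal are illustrative, not taken from any particular library.

```python
import math

def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1D Gaussian, renormalized so the taps sum to 1."""
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # assumed truncation at ~3 sigma
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def unsharp_kernel_1d(sigma, alpha, radius=None):
    """K = (1 + alpha) * delta - alpha * G_sigma on the same support as G."""
    g = gaussian_kernel_1d(sigma, radius)
    k = [-alpha * w for w in g]
    k[len(k) // 2] += 1 + alpha  # the unit impulse sits at the center tap
    return k

def convolve_clamped(signal, kernel):
    """Convolution with edge replication (kernel is symmetric, so no flip)."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

sigma, alpha = 1.5, 0.8
g = gaussian_kernel_1d(sigma)
k = unsharp_kernel_1d(sigma, alpha)

signal = [10.0] * 8 + [50.0] * 8           # a dark-to-bright step
blurred = convolve_clamped(signal, g)
two_step = [s + alpha * (s - b) for s, b in zip(signal, blurred)]  # I + alpha * (I - G*I)
one_step = convolve_clamped(signal, k)                             # K * I

print("kernel sum:", sum(k))
print("max difference between forms:",
      max(abs(p - q) for p, q in zip(two_step, one_step)))
```

Because the taps of K sum to 1, a constant signal passes through unchanged, which is the discrete counterpart of the brightness-preservation statement.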
In the frequency domain, taking the Fourier transform of both sides:
Ô(ω) = (1 + α − α · exp(−σ²ω²/2)) · Î(ω)
This frequency response equals 1 at DC (ω = 0) and rises smoothly toward 1 + α as ω grows, where ω is the radial angular frequency. The half-rise point, where the gain is 1 + α/2, is at σω = √(2 · ln 2) ≈ 1.18, i.e., a spatial period of 2π/ω ≈ 5.3σ. Frequencies well below that point (periods much longer than 5.3σ) pass through nearly unmodified; frequencies above it (shorter periods) are progressively amplified, up to the ceiling of 1 + α.
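To make the shape of this response concrete, here is a small sketch that evaluates the gain at a few spatial periods. It assumes the continuous-domain formula above, with ω in radians per pixel; the function name and the sample values of σ and α are illustrative choices.

```python
import math

def unsharp_gain(omega, sigma, alpha):
    """Gain of the continuous unsharp filter at angular frequency omega."""
    return 1 + alpha - alpha * math.exp(-(sigma * omega) ** 2 / 2)

sigma, alpha = 2.0, 1.0

# Half-rise: the Gaussian term equals 1/2, so sigma * omega = sqrt(2 ln 2).
omega_half = math.sqrt(2 * math.log(2)) / sigma
period_half = 2 * math.pi / omega_half  # corresponding spatial period in pixels

for period in (40.0, 20.0, period_half, 5.0, 2.0):
    omega = 2 * math.pi / period
    print(f"period {period:6.2f} px   gain {unsharp_gain(omega, sigma, alpha):.3f}")
```

Long periods give a gain close to 1, and the shortest periods approach 1 + α.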
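The overshoot at a step edge has an analytical form. For an idealized step I(x) = a + (b−a)·H(x) (H = Heaviside step), the Gaussian-blurred version is L(x) = a + (b−a)·Φ(x/σ) (Φ = standard normal CDF). The high-pass residual I − L = (b−a)·(H(x) − Φ(x/σ)) peaks just to the bright side of x = 0 and dips just to the dark side. Because Φ(0) = 1/2, the peak amplitude is (b−a)/2 regardless of σ, and the residual decays back toward zero over a distance of a few σ. After multiplication by α, the overshoot on the output is α · (b−a)/2 on each side of the edge. The blurred edge itself has maximum slope (b−a)/(σ√(2π)) ≈ 0.4(b−a)/σ at x = 0, so smaller σ also means a steeper transition. For an ideal step, then, α sets the height of the overshoot and σ sets its width: smaller σ gives a narrow, localized overshoot, while larger σ gives a broad halo. This is why σ controls the character of sharpening, not just the amount.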
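As a numerical check on the step-edge case, the sketch below evaluates the unsharp output in closed form for several values of σ. It assumes a 1D ideal step at x = 0 and uses Φ computed from the error function; the helper names and sample values are illustrative. Just to the right of the edge the residual is (b−a)·(1 − Φ(0)) = (b−a)/2 for every σ, while the position at which it has decayed to a given fraction scales with σ.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def unsharp_step(x, a, b, sigma, alpha):
    """Unsharp output at x != 0 for an ideal step from a to b located at x = 0."""
    original = b if x > 0 else a
    blurred = a + (b - a) * phi(x / sigma)
    return original + alpha * (original - blurred)

a, b, alpha = 20.0, 100.0, 0.5
eps = 1e-9  # a point just to the bright side of the edge

for sigma in (0.5, 2.0, 8.0):
    at_edge = unsharp_step(eps, a, b, sigma, alpha) - b        # overshoot next to the edge
    at_sigma = unsharp_step(sigma, a, b, sigma, alpha) - b     # overshoot one sigma away
    print(f"sigma {sigma:4.1f}   overshoot at edge {at_edge:6.2f}   "
          f"at x = sigma {at_sigma:6.2f}")
```

The overshoot next to the edge is the same for all three values of σ, and so is the overshoot one σ away, which shows that σ stretches the halo horizontally without changing its height.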