Histogram as probability density (Gonzalez & Woods): The normalized histogram p(r_k) = n_k/N, where n_k is the number of pixels (or cells) with intensity r_k and N is the total count, estimates the probability distribution of intensities. This statistical interpretation underpins histogram-based thresholding: Otsu's method treats the histogram as a mixture of two classes and picks the threshold that maximizes the between-class variance. The histogram is therefore not just a visualization tool; it is the empirical estimate of the measurement's statistical distribution.
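To make the probabilistic reading concrete, the following is a minimal sketch of a normalized histogram and an Otsu-style threshold computed from it. The function names, the 256-level assumption, and the synthetic two-population data are illustrative choices, not taken from Gonzalez & Woods.

```python
import numpy as np

def normalized_histogram(values, levels=256):
    """Return p(r_k) = n_k / N for integer intensities in [0, levels)."""
    counts = np.bincount(np.asarray(values).ravel(), minlength=levels)
    return counts / counts.sum()

def otsu_threshold(p):
    """Return the level k that maximizes between-class variance for histogram p."""
    levels = np.arange(p.size)
    w0 = np.cumsum(p)                 # probability of class 0 (levels <= k)
    w1 = 1.0 - w0                     # probability of class 1 (levels > k)
    cum_mean = np.cumsum(p * levels)  # first moment accumulated up to level k
    total_mean = cum_mean[-1]
    # Between-class variance; undefined (0/0) where one class is empty.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b2 = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    sigma_b2 = np.nan_to_num(sigma_b2, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b2))

# Synthetic two-population intensities (hypothetical values for illustration).
rng = np.random.default_rng(0)
dim = rng.normal(60, 10, 5000)
bright = rng.normal(180, 15, 5000)
pixels = np.clip(np.concatenate([dim, bright]), 0, 255).astype(np.uint8)

p = normalized_histogram(pixels)
t = otsu_threshold(p)
print("threshold:", t)
```

With two well-separated populations like these, the threshold should fall in the gap between them; on real data the result depends heavily on how distinct the populations are.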
Histogram equalization: Gonzalez & Woods describe histogram equalization as a transform that spreads the histogram across the available gray levels. The transform is the cumulative distribution function (CDF) scaled to the output range: s_k = (L - 1) · CDF(r_k), where L is the number of gray levels and CDF(r_k) is the sum of p(r_j) for j ≤ k. This is relevant to tissue analysis when comparing samples with different staining intensity ranges: equalization normalizes the dynamic range, but because the mapping is nonlinear and differs per image, it also distorts the proportional relationship between intensity and expression.
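The CDF mapping can be sketched as a lookup table. This is a minimal illustration assuming 8-bit integer images and the common scaling of the CDF by (levels - 1); the function name and the synthetic low-contrast image are hypothetical.

```python
import numpy as np

def equalize(image, levels=256):
    """Map each gray level r_k to s_k = round((levels - 1) * CDF(r_k))."""
    counts = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(counts) / image.size           # empirical CDF per gray level
    lut = np.round((levels - 1) * cdf).astype(np.uint8)
    return lut[image]                              # apply the lookup table per pixel

# Hypothetical low-contrast image: intensities confined to 100..139.
rng = np.random.default_rng(1)
image = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
equalized = equalize(image)

print("input range: ", image.min(), image.max())
print("output range:", equalized.min(), equalized.max())
```

Because the lookup table is built from each image's own CDF, two images with different staining ranges receive different mappings, which is why intensity ratios are not preserved after equalization.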
Gray levels and noise (Pawley): The number of meaningful gray levels in a histogram depends on SNR: g = 1 + SNR. For photon-limited (Poisson) detection, SNR = sqrt(n) for n photons, so 100 photons per pixel gives SNR = 10 and about 11 distinguishable levels. At 25 photons/pixel the formula gives 6 levels, so below that the histogram has only about 5 meaningful bins or fewer. This sets a fundamental limit on the resolution of any histogram-based analysis: if the measurement doesn't have enough precision to fill more than a few bins, the histogram cannot reveal fine population structure.
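The arithmetic above can be checked with a few lines. This sketch assumes pure Poisson (shot) noise, so that SNR = sqrt(photons); read noise and background are ignored, and the function name is illustrative.

```python
import math

def meaningful_gray_levels(photons_per_pixel):
    """g = 1 + SNR, with SNR = sqrt(photons) under Poisson (shot) noise only."""
    snr = math.sqrt(photons_per_pixel)
    return 1.0 + snr

for n in (9, 25, 100, 400):
    print(n, meaningful_gray_levels(n))
# 9 4.0
# 25 6.0
# 100 11.0
# 400 21.0
```

Quadrupling the photon count only doubles the SNR, so gaining histogram resolution through longer exposure is expensive.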
Bimodality as diagnostic: A truly bimodal histogram indicates two distinct populations, which is the basis for reliable binary gating. Hartigan's dip test provides a formal statistical check: its null hypothesis is that the distribution is unimodal, so a significant result (for example p < 0.05) is evidence of more than one mode, and a threshold placed in the valley between the modes is meaningful. If unimodality cannot be rejected, forcing a binary gate creates an artificial division of a continuous distribution.
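As a quick screen before gating, a simpler heuristic can be computed with NumPy alone. The sketch below uses Sarle's bimodality coefficient, which is not Hartigan's dip test and yields no p-value; it is swapped in here only because it needs no extra dependencies. It uses plain moment estimators of skewness and kurtosis (a simplification), and the 5/9 cutoff is the value this coefficient takes for a uniform distribution. The synthetic data are hypothetical.

```python
import numpy as np

def bimodality_coefficient(x):
    """Sarle's bimodality coefficient; values above 5/9 hint at bimodality."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = (x - x.mean()) / x.std()          # standardize with population std
    skew = np.mean(z ** 3)                # moment-based skewness
    excess_kurt = np.mean(z ** 4) - 3.0   # moment-based excess kurtosis
    correction = 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skew ** 2 + 1.0) / (excess_kurt + correction)

# Hypothetical marker intensities: one mixed population, one single population.
rng = np.random.default_rng(2)
two_pops = np.concatenate([rng.normal(60, 12, 5000), rng.normal(180, 12, 5000)])
one_pop = rng.normal(120, 30, 10000)

bc_two = bimodality_coefficient(two_pops)
bc_one = bimodality_coefficient(one_pop)
print("two populations:", round(bc_two, 3), "| one population:", round(bc_one, 3))
```

A heavily skewed unimodal distribution can also exceed the cutoff, so a high coefficient is a prompt to look at the histogram and run a formal test, not a substitute for one.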
The histogram is the empirical probability distribution of your measurement. Otsu's method and other histogram-based thresholding operate on this distribution, looking for natural boundaries between populations. The histogram's resolution (how many meaningful bins it has) depends on the measurement's precision, which depends on how many photons were collected. A well-separated bimodal histogram is the best-case scenario for gating; a featureless unimodal histogram means the marker may not divide cells into meaningful categories.