Visualization

Histogram

Frequency distribution of a single measurement

Definition
A histogram shows the distribution of a single measurement across all cells — how many cells have intensity 50, how many have intensity 100, how many have intensity 200. It reveals the shape of the population: a single peak (one population), two peaks (positive and negative), a long tail (heterogeneous expression), or a plateau (uniform distribution). Histograms are the simplest and most fundamental visualization in quantitative tissue analysis — the starting point for setting gates, evaluating staining quality, and understanding population structure.
  • Bimodal histograms threshold cleanly: two peaks with a valley between them.
  • Spikes at the edges mean saturation: pixels stuck at 0 or at the maximum.
  • Sparse histograms mean the bit depth isn't earning its keep: lots of empty bins between the populated ones.
  • Look at the histogram before deciding anything: it's the diagnostic before the prescription.
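
The edge-spike and empty-bin checks can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the helper name histogram_health, the 8-bit range, and the pixel values are all assumptions made for the example.

```python
from collections import Counter

def histogram_health(pixels, max_value=255):
    """Return (fraction at 0, fraction at max_value, fraction of empty bins).

    Hypothetical helper for illustration: assumes integer pixel values
    in the range 0..max_value, with one bin per gray level.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    n = len(pixels)
    low_clip = counts[0] / n            # pixels stuck at 0
    high_clip = counts[max_value] / n   # pixels stuck at the maximum
    # Count empty bins only between the lowest and highest occupied levels.
    lo, hi = min(counts), max(counts)
    span = hi - lo + 1
    empty = sum(1 for level in range(lo, hi + 1) if counts[level] == 0)
    return low_clip, high_clip, empty / span

# Made-up pixel values for illustration.
pixels = [0, 0, 40, 40, 80, 80, 120, 255, 255, 255]
low_clip, high_clip, empty_fraction = histogram_health(pixels)
```

Large values of low_clip or high_clip point to saturation; an empty_fraction close to 1 means most gray levels between the extremes are never used.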

How It Works

The Histogram engine visualizes the distribution of any per-cell measurement:

  1. Measurement selection — Choose any column from the measurement table: marker intensity, area, compactness, derived value.
  2. Binning — The measurement range is divided into bins of equal width. Each bin counts the number of cells with measurement values falling in that range.
  3. Display — Bar height represents the count (or frequency) of cells in each bin. Optional overlays: fitted distributions, gate positions, population coloring by phenotype.
  4. Statistics — Compute and display mean, median, standard deviation, coefficient of variation, percentiles, and modality (number of peaks).
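
The binning and statistics steps above can be sketched with the Python standard library alone. The function name equal_width_histogram and the intensity values are assumptions for illustration; a production tool would more likely rely on a numerical library.

```python
import statistics

def equal_width_histogram(values, n_bins):
    """Count values into n_bins bins of equal width spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all values identical: avoid dividing by zero below
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin, not to a bin past the end.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return counts, edges

# Made-up per-cell mean intensities (one value per cell).
intensities = [28, 30, 31, 33, 35, 140, 148, 150, 152, 160]

counts, edges = equal_width_histogram(intensities, n_bins=4)
frequencies = [c / len(intensities) for c in counts]  # count -> frequency

mean = statistics.mean(intensities)
median = statistics.median(intensities)
stdev = statistics.stdev(intensities)   # sample standard deviation
cv = stdev / mean                       # coefficient of variation
```

With these values the two outer bins hold all the cells and the two middle bins are empty, which is the bimodal shape the display step would draw.
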
Simplified

A histogram counts how many cells have each measurement value and displays the distribution as bars. The shape reveals population structure — peaks are cell populations, valleys are natural classification boundaries, and the width shows measurement variability.

Science Behind It

Histogram as probability density (Gonzalez & Woods): The normalized histogram p(r_k) = n_k/N, where n_k is the count in bin k and N is the total count, approximates the probability density function of pixel (or cell) intensities. This statistical interpretation underpins histogram-based thresholding: Otsu's method splits the normalized histogram into two classes and picks the threshold that maximizes the between-class variance (equivalently, minimizes the within-class variance). The histogram is therefore not just a visualization tool; it is the empirical estimate of the measurement's statistical distribution.
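
A minimal sketch of Otsu's criterion applied to a list of bin counts is shown below. It uses the standard between-class variance formulation; the function name otsu_threshold and the bin counts are assumptions for illustration.

```python
def otsu_threshold(counts):
    """Return the bin index t that maximizes between-class variance.

    Class 0 is bins 0..t, class 1 is bins t+1..end.
    """
    total = sum(counts)
    p = [c / total for c in counts]   # normalized histogram p(r_k) = n_k / N
    global_mean = sum(k * pk for k, pk in enumerate(p))

    best_t, best_variance = 0, -1.0
    w0 = 0.0        # cumulative probability of class 0
    cum_mean = 0.0  # cumulative first moment of class 0
    for t, pt in enumerate(p[:-1]):
        w0 += pt
        cum_mean += t * pt
        w1 = 1.0 - w0
        if w0 <= 0.0 or w1 <= 0.0:
            continue  # one class is empty, so the variance is undefined
        between = (global_mean * w0 - cum_mean) ** 2 / (w0 * w1)
        if between > best_variance:
            best_t, best_variance = t, between
    return best_t

# Made-up bimodal histogram: one peak around bin 2, another around bin 7.
counts = [0, 2, 10, 2, 0, 0, 3, 9, 3, 0]
threshold = otsu_threshold(counts)
```

For this input the selected threshold falls in the empty valley between the two peaks, so every bin of the left peak lands in class 0 and every bin of the right peak in class 1.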

Histogram equalization: Gonzalez & Woods describe histogram equalization as a transform that spreads the histogram to use all available gray levels. The transform is the cumulative distribution function (CDF) scaled to the output range: s_k = (L - 1) · CDF(r_k), where L is the number of gray levels. This is relevant to tissue analysis when comparing samples with different staining intensity ranges. Equalization normalizes the dynamic range, but it also distorts the proportional relationship between intensity and expression, so equalized intensities should not be read as quantitative expression levels.
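
As a sketch of the transform, the snippet below maps each gray level through the CDF scaled to the output range of 0 to levels - 1. The function name equalize and the pixel values are assumptions for illustration, and the input is assumed to hold integers in 0..levels-1.

```python
def equalize(pixels, levels=256):
    """Map gray levels through the scaled CDF: s_k = (levels - 1) * CDF(r_k)."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    n = len(pixels)
    mapping = []
    running = 0  # cumulative count up to and including the current level
    for c in counts:
        running += c
        mapping.append(round((levels - 1) * running / n))
    return [mapping[v] for v in pixels]

# Made-up low-contrast pixels packed into a narrow intensity range.
pixels = [50, 50, 51, 52, 52, 52, 53, 60]
equalized = equalize(pixels)
```

The input spans only 50 to 60, while the output is stretched up to 255. The spacing between output values follows the cumulative counts rather than the original intensity differences, which is the distortion of proportionality noted above.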

Gray levels and noise (Pawley): The number of meaningful gray levels in a histogram depends on SNR: g = 1 + SNR. For photon-limited (Poisson) detection, SNR = sqrt(N) for N photons per pixel, so 100 photons give SNR = 10 and ~11 distinguishable levels. At 25 photons/pixel, SNR = 5 and only ~6 levels remain; below that, the histogram has even fewer meaningful bins. This sets a fundamental limit on the resolution of any histogram-based analysis: if the measurement doesn't have enough precision to fill more than a few bins, the histogram cannot reveal fine population structure.
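
The arithmetic can be checked with a short sketch. It assumes pure shot (Poisson) noise, so that SNR equals the square root of the photon count; the function name meaningful_gray_levels is hypothetical.

```python
import math

def meaningful_gray_levels(photons_per_pixel):
    """g = 1 + SNR, assuming shot-noise-limited detection (SNR = sqrt(N))."""
    snr = math.sqrt(photons_per_pixel)
    return 1 + snr

# Tabulate a few photon counts to see how quickly the level count shrinks.
for n in (400, 100, 25, 16):
    print(n, "photons/pixel ->", meaningful_gray_levels(n), "levels")
```

Read noise and background would lower the SNR further, so these figures are best read as upper bounds.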

Bimodality as diagnostic: A clearly bimodal histogram suggests two distinct populations, which is the basis for reliable binary gating. Hartigans' dip test provides a formal check: its null hypothesis is that the distribution is unimodal, so a significant result (p < 0.05) is evidence of more than one mode, and a threshold placed in the valley between the modes is meaningful. If unimodality cannot be rejected, forcing a binary gate creates an artificial division of a continuous distribution.

Simplified

The histogram is the empirical probability distribution of your measurement. Otsu's method and all threshold-based analysis work on this distribution — they try to find natural boundaries between populations. The histogram's resolution (how many meaningful bins it has) depends on the measurement's precision, which depends on how many photons were collected. A well-separated bimodal histogram is the best-case scenario for gating; a featureless unimodal histogram means the marker may not divide cells into meaningful categories.

Practical Example

Evaluating Ki-67 staining quality before analysis:

  • Good staining: Bimodal histogram with clear valley at intensity 80 — negative population peaks at 30, positive population peaks at 150. Gate at 80 cleanly separates the two.
  • Poor staining: Unimodal histogram peaking at 50 with a long right tail — no clear positive population. Forcing a gate at any point divides a continuous distribution arbitrarily.
  • Too much background: Bimodal but with the negative peak at 120 (high background) and positive peak at 180 — populations overlap extensively. Background Removal needed before meaningful gating.

The histogram reveals the problem before you invest time in analysis. A quick histogram check is the single best quality control step in any tissue analysis workflow.
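
The quality check can be approximated with a simple peak-and-valley heuristic, sketched below. This is not Hartigans' dip test and not necessarily how any particular tool does it; the function names, the min_height cutoff, the 10-unit bin width, and both sets of bin counts are assumptions made up to mirror the good and poor staining cases. Real histograms are noisy and would usually need smoothing first.

```python
def find_peaks(counts, min_height):
    """Indices of strict local maxima that are at least min_height tall."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c >= min_height and c > left and c > right:
            peaks.append(i)
    return peaks

def describe_staining(counts, bin_width, min_height=5):
    """Report peak positions and a valley gate, or no gate if unimodal."""
    peaks = find_peaks(counts, min_height)
    if len(peaks) < 2:
        return {"modes": len(peaks), "gate": None}
    # Keep the two tallest peaks, ordered left to right.
    first, second = sorted(sorted(peaks, key=lambda i: counts[i])[-2:])
    # The gate goes at the lowest bin between the two peaks.
    valley = min(range(first + 1, second), key=lambda i: counts[i])
    return {
        "modes": 2,
        "negative_peak": first * bin_width,
        "positive_peak": second * bin_width,
        "gate": valley * bin_width,
    }

# Made-up counts per 10-unit intensity bin (bin i starts at intensity 10 * i).
good_counts = [0, 5, 30, 60, 35, 15, 8, 4, 1, 3, 6,
               10, 18, 28, 38, 45, 36, 20, 9, 3, 0]
poor_counts = [0, 4, 15, 35, 55, 62, 50, 38, 29, 22, 17,
               13, 10, 8, 6, 5, 4, 3, 2, 1, 1]

good = describe_staining(good_counts, bin_width=10)
poor = describe_staining(poor_counts, bin_width=10)
```

For the bimodal example the sketch reports peaks at 30 and 150 with the gate at 80; for the unimodal example it finds a single peak and declines to propose a gate.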

Simplified

Check the histogram before analyzing. Two clear peaks with a valley between them? Good staining — set the gate in the valley. One broad peak with no valley? Problem — the marker may not provide useful classification. High background shifting everything bright? Apply Background Removal first. The histogram is your first and most important quality check.
