Coded Image

Definition

A coded image is the result of detection — a map where every pixel that belongs to a detected object is labeled with that object's unique integer ID, and all background pixels are zero. Think of it as a census: instead of seeing brightness, you see identity. Pixel value 47 means "this pixel belongs to nucleus #47." This labeling scheme is what makes it possible to track individual cells through every downstream analysis step.

Identity Map

Every object gets a unique number

Measurement Anchor

Links objects to their measurements

Cross-Engine Reference

Shared identity across all engines

Connectivity Rules

4-connected vs. 8-connected labeling

How It Works

Coded image generation follows detection (thresholding, watershed, or deep learning). Once the binary mask of foreground pixels exists, connected component labeling assigns unique integer IDs:

Scan — The algorithm scans the binary mask row by row. At each foreground pixel, it examines already-visited neighbors.
Label — If no foreground neighbors exist, a new label is assigned. If one neighbor is foreground, its label is copied. If multiple differently-labeled neighbors are foreground, one label is chosen and an equivalence is recorded.
Resolve — After scanning, equivalences are resolved so that each connected component has a single unique label.

The result is a 32-bit integer image where pixel values range from 0 (background) to N (where N is the total number of detected objects). This image can be visualized with pseudo-color (each label mapped to a distinct color) for quality inspection.
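The scan, label, and resolve steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not StrataQuest's implementation: the function name `label_components`, the list-of-lists mask format, and the use of a union-find table for equivalences are all assumptions made for the example.

```python
def label_components(mask, connectivity=8):
    """Two-pass connected component labeling on a 2D list of 0/1 values.

    Hypothetical helper for illustration; returns (coded_image, object_count).
    """
    if connectivity == 8:
        # Neighbors already visited in a row-by-row scan: W, NW, N, NE
        offsets = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]
    else:
        # 4-connectivity: only W and N have been visited
        offsets = [(0, -1), (-1, 0)]

    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(x):
        # Follow parent links to the representative label (with path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Pass 1 (scan + label): assign provisional labels, record equivalences
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = []
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    neighbors.append(labels[rr][cc])
            if not neighbors:
                parent.append(len(parent))       # new provisional label
                labels[r][c] = len(parent) - 1
            else:
                roots = [find(n) for n in neighbors]
                smallest = min(roots)
                labels[r][c] = smallest
                for root in roots:               # record equivalences
                    parent[root] = smallest

    # Pass 2 (resolve): replace provisional labels with final IDs 1..N
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                if root not in final:
                    final[root] = len(final) + 1
                labels[r][c] = final[root]
    return labels, len(final)


mask = [
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
coded, count = label_components(mask)
for row in coded:
    print(row)
```

The U-shaped object on the left first receives two provisional labels (one per arm); the equivalence recorded where the arms meet is what merges them into a single ID in the second pass.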

Simplified

After detection finds which pixels are "foreground" (nuclei, tissue, etc.), the coded image labels each separate group of connected pixels with a unique number. Nucleus #1 gets value 1, nucleus #2 gets value 2, and so on. Background stays at zero. This numbering system lets every downstream analysis refer to specific cells by their ID.

Science Behind It

Connected component labeling (CCL): The fundamental algorithm behind coded images comes from image topology. Two foreground pixels belong to the same object if and only if a path of foreground pixels connects them, where each step of the path moves between neighboring pixels under the chosen connectivity. The classic two-pass algorithm scans the image once to assign provisional labels and record equivalences, then scans again to replace provisional labels with their final equivalents. For an M×N image, this runs in effectively O(M×N) time, linear in the number of pixels, when equivalences are tracked with an efficient union-find structure.

The connectivity dilemma: Consider two foreground pixels touching only at a diagonal corner. Are they the same object? In 4-connectivity, no — only orthogonal neighbors (up, down, left, right) count. In 8-connectivity, yes — diagonal neighbors also count. Neither answer is universally correct. For nuclear detection, 8-connectivity is typical because nuclei are roughly circular and diagonal adjacency usually indicates the same nucleus. For detecting thin structures like membranes, 4-connectivity may be more appropriate to avoid merging parallel structures.
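The diagonal-corner case can be checked directly. The sketch below counts objects with a simple flood fill under each neighborhood definition; `count_objects`, `N4`, and `N8` are hypothetical names chosen for this example, and flood fill is used here only because it is short, not because any particular engine uses it.

```python
from collections import deque

# Neighbor offsets as (row, column) steps
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # orthogonal only
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # orthogonal + diagonal


def count_objects(mask, offsets):
    """Count connected foreground groups using breadth-first flood fill."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            count += 1                    # found an unvisited object
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Two foreground pixels touching only at a corner
diagonal = [
    [1, 0],
    [0, 1],
]
objects_4 = count_objects(diagonal, N4)  # 2: the pixels are separate objects
objects_8 = count_objects(diagonal, N8)  # 1: the pixels form one object
print(objects_4, objects_8)
```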

Why integer labels, not colors? A coded image stores identity, not appearance. Unsigned 32-bit integers allow up to ~4 billion unique labels per image (about 2 billion if the values are stored as signed integers), far more than enough for any tissue section. The integer representation enables direct lookup: "give me all pixels of object #47" is a simple equality test on the coded image, and the same ID serves as the key for that object's measurements. Colors would require a collision-free color assignment, and matching them back to objects would break under any lossy conversion.
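As a small illustration of the equality-test lookup, the sketch below selects the pixels of one object from a toy coded image. The helper name `pixels_of` and the list-of-lists layout are assumptions for the example only.

```python
def pixels_of(coded, object_id):
    """Return (row, column) positions whose label equals object_id."""
    return [
        (r, c)
        for r, row in enumerate(coded)
        for c, value in enumerate(row)
        if value == object_id          # identity lookup is a plain equality test
    ]


coded = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
]
object_2 = pixels_of(coded, 2)
print(object_2)

# Largest label an unsigned 32-bit pixel can hold (0 is reserved for background)
max_label_unsigned = 2**32 - 1
print(max_label_unsigned)
```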

From topology to biology: The coded image bridges the gap between "what does the tissue look like" (the fluorescence image) and "what does the tissue contain" (individual cells with measurements). This is conceptually similar to the difference between a photograph of a crowd and a census — both describe the same scene, but the census identifies each individual and tracks their attributes.

Simplified

Connected component labeling scans the detection result and groups touching pixels together. Each group gets a unique number. The key decision is connectivity: should diagonally-touching pixels count as the same object? For round objects like nuclei, including diagonals (8-connectivity) usually makes sense. The integer labeling system lets the software track up to billions of individual objects per image.

Practical Example

After Nuclei Detection runs on a DAPI-stained tissue section containing 12,000 cells:

The coded image contains values from 0 (background) to 12,000
Each nucleus's pixels carry its unique label (e.g., all pixels of nucleus #5,847 have value 5847)
Standard Measurements reads this coded image plus the original fluorescence channels to compute per-nucleus area, mean intensity in each channel, shape descriptors, and centroid coordinates
Phenotyping reads the same coded image to classify each nucleus based on its measured biomarker expression
Spatial analysis uses the centroids computed from the coded image to derive distances between cells, density maps, and neighborhood compositions

The coded image is the single source of truth for cell identity throughout the entire analysis pipeline.
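To make the measurement step concrete, the sketch below combines a toy coded image with one intensity channel and computes area, mean intensity, and centroid per object. It is an assumption-laden illustration: `measure_objects`, the `dapi` array, and the output dictionary layout are hypothetical and do not describe how Standard Measurements is actually implemented.

```python
def measure_objects(coded, channel):
    """Accumulate per-object statistics keyed by the coded-image label."""
    sums = {}
    for r, row in enumerate(coded):
        for c, obj in enumerate(row):
            if obj == 0:
                continue  # background carries no object identity
            s = sums.setdefault(obj, {"area": 0, "total": 0.0, "rows": 0, "cols": 0})
            s["area"] += 1
            s["total"] += channel[r][c]
            s["rows"] += r
            s["cols"] += c
    return {
        obj: {
            "area": s["area"],                                    # pixel count
            "mean_intensity": s["total"] / s["area"],
            "centroid": (s["rows"] / s["area"], s["cols"] / s["area"]),  # (row, col)
        }
        for obj, s in sums.items()
    }


coded = [
    [1, 1, 0],
    [0, 0, 2],
    [0, 2, 2],
]
dapi = [            # hypothetical intensity values for one channel
    [10, 20, 0],
    [0, 0, 30],
    [0, 60, 90],
]
measurements = measure_objects(coded, dapi)
print(measurements[1])
```

Because every result is keyed by the object ID, a phenotyping or spatial step can join its own output to these measurements without touching the pixels again.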

Simplified

When 12,000 nuclei are detected, the coded image labels them 1 through 12,000. Every subsequent step — measuring biomarker intensity, classifying cell types, analyzing spatial relationships — uses this numbering to know which cell is which. Change the detection, and every downstream result updates accordingly.

Connected Terms

Deep Learning Nuclei Detection
Engines
Grow
Layers
Manual Correction
Membrane
Nuclei Detection
Remove Objects