Phenotypes | StrataQuest Glossary

Definition

A phenotype is a cell's biological identity defined by its combination of marker expression — CD3+CD8+PD-1+ is a different phenotype than CD3+CD8+PD-1−, even though both are cytotoxic T cells. The Phenotypes engine combines gating results from multiple markers into composite cell type definitions, translating raw marker positivity patterns into biologically meaningful categories. Every cell in the dataset receives a phenotype label that captures its full marker expression profile.

Composite Identity

Multi-marker cell type definition

Hierarchical Organization

Phenotype families and subtypes

Gate-Based Construction

Built from individual marker gates

Spatial Analysis Input

Cell types for spatial phenotyping

How It Works

The Phenotypes engine combines multiple gating results into cell type assignments:

Define phenotypes — Each phenotype has a name and a Boolean expression using gate results. Example: "Cytotoxic T" = CD3_positive AND CD8_positive AND CK_negative.
Evaluate — For each cell, all gate results are known (positive/negative per marker). The engine evaluates each phenotype definition against the cell's gate profile.
Assign — The first matching phenotype (in priority order) is assigned. If no defined phenotype matches, the cell is labeled "Unclassified" or assigned to a default category.
Summarize — Population statistics are computed: total count and percentage for each phenotype in each ROI. These summaries are the primary quantitative output for many clinical applications.
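
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not StrataQuest's implementation: the `PHENOTYPES` table, the `assign_phenotype` and `summarize` helpers, and the dictionary layout of each cell are all hypothetical names chosen for the example.

```python
from collections import Counter

# Hypothetical phenotype table in priority order: (name, Boolean expression).
# Each expression takes a cell's gate profile (marker -> True/False).
PHENOTYPES = [
    ("Cytotoxic T", lambda g: g["CD3"] and g["CD8"] and not g["CK"]),
    ("Tumor", lambda g: g["CK"]),
]

def assign_phenotype(gates, phenotypes=PHENOTYPES, default="Unclassified"):
    """Return the first phenotype (in priority order) that matches the gate profile."""
    for name, matches in phenotypes:
        if matches(gates):
            return name
    return default

def summarize(cells):
    """Return {roi: {phenotype: (count, percent)}} for a list of cells."""
    counts = {}
    for cell in cells:
        label = assign_phenotype(cell["gates"])
        counts.setdefault(cell["roi"], Counter())[label] += 1
    summary = {}
    for roi, counter in counts.items():
        total = sum(counter.values())
        summary[roi] = {name: (n, 100.0 * n / total) for name, n in counter.items()}
    return summary

# Four illustrative cells in one ROI (made-up gate results).
cells = [
    {"roi": "ROI-1", "gates": {"CD3": True, "CD8": True, "CK": False}},
    {"roi": "ROI-1", "gates": {"CD3": False, "CD8": False, "CK": True}},
    {"roi": "ROI-1", "gates": {"CD3": False, "CD8": False, "CK": True}},
    {"roi": "ROI-1", "gates": {"CD3": False, "CD8": False, "CK": False}},
]
summary = summarize(cells)
print(summary)
```

Keeping the definitions in an ordered list makes the priority rule explicit: moving an entry up or down changes which label an ambiguous cell receives.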

Simplified

Phenotypes combine individual marker gates (CD3 positive/negative, CD8 positive/negative, etc.) into composite cell type labels. Each cell's full marker profile determines its phenotype — CD3+CD8+ is a cytotoxic T cell, CK+Ki67+ is a proliferating tumor cell. Population percentages and counts are computed per phenotype per region.

Science Behind It

Classification as partition: MIT's Statistical Models chapter frames classification as partitioning feature space into decision regions. Each phenotype definition creates a region in the multi-dimensional marker space: the set of all cells with the specified marker combination. Because assignment takes the first match in priority order and falls back to "Unclassified" when nothing matches, the definitions collectively partition the marker space, so every cell belongs to exactly one phenotype. Well-designed phenotype panels produce partitions that align with biologically distinct cell populations.

The combinatorial explosion: With n binary markers, there are 2ⁿ possible marker combinations. A 6-marker panel has 64 possible phenotypes; a 10-marker panel has 1,024. In practice, most combinations are biologically implausible (a cell is not expected to be both CD3+ and CD20+ in normal biology) or extremely rare. Effective phenotype design focuses on the 5-15 biologically meaningful combinations rather than trying to characterize all possible ones.
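
The 2ⁿ count can be checked by enumerating every positive/negative profile. The sketch below uses the six markers of a typical immuno-oncology panel as an assumed example and applies only the single plausibility rule mentioned above (CD3+ together with CD20+); the helper name `count_profiles` is hypothetical.

```python
from itertools import product

def count_profiles(n_markers):
    """Number of distinct positive/negative profiles for n binary markers."""
    return 2 ** n_markers

# Assumed example panel; any list of marker names works the same way.
markers = ["CD3", "CD8", "CD20", "FOXP3", "PD-L1", "CK"]

# Every possible gate profile as a dict of marker -> True/False.
profiles = [
    dict(zip(markers, bits))
    for bits in product([False, True], repeat=len(markers))
]

# Profiles ruled out by the one rule from the text: CD3+ and CD20+ together.
implausible = [p for p in profiles if p["CD3"] and p["CD20"]]

print(len(profiles), count_profiles(10), len(implausible))  # prints: 64 1024 16
```

Even this one rule removes a quarter of the 64 profiles, which illustrates why only a small subset of combinations ends up as named phenotypes.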

Statistical methods for threshold optimization (Dilbilir): The quality of phenotype assignments depends entirely on the quality of individual marker gates. Each gate threshold determines the boundary between positive and negative for one marker. Dilbilir's statistical framework emphasizes that threshold optimization should minimize classification error across the population — not just separate the most obvious positive from the most obvious negative cells. Cells near the threshold (the "dim positive" or "equivocal" population) are the most error-prone and most important to get right.
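
To make "minimize classification error across the population" concrete, here is a minimal sketch that sweeps candidate thresholds over a set of cells with known reference labels and keeps the one with the fewest misclassifications. It is a generic brute-force search, not the framework attributed to Dilbilir, and it assumes reference labels are available (for example from manual annotation); the intensities, labels, and function names are made up for illustration.

```python
def classification_error(threshold, intensities, is_positive):
    """Fraction of cells whose gate call (intensity >= threshold) disagrees with the label."""
    wrong = sum((x >= threshold) != truth for x, truth in zip(intensities, is_positive))
    return wrong / len(intensities)

def best_threshold(intensities, is_positive):
    """Try midpoints between neighbouring intensity values; return the lowest-error one."""
    values = sorted(set(intensities))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: classification_error(t, intensities, is_positive))

# Made-up marker intensities with reference labels. The cell at 45 is a
# bright negative sitting above two dim positives (35 and 40), so no
# threshold separates the groups perfectly.
intensities = [5, 8, 12, 30, 35, 40, 45, 90, 110, 150]
is_positive = [False, False, False, False, True, True, False, True, True, True]

threshold = best_threshold(intensities, is_positive)
error = classification_error(threshold, intensities, is_positive)
print(threshold, error)  # prints: 32.5 0.1
```

A threshold placed between the obvious populations (for example 67.5) would misclassify both dim positives, whereas the error-minimizing threshold misclassifies only the one bright negative.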

Simplified

Phenotypes partition cells into biologically meaningful groups based on marker combinations. A 6-marker panel could theoretically produce 64 phenotypes, but biology restricts the meaningful ones to 5-15 cell types. The accuracy depends on how well each individual marker gate separates positive from negative — the cells near the threshold are where most classification errors occur.

Practical Example

Phenotyping a 6-plex immuno-oncology panel (CD3, CD8, CD20, FOXP3, PD-L1, CK):

Tumor PD-L1+: CK+ AND PD-L1+ (16% of all cells)
Tumor PD-L1−: CK+ AND PD-L1− (34%)
Cytotoxic T: CD3+ AND CD8+ AND CK− (8%)
Helper T: CD3+ AND CD8− AND CK− (6%)
Regulatory T: CD3+ AND FOXP3+ AND CK− (2%)
B cell: CD20+ AND CD3− AND CK− (4%)
Other: remaining cells (30%)

These 7 phenotypes capture the major cell populations relevant to immunotherapy response prediction. Tumor cells make up half of all cells, and about one third of them are PD-L1+ (16 of 50 percentage points, or 32%); cytotoxic T cells outnumber regulatory T cells 4:1. Both figures are directly clinically informative.
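
The derived figures follow directly from the percentages in the listing. A short sketch of the arithmetic, with variable names chosen for illustration:

```python
# Percentages of all cells, copied from the panel listing.
percent = {
    "Tumor PD-L1+": 16,
    "Tumor PD-L1-": 34,
    "Cytotoxic T": 8,
    "Helper T": 6,
    "Regulatory T": 2,
    "B cell": 4,
    "Other": 30,
}

# The seven categories should account for every cell.
total = sum(percent.values())

# Share of tumor cells (CK+) that are PD-L1+.
tumor_total = percent["Tumor PD-L1+"] + percent["Tumor PD-L1-"]
pdl1_positive_fraction = percent["Tumor PD-L1+"] / tumor_total

# Ratio of cytotoxic to regulatory T cells.
cd8_to_treg = percent["Cytotoxic T"] / percent["Regulatory T"]

print(total, tumor_total, pdl1_positive_fraction, cd8_to_treg)
```

Note that under first-match assignment the order of the definitions matters: a CD3+ FOXP3+ CD8− cell satisfies both the Helper T and Regulatory T expressions as written, so Regulatory T would need to be evaluated first (or Helper T restricted to FOXP3−) for the counts to come out as listed.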

Simplified

A 6-marker panel produces ~7 meaningful phenotypes capturing the tumor-immune landscape: PD-L1+ and PD-L1− tumor cells, cytotoxic and helper T cells, regulatory T cells, B cells, and a residual "Other" category. The proportions of these phenotypes, especially the balance between effector and regulatory immune cells, inform prediction of immunotherapy response.

Connected Terms

Scattergram (category related)
Gates (category related)
Cutoffs (category related)
Spatial Phenotyping (related)
Marker-Positive Cell Classification (related)
NeuN Neuronal Marker (related)
Phenotype Interactions (related)
Multiplex Immunofluorescence (related)
