ScientiaLux
Analysis Tool

Phenotypes

Defined cell populations by marker expression patterns

Definition
A phenotype is a cell's biological identity defined by its combination of marker expression — CD3+CD8+PD-1+ is a different phenotype than CD3+CD8+PD-1−, even though both are cytotoxic T cells. The Phenotypes engine combines gating results from multiple markers into composite cell type definitions, translating raw marker positivity patterns into biologically meaningful categories. Every cell in the dataset receives a phenotype label that captures its full marker expression profile.
  • Composite Identity: Multi-marker cell type definition
  • Hierarchical Organization: Phenotype families and subtypes
  • Gate-Based Construction: Built from individual marker gates
  • Spatial Analysis Input: Cell types for spatial phenotyping

How It Works

The Phenotypes engine combines multiple gating results into cell type assignments:

  1. Define phenotypes — Each phenotype has a name and a Boolean expression using gate results. Example: "Cytotoxic T" = CD3_positive AND CD8_positive AND CK_negative.
  2. Evaluate — For each cell, all gate results are known (positive/negative per marker). The engine evaluates each phenotype definition against the cell's gate profile.
  3. Assign — The first matching phenotype (in priority order) is assigned. If no defined phenotype matches, the cell is labeled "Unclassified" or assigned to a default category.
  4. Summarize — Population statistics are computed: total count and percentage for each phenotype in each ROI. These summaries are the primary quantitative output for many clinical applications.
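
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual implementation: the `PHENOTYPES` table, the `assign_phenotype` and `summarize` helpers, the "Tumor" entry, and the sample cells are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical phenotype table in priority order: (name, Boolean expression).
# A gate profile maps marker name -> True (positive) / False (negative).
PHENOTYPES = [
    ("Cytotoxic T", lambda g: g["CD3"] and g["CD8"] and not g["CK"]),
    ("Tumor", lambda g: g["CK"]),  # illustrative second phenotype
]

def assign_phenotype(gates, phenotypes=PHENOTYPES, default="Unclassified"):
    """Return the first matching phenotype in priority order, else the default."""
    for name, matches in phenotypes:
        if matches(gates):
            return name
    return default

def summarize(cells):
    """Return {roi: {phenotype: (count, percent)}} for (roi, gates) pairs."""
    counts = {}
    for roi, gates in cells:
        counts.setdefault(roi, Counter())[assign_phenotype(gates)] += 1
    return {
        roi: {name: (n, 100.0 * n / sum(c.values())) for name, n in c.items()}
        for roi, c in counts.items()
    }

# Four made-up cells in one ROI.
cells = [
    ("ROI-1", {"CD3": True, "CD8": True, "CK": False}),
    ("ROI-1", {"CD3": False, "CD8": False, "CK": True}),
    ("ROI-1", {"CD3": False, "CD8": False, "CK": True}),
    ("ROI-1", {"CD3": True, "CD8": False, "CK": False}),
]
summary = summarize(cells)
print(summary["ROI-1"])
```

The last cell (CD3+CD8−CK−) matches neither definition, so it falls through to "Unclassified". This is why the order and completeness of the phenotype table matter.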
Simplified

Phenotypes combine individual marker gates (CD3 positive/negative, CD8 positive/negative, etc.) into composite cell type labels. Each cell's full marker profile determines its phenotype — CD3+CD8+ is a cytotoxic T cell, CK+Ki67+ is a proliferating tumor cell. Population percentages and counts are computed per phenotype per region.

Science Behind It

Classification as partition: MIT's Statistical Models chapter frames classification as partitioning feature space into decision regions. Each phenotype definition marks out a region in the multi-dimensional marker space: the set of all cells with the specified marker combination. Individual definitions may overlap or leave gaps, but priority-ordered assignment with an "Unclassified" default turns them into a true partition, so every cell belongs to exactly one phenotype. Well-designed phenotype panels produce partitions that align with biologically distinct cell populations.
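
The partition property can be checked by enumerating every possible gate profile. The sketch below uses a hypothetical three-marker subset and illustrative definitions (`MARKERS`, `DEFINITIONS`, and `label` are names invented for this example). It shows that raw definitions can overlap, while first-match assignment with a default still gives each profile exactly one label.

```python
from itertools import product

MARKERS = ("CD3", "CD8", "FOXP3")  # illustrative subset of a panel

# Illustrative definitions in priority order; they overlap on purpose.
DEFINITIONS = [
    ("Regulatory T", lambda g: g["CD3"] and g["FOXP3"]),
    ("Cytotoxic T", lambda g: g["CD3"] and g["CD8"]),
    ("Helper T", lambda g: g["CD3"] and not g["CD8"]),
]

def label(gates):
    """First matching definition wins; unmatched profiles get a default."""
    for name, matches in DEFINITIONS:
        if matches(gates):
            return name
    return "Unclassified"

# All 2^3 = 8 positive/negative profiles over the three markers.
profiles = [dict(zip(MARKERS, bits))
            for bits in product((False, True), repeat=len(MARKERS))]

# Profiles that satisfy more than one raw definition (overlapping regions).
overlapping = [g for g in profiles if sum(m(g) for _, m in DEFINITIONS) > 1]

# After first-match assignment, every profile carries exactly one label.
labels = [label(g) for g in profiles]
print(len(overlapping), labels.count("Unclassified"))
```

Here the two FOXP3+ T-cell profiles match two definitions each, and priority order resolves both to "Regulatory T".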

The combinatorial explosion: With n binary markers, there are 2ⁿ possible marker combinations. A 6-marker panel has 64 possible phenotypes; a 10-marker panel has 1,024. In practice, most combinations are biologically implausible (for example, a cell is not expected to be both CD3+ and CD20+ in normal biology) or extremely rare. Effective phenotype design focuses on the 5-15 biologically meaningful combinations rather than trying to characterize all possible ones.
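
The counts above are easy to reproduce. The following sketch enumerates all profiles for a six-marker panel and applies a single exclusion rule. The marker list and the CD3+/CD20+ rule are used purely for illustration; a real panel would need its own set of plausibility rules.

```python
from itertools import product

def n_combinations(n_markers):
    """Number of positive/negative combinations for n binary markers."""
    return 2 ** n_markers

markers = ["CD3", "CD8", "CD20", "FOXP3", "PD-L1", "CK"]
profiles = list(product((False, True), repeat=len(markers)))

# Illustrative exclusion rule: drop profiles that are both CD3+ and CD20+.
i_cd3, i_cd20 = markers.index("CD3"), markers.index("CD20")
plausible = [p for p in profiles if not (p[i_cd3] and p[i_cd20])]

print(n_combinations(6), n_combinations(10), len(plausible))
```

A single rule removes only a quarter of the 64 profiles, so narrowing down to a handful of meaningful phenotypes relies on biological knowledge more than on mechanical filtering.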

Statistical methods for threshold optimization (Dilbilir): The quality of phenotype assignments depends entirely on the quality of individual marker gates. Each gate threshold determines the boundary between positive and negative for one marker. Dilbilir's statistical framework emphasizes that threshold optimization should minimize classification error across the population — not just separate the most obvious positive from the most obvious negative cells. Cells near the threshold (the "dim positive" or "equivocal" population) are the most error-prone and most important to get right.
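
To make "minimize classification error across the population" concrete, here is a minimal sketch of a brute-force threshold search. It is not Dilbilir's framework. It assumes a reference label is available for each cell (for example from manual annotation), and the function name `best_threshold` and the sample values are invented for the example.

```python
def best_threshold(intensities, is_positive):
    """Return (threshold, errors) with the fewest misclassified cells.

    A cell is called positive when its intensity is >= threshold.
    """
    values = sorted(set(intensities))
    # Candidates: below all cells, midpoints between neighbors, above all cells.
    candidates = (
        [values[0]]
        + [(a + b) / 2 for a, b in zip(values, values[1:])]
        + [values[-1] + 1]
    )
    best = None
    for t in candidates:
        errors = sum((x >= t) != truth for x, truth in zip(intensities, is_positive))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best

# Made-up marker intensities with reference labels; 0.45 and 0.5 are
# the equivocal cells near the boundary.
intensities = [0.1, 0.2, 0.3, 0.45, 0.5, 0.55, 0.8, 0.9]
is_positive = [False, False, False, True, False, True, True, True]

threshold, errors = best_threshold(intensities, is_positive)
print(threshold, errors)
```

In this toy data the clearly negative and clearly positive cells are classified correctly by any reasonable cut, and the one unavoidable error comes from the dim cells near the threshold.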

Simplified

Phenotypes partition cells into biologically meaningful groups based on marker combinations. A 6-marker panel could theoretically produce 64 phenotypes, but biology restricts the meaningful ones to 5-15 cell types. The accuracy depends on how well each individual marker gate separates positive from negative — the cells near the threshold are where most classification errors occur.

Practical Example

Phenotyping a 6-plex immuno-oncology panel (CD3, CD8, CD20, FOXP3, PD-L1, CK):

  • Tumor PD-L1+: CK+ AND PD-L1+ (16% of all cells)
  • Tumor PD-L1−: CK+ AND PD-L1− (34%)
  • Cytotoxic T: CD3+ AND CD8+ AND CK− (8%)
  • Helper T: CD3+ AND CD8− AND FOXP3− AND CK− (6%)
  • Regulatory T: CD3+ AND FOXP3+ AND CK− (2%)
  • B cell: CD20+ AND CD3− AND CK− (4%)
  • Other: remaining cells (30%)

These 7 phenotypes capture the major cell populations relevant to immunotherapy response prediction. The roughly 1:2 split of PD-L1+ to PD-L1− tumor cells (16% vs. 34%, so about a third of tumor cells are PD-L1+) and the 4:1 ratio of cytotoxic to regulatory T cells are directly clinically informative.
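
The derived quantities can be recomputed from the listed percentages. This is a small sketch: the dictionary keys are ad hoc labels (with an ASCII "-" standing in for "−"), and the variable names are chosen for the example.

```python
# Percent of all cells per phenotype, copied from the list above.
percent = {
    "Tumor PD-L1+": 16,
    "Tumor PD-L1-": 34,
    "Cytotoxic T": 8,
    "Helper T": 6,
    "Regulatory T": 2,
    "B cell": 4,
    "Other": 30,
}

total = sum(percent.values())  # should account for every cell
tumor = percent["Tumor PD-L1+"] + percent["Tumor PD-L1-"]
pdl1_share = percent["Tumor PD-L1+"] / tumor  # fraction of tumor cells that are PD-L1+
cd8_treg_ratio = percent["Cytotoxic T"] / percent["Regulatory T"]

print(total, tumor, pdl1_share, cd8_treg_ratio)
```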

Simplified

A 6-marker panel yields 7 phenotypes that capture the tumor-immune landscape: six named populations (PD-L1+ and PD-L1− tumor cells, cytotoxic, helper, and regulatory T cells, and B cells) plus an "Other" category. The proportions of these phenotypes, especially the balance between effector and regulatory immune cells, help inform prediction of immunotherapy response.
