Columbia Technology Ventures

Algorithm for analyzing high-dimensional single-cell data

This technology is a method for mapping, graphing, and analyzing high-dimensional single-cell data based on multiple parameters measured for each cell.

Unmet Need: Method for simplifying high-dimensional data without loss of structural information

Single-cell assays, such as flow cytometry and gene expression profiling, reveal many details about diseased and normal tissue and produce high-dimensional biological data sets. Data in more than three dimensions, however, are difficult to visualize and interpret. Current approaches reduce dimensionality by simplifying the data or grouping them into clusters, which discards important information about the structure of the data.

The Technology: Algorithm to reduce data dimensionality while maintaining structural information

This technology uses an algorithm referred to as viSNE to map, graph, and analyze high-dimensional single-cell data based on multiple parameters measured for each cell. The method projects the high-dimensional data onto a low-dimensional map using a nonlinear dimensionality reduction algorithm, such as t-Distributed Stochastic Neighbor Embedding (t-SNE). The resulting map can then be used to analyze developmental systems in healthy and diseased cells and to identify rare subpopulations of cells within a sample. The technique applies to a variety of cell parameters, including molecular measurements, such as gene or protein epitope expression, and morphological features extracted by techniques such as microscopy.
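The projection step described above can be illustrated with a minimal, simplified sketch of a t-SNE-style embedding in plain Python. It is not the viSNE implementation. It assumes a single fixed Gaussian bandwidth (sigma) instead of the per-point perplexity calibration used by full t-SNE, and it uses plain gradient descent with step halving rather than momentum or early exaggeration. The function names (make_two_populations, embed, and the helpers) and the synthetic "cells" are hypothetical and exist only for illustration.

```python
import math
import random

def pairwise_sq_dists(points):
    """Squared Euclidean distance between every pair of points."""
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]

def high_dim_affinities(X, sigma=1.0):
    """Symmetric Gaussian affinities P between cells in the original space.
    Assumption: one fixed sigma for all cells (full t-SNE tunes it per cell)."""
    n = len(X)
    d = pairwise_sq_dists(X)
    cond = []
    for i in range(n):
        row = [math.exp(-d[i][j] / (2 * sigma ** 2)) if j != i else 0.0
               for j in range(n)]
        total = sum(row)
        cond.append([v / total for v in row])
    # Symmetrize and normalize so that all entries of P sum to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def low_dim_affinities(Y):
    """Student-t affinities Q on the map, plus the unnormalized weights W."""
    n = len(Y)
    d = pairwise_sq_dists(Y)
    W = [[1.0 / (1.0 + d[i][j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    total = sum(map(sum, W))
    return [[w / total for w in row] for row in W], W

def kl_divergence(P, Q):
    """KL(P || Q), the cost that t-SNE minimizes."""
    return sum(p * math.log(p / q)
               for prow, qrow in zip(P, Q)
               for p, q in zip(prow, qrow) if p > 0.0)

def embed(X, n_iter=200, lr=10.0, seed=0):
    """Project X onto a 2-D map; returns the map and the KL cost history."""
    rng = random.Random(seed)
    n = len(X)
    P = high_dim_affinities(X)
    Y = [[rng.gauss(0, 1e-2), rng.gauss(0, 1e-2)] for _ in range(n)]
    Q, W = low_dim_affinities(Y)
    kl = kl_divergence(P, Q)
    history = [kl]
    for _ in range(n_iter):
        # Gradient of the KL cost with respect to each map coordinate.
        grad = [[4 * sum((P[i][j] - Q[i][j]) * W[i][j] * (Y[i][k] - Y[j][k])
                         for j in range(n)) for k in range(2)]
                for i in range(n)]
        trial = [[Y[i][k] - lr * grad[i][k] for k in range(2)]
                 for i in range(n)]
        Q_trial, W_trial = low_dim_affinities(trial)
        kl_trial = kl_divergence(P, Q_trial)
        if kl_trial < kl:
            Y, Q, W, kl = trial, Q_trial, W_trial, kl_trial  # accept the step
        else:
            lr *= 0.5  # reject the step and retry with a smaller one
        history.append(kl)
    return Y, history

def make_two_populations(n_per_group=10, n_params=5, seed=1):
    """Synthetic data: two cell populations measured on n_params markers."""
    rng = random.Random(seed)
    return [[rng.gauss(center, 0.5) for _ in range(n_params)]
            for center in (0.0, 3.0) for _ in range(n_per_group)]

cells = make_two_populations()
Y, history = embed(cells)
print(f"KL before: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Each row of Y is the two-dimensional map position of one cell, so the map can be drawn as a scatter plot and colored by any measured parameter. For real cytometry or expression data, an established t-SNE implementation with perplexity calibration would be the more appropriate choice.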

This technology has been validated using samples of cancer cells.

Applications:

  • Visualization of flow cytometry, gene expression, and other high-dimensional single-cell data
  • Diagnosis of minimal residual disease
  • Identification of rare cell subsets within populations
  • Tissue heterogeneity sampling (tumor progression tracking)

Advantages:

  • Retains high-dimensional data geometry while reducing dimensionality
  • Provides a single map for visualizing and interpreting high-dimensional data
  • Identifies multivariate relationships within data that are often missed with pairwise cytometry plots
  • Distinguishes cell populations across several existing technologies and biomarkers

Lead Inventor:

Dana Pe’er, Ph.D.

Patent Information:

Patent Pending (US 20180046755)
