Columbia Technology Ventures

Genomic data represented as barcodes to easily identify evolutionary patterns

Phylogenetic trees and networks are widely used to organize genomic data, but they cannot capture complex evolutionary patterns. This technology uses a mathematical technique called persistent homology to represent multidimensional evolutionary models as barcodes, which can reveal complex patterns that phylogenetic trees would miss. It can detect both vertical and non-vertical evolution, making it a comprehensive tool for interpreting the rapidly growing genomic data sets produced by next-generation sequencing technology.

Find evolutionary patterns that are invisible to network and tree representations with this fast and robust multidimensional technique.

Complex data sets require an informatics tool that can detect complex patterns. This method can deduce new evolutionary patterns by accurately representing data from non-vertical genetic exchange events such as meiosis, chromosomal crossover, gene conversion, lateral gene transfer, recombination, and reassortment, as well as complex exchanges between two or more organisms. Persistent homology models genomes with higher-dimensional complexes than trees or networks allow, and it determines global properties of the data from well-defined topological properties of the resulting n-dimensional complex. Evolutionary properties identified with this method are displayed as barcodes, an attractive visual representation for this type of data.
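
To make the barcode idea concrete, the following is a minimal sketch and not the inventors' implementation. It assumes aligned sequences compared by Hamming distance, a Vietoris-Rips filtration built from those distances, and homology computed with mod-2 coefficients; the function names `hamming` and `rips_barcodes` and the toy binary sequences are hypothetical and chosen only for illustration. Zero-dimensional bars track clusters merging as the distance scale grows (the tree-like, vertical signal), while one-dimensional bars mark loops that a tree cannot represent.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def rips_barcodes(seqs):
    """Return {0: [...], 1: [...]} of (birth, death) bars for H0 and H1."""
    n = len(seqs)
    d = [[hamming(a, b) for b in seqs] for a in seqs]
    # Each simplex is (scale at which it appears, dimension, vertex tuple).
    simplices = [(0, 0, (i,)) for i in range(n)]
    simplices += [(d[i][j], 1, (i, j)) for i, j in combinations(range(n), 2)]
    simplices += [(max(d[i][j], d[i][k], d[j][k]), 2, (i, j, k))
                  for i, j, k in combinations(range(n), 3)]
    simplices.sort()  # faces always sort before the simplices they bound
    index = {verts: pos for pos, (_, _, verts) in enumerate(simplices)}

    reduced = {}   # pivot row -> reduced boundary column (mod 2)
    creators = []  # simplices whose boundary reduces to zero (they open a bar)
    bars = {0: [], 1: []}
    for pos, (scale, dim, verts) in enumerate(simplices):
        # Boundary of a simplex: the faces obtained by dropping one vertex.
        boundary = ({index[verts[:k] + verts[k + 1:]] for k in range(dim + 1)}
                    if dim else set())
        # Standard column reduction: cancel shared pivots by mod-2 addition.
        while boundary and max(boundary) in reduced:
            boundary = boundary ^ reduced[max(boundary)]
        if boundary:
            pivot = max(boundary)
            reduced[pivot] = boundary
            birth = simplices[pivot][0]
            if scale > birth:  # skip zero-length bars
                bars[dim - 1].append((birth, scale))
        else:
            creators.append(pos)
    # Classes that are never closed persist to infinity.
    for pos in creators:
        scale, dim, _ = simplices[pos]
        if dim <= 1 and pos not in reduced:
            bars[dim].append((scale, float("inf")))
    return bars

# Toy data (assumed): all four allele combinations at two sites vs. a chain of
# single mutations.
recombinant = ["00", "01", "10", "11"]
tree_like = ["000", "001", "011", "111"]
print(rips_barcodes(recombinant)[1])
print(rips_barcodes(tree_like)[1])
```

In this toy case the first set contains all four allele combinations at two sites, which no tree without recurrent mutation can explain, and the sketch reports a single one-dimensional bar for it; the second set, a simple chain of mutations, yields no one-dimensional bars. Real data sets would call for an optimized persistent homology library rather than this brute-force enumeration of simplices.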

Testing on a set of RNA viruses with distinct modes of genetic exchange demonstrated the technology's superior performance in identifying complex evolutionary patterns. Tests on influenza A, dengue, West Nile virus, and rabies showed that, for simple vertical genetic exchange, the method produces patterns equivalent to phylogenetic representations.

Lead Inventor:

Raul Rabadan, Ph.D.

Applications:

  • Trace new evolutionary properties of organisms, even across species.
  • Detect non-vertical genetic exchange (e.g., meiosis, reassortment).
  • Represent large-scale global genomic information.
  • Accurately predict evolutionary pathways.
  • Identify type of evolution.

Advantages:

  • Models vertical and non-vertical evolutionary patterns.
  • Provides information on rates of non-vertical genomic events.
  • Computationally efficient, even for complex data sets.
  • Mathematically rigorous and robust.
  • Makes no prior assumptions about species boundaries or about whether genetic exchange in the input data is vertical or non-vertical.
  • Visually appealing representation.

Patent information:

Patent Pending

Licensing Status:

Available for licensing and sponsored research support

Tech Ventures Reference: IR CU13117
