Columbia Technology Ventures

Pathogenicity Database for Identification of Disease-Causing Genetic Variations

To request an academic license to download and use this database, click "Express Licensing" above, create an account or log in, then return to this page and click "Express Licensing" again. For a commercial license, please contact techventures@columbia.edu.

This technology is a database of transcript-inferred pathogenicity (TraP) scores that evaluate a single nucleotide variant’s ability to cause disease by affecting splicing and transcription.

Unmet Need: Reliable means of correlating genetic mutations with pathogenicity

Understanding the genetic variants responsible for disease development and progression is important to developing targeted treatments and cures. However, leading methods for evaluating genetic substitutions, such as CADD and GERP, often exclude variants that do not change the amino acid sequence, such as synonymous and intronic variants. As a result, non-coding variants that can have deleterious effects on a transcript through the regulation of splicing or transcription are overlooked. There is therefore a need for a method that accurately identifies these variants and correlates them with pathogenicity to improve understanding of the genetic mechanisms behind disease.

The Technology: TraP scores predict the pathogenicity of synonymous and intronic variants

This technology is a database that contains a calculated transcript-inferred pathogenicity (TraP) score for every possible single nucleotide variant in human protein-coding genes. It analyzes the effects of single genetic variants on splicing and transcription, and thereby detects the introduction of new splice sites and changes in the exons expressed in cells. By using features that capture transcript effects and training on known synonymous pathogenic variants, TraP specifically targets indirect regulatory effects that cause disease. TraP also allows these sites to be included in gene discovery and diagnostic sequencing efforts, and scores have been pre-computed for ~3.9 billion genic substitutions with high accuracy. Consequently, TraP is a powerful tool for clinical DNA sequence interpretation, diagnostic assay development, and identification of risk factors for many prevalent and debilitating diseases.

TraP has been used to identify risk factors in diseases such as ALS, epilepsy, Parkinson’s disease, and schizophrenia.

Applications:

  • Identification of non-coding variants involved in disease
  • Identification of risk factors for disease development
  • Development of sequencing-based diagnostics
  • Enhanced in-utero screening for lethal and devastating diseases
  • Portable and handheld genomic analysis tools
  • Development of new therapies that target non-coding variants

Advantages:

  • High accuracy (92%)
  • High specificity (>97%)
  • Pre-computed for ~3.9 billion genetic substitutions
  • Able to identify synonymous and intronic genetic variations
  • Capable of scoring up to one hundred thousand variants in a single evaluation
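
Because a single evaluation is limited to one hundred thousand variants, a larger variant set would need to be split before scoring. The following is a minimal Python sketch of that splitting step only. The `Variant` tuple layout, the `batch_variants` helper, and the synthetic example variants are all assumptions made for illustration; they are not part of the TraP database or any Columbia tool, and no scoring is performed here.

```python
from typing import Iterator, List, Sequence, Tuple

# A variant as (chromosome, 1-based position, reference base, alternate base).
# This tuple layout is an assumption for illustration, not a TraP format.
Variant = Tuple[str, int, str, str]

# Per-evaluation limit quoted in the listing above.
MAX_VARIANTS_PER_EVALUATION = 100_000


def batch_variants(
    variants: Sequence[Variant],
    batch_size: int = MAX_VARIANTS_PER_EVALUATION,
) -> Iterator[List[Variant]]:
    """Yield consecutive batches of at most batch_size variants."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(variants), batch_size):
        yield list(variants[start:start + batch_size])


# Synthetic placeholder variants, used only to show the resulting batch sizes.
example = [("1", pos, "A", "G") for pos in range(1, 250_001)]
batch_sizes = [len(batch) for batch in batch_variants(example)]
print(batch_sizes)  # prints [100000, 100000, 50000]
```

Each batch could then be submitted as its own evaluation, with the order of variants preserved across batches.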

Lead Inventor:

Sahar Gelfman, Ph.D.
