Data mining algorithms and gene sequencing technologies allow scientists to build and analyze large data sets to identify the groups of genes associated with a given disease. However, existing algorithms cannot map specific genes to each of the phenotypic or biological anomalies that mark a given disease. This technology is an algorithm that identifies metagenes (linear combinations of individual genes) that are biomarkers for the specific underlying biological mechanisms of a disease. The algorithm is a powerful tool for mining large, publicly available gene expression data sets and has been used to identify breast cancer prognostic biomarkers. This information can in turn be used to improve personalized diagnosis, prognosis, and treatment for cancer patients.
This technology consists of an iterative algorithm that starts at a seed gene and converges on a metagene, which serves as a reliable biomarker for a single biological mechanism that is characteristic of a larger disease. While existing algorithms can identify sets of genes that represent a mixture of the phenotypes present in a given disease, this technology isolates the genes that represent one specific disease phenotype (e.g., cell transdifferentiation or the presence of an amplicon). As such, this technology may more precisely identify genetic biomarkers of anomalous biological mechanisms. The metagenes derived by this process may be used to develop more accurate tools for cancer diagnosis and treatment. Additionally, this technology may further clarify the biological mechanisms underlying such diseases and enable more efficient discovery of relevant therapeutic targets.
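The seed-to-metagene iteration described above can be illustrated with a minimal Python sketch. It is not the patented method: this summary does not specify the association measure, the weighting scheme, or the stopping rule, so the sketch assumes Pearson correlation as the association measure, weights equal to the positive correlations raised to a hypothetical `exponent`, and convergence when the weights stop changing by more than `tol`. The function name `find_metagene` and all parameters are illustrative.

```python
import numpy as np


def find_metagene(expr, seed_index, exponent=5.0, tol=1e-6, max_iter=100):
    """Iterate from a seed gene toward a metagene (illustrative sketch).

    expr: array of shape (genes, samples) holding expression values.
    Returns (weights, metagene), where metagene = weights @ expr is a
    linear combination of the individual genes.
    """
    expr = np.asarray(expr, dtype=float)

    # Center and scale each gene so that a dot product with a unit-norm,
    # centered vector equals the Pearson correlation.
    centered = expr - expr.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # constant genes end up with zero correlation
    z = centered / norms

    # Start from the seed gene's own expression profile.
    metagene = expr[seed_index].copy()
    weights = np.zeros(expr.shape[0])

    for _ in range(max_iter):
        m = metagene - metagene.mean()
        m_norm = np.linalg.norm(m)
        if m_norm == 0:
            break  # a constant profile carries no information

        # Assumed association measure: Pearson correlation with the metagene.
        corr = z @ (m / m_norm)

        # Assumed weighting: keep positive associations and sharpen them.
        new_weights = np.clip(corr, 0, None) ** exponent
        total = new_weights.sum()
        if total == 0:
            break
        new_weights /= total

        # The updated metagene is a weighted average of all genes.
        change = np.max(np.abs(new_weights - weights))
        weights = new_weights
        metagene = weights @ expr
        if change < tol:
            break

    return weights, metagene
```

Calling `find_metagene(expr, seed_index=i)` on a genes-by-samples matrix returns one weight per gene and the resulting metagene profile across samples. Under these assumptions, genes that are co-expressed with the seed receive most of the weight, while unrelated genes are driven toward zero.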
The algorithm was validated using nearly 2,000 breast cancer samples, and its predictions were shown to outperform those of currently available commercial breast cancer genetic kits.
Patent Pending (US 20160312289)
Tech Ventures Reference: IR CU14254