Biomedical researchers compare DNA samples by next-generation sequencing to identify individual point mutations and small insertions and deletions (indels) in DNA and RNA sequences. In particular, researchers would like to correlate changes in the frequency of mutations with phenotypic changes. To obtain useful sequencing data, it is therefore essential to understand the frequency at which particular alleles are represented in the cell population. However, several technical difficulties stand in the way, including inhomogeneous samples, the existence of subclonal populations, and errors introduced by the sequencing technology itself. A new method is therefore needed to allow researchers to ascertain the frequency of alleles across a range of samples.
This technology is an algorithm (Statistical frequency Analysis of Sequence Data, SAVI) that can be used to estimate the frequency of particular alleles in a sample and to compare those frequencies across different samples. Sequencing data are compared to a reference genome to identify potential variants at a given position. The algorithm's statistical framework also draws on prior data, such as the frequency of allelic differences, and can account for the quality of the sequencing reaction and the purity of the sample. This information is used to predict the validity of allelic differences and to enhance sensitivity. The technology naturally accommodates multiple samples and multiple reference genomes. It has been successfully implemented to identify mutations in Hairy Cell Leukemia and Large B-Cell Lymphoma patients versus healthy controls.
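The idea of combining read counts, a prior, sequencing quality, and sample purity can be illustrated with a minimal Python sketch. This is not the SAVI implementation: the binomial read model, the uniform default prior, the way purity scales the allele frequency, and all function and parameter names (`allele_frequency_posterior`, `probability_of_increase`, `error_rate`, `purity`) are assumptions made for illustration only.

```python
import math


def allele_frequency_posterior(alt_reads, total_reads, error_rate=0.01,
                               purity=1.0, prior=None, grid_size=101):
    """Posterior over the variant allele frequency f on a uniform grid in [0, 1].

    Assumed model (illustrative, not the published method): each read shows the
    variant allele with probability q = g*(1 - e) + (1 - g)*e, where
    g = purity * f and e is the per-base sequencing error rate.
    """
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must lie strictly between 0 and 0.5")
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must lie in (0, 1]")
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    if prior is None:
        # Assumption: flat prior; an empirical prior could be passed instead.
        prior = [1.0 / grid_size] * grid_size
    log_post = []
    for f, p in zip(grid, prior):
        g = purity * f
        q = g * (1.0 - error_rate) + (1.0 - g) * error_rate
        # Binomial log-likelihood without the constant binomial coefficient.
        log_lik = (alt_reads * math.log(q)
                   + (total_reads - alt_reads) * math.log(1.0 - q))
        log_post.append(log_lik + math.log(p) if p > 0 else float("-inf"))
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def probability_of_increase(post_a, post_b):
    """P(f_b > f_a) for two independent posteriors on the same grid."""
    prob = 0.0
    cumulative_a = 0.0  # P(f_a < current grid point)
    for pa, pb in zip(post_a, post_b):
        prob += pb * cumulative_a
        cumulative_a += pa
    return prob


# Hypothetical counts: 1 of 100 reads carry the variant in a control sample,
# 30 of 100 reads carry it in a patient sample.
grid, control_post = allele_frequency_posterior(1, 100)
_, patient_post = allele_frequency_posterior(30, 100)
patient_mode = grid[patient_post.index(max(patient_post))]
p_up = probability_of_increase(control_post, patient_post)
```

Working with a full posterior rather than a single point estimate is what lets a comparison such as `probability_of_increase` express how confident one can be that an allele is more frequent in one sample than in another, even at low read depth.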
Patent Pending
Tech Ventures Reference: IR 2930