Columbia Technology Ventures

Systems for Modeling and Recognition of Speech

This technology is a series of algorithms that transform sound into a two-dimensional time-frequency surface approximating the peaks of the original spectro-temporal pattern.

Unmet Need: Mathematical characterization of complex sounds for computational speech recognition

As computers and computational devices become increasingly prevalent, the need for voice recognition and control grows. For computational speech recognition, the computer must accurately distinguish human speech from background noise and recognize individual words. Given the high variability in both background noise and speech patterns, a single technique for speech recognition is insufficient. The ability to mathematically characterize complex sounds, so that individual words can be differentiated from background noise, is necessary for speech recognition and may also be useful for the design of computer-synthesized sounds.

The Technology: Multi-algorithmic approach to transform complex sounds into computable surfaces

The invention comprises a series of algorithms that use Frequency Domain Linear Prediction (FDLP) and Perceptual Linear Prediction (PLP) to transform complex sounds into a two-dimensional surface over time and frequency. The result is a smooth, continuous spectral profile at each time step. The technology can be used for numerous applications, including background noise removal and computational speech recognition.
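To make the FDLP step more concrete, the following is a minimal sketch of the general technique as it is commonly described, not the inventors' implementation. It takes the discrete cosine transform of a signal, fits an all-pole (linear prediction) model to the DCT coefficients, and reads the model's response as a smooth temporal envelope. The function names, the model order of 8, the small regularization term, and the synthetic test signal are all assumptions made for illustration. PLP and the per-band processing that would build the full time-frequency surface are not shown.

```python
import math


def dct_ii(x):
    """Unnormalized type-II DCT, written as a direct O(N^2) sum for clarity."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]


def autocorrelation(seq, max_lag):
    """Biased autocorrelation of seq for lags 0..max_lag."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order=8, n_points=None):
    """All-pole approximation of the temporal envelope of x (illustrative)."""
    n_points = n_points or len(x)
    coeffs = dct_ii(x)
    r = autocorrelation(coeffs, order)
    r[0] *= 1.0 + 1e-6  # assumed small regularization for numerical stability
    a, err = levinson_durbin(r, order)
    envelope = []
    for m in range(n_points):
        w = math.pi * m / n_points  # 0..pi maps onto the start..end of x
        re = sum(a[k] * math.cos(w * k) for k in range(order + 1))
        im = -sum(a[k] * math.sin(w * k) for k in range(order + 1))
        envelope.append(err / (re * re + im * im))
    return envelope


# Hypothetical test signal: a tone burst centered at sample 96 over a faint background.
N = 256
signal = [(0.05 + math.exp(-0.5 * ((t - 96) / 12.0) ** 2))
          * math.sin(2 * math.pi * 0.13 * t) for t in range(N)]
envelope = fdlp_envelope(signal, order=8)
peak_index = max(range(len(envelope)), key=envelope.__getitem__)
print("envelope peaks near sample", peak_index)
```

Because the model is fit to DCT coefficients rather than to time samples, its poles track peaks in time instead of peaks in frequency. Repeating the fit on separate frequency bands would yield one smooth envelope per band, which is one plausible way to arrive at a surface over time and frequency.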

Applications:

  • Speech recognition
  • Background noise removal
  • Auditory search capability
  • Automated dictation
  • Generation of complex textural sounds for games or virtual reality systems

Advantages:

  • Low word recognition error rate (<3%)
  • Error rate at least 20% lower than comparable technologies
  • Algorithms have demonstrated the ability to process complex sounds
  • Processing can be used for efficient encoding and synthesis of textural sounds

Lead Inventor:

Daniel P.W. Ellis, Ph.D.
