Columbia Technology Ventures

Pitch-synchronous method for speech encoding and decoding

This technology is a pitch-synchronous parameterization method that allows for more natural-sounding speech in voice transformation and recognition applications.

Unmet Need: Speech parameterization method that accounts for pitch and timbre

Speech parameterization converts an audio signal into a mathematical representation from which relevant information can be extracted. Current methods of speech parameterization, including mel-frequency cepstral coefficients and linear predictive coding, are insensitive to variations in tone and pitch and often result in inaccurate or robotic-sounding speech. New parameterization methods are needed to generate more realistic-sounding speech, especially in languages where pitch contour is essential to meaning.

The Technology: Pitch-synchronous speech parameterization method for improved speech identification

This technology is a parameterization method that separates timbre information from pitch information to achieve a pitch-synchronous representation of speech. It can be used for both automatic speech recognition and speech synthesis. The method accounts for characteristic elements of human speech, such as pitch, when recognizing and transforming speech. It can also generate different tones, which is particularly important for tonal languages such as Mandarin.
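To make the idea of a pitch-synchronous representation concrete, the following is a minimal illustrative sketch, not the inventor's method. It assumes that pitch marks (the sample indices where each pitch period begins) are already known, and it uses a hypothetical helper named `pitch_synchronous_frames`. Each frame spans exactly one pitch period, so the pitch (taken from the period length) and a simple timbre descriptor (the amplitude spectrum of that period) are obtained separately. Fixed-length framing, by contrast, mixes the two.

```python
import cmath
import math


def pitch_synchronous_frames(signal, pitch_marks, sample_rate):
    """Split `signal` into one frame per pitch period (hypothetical helper).

    `pitch_marks` is assumed to be a sorted list of sample indices marking
    the start of each pitch period; finding them is outside this sketch.
    Returns a list of (f0_hz, amplitude_spectrum) pairs, one per period.
    """
    frames = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        period = signal[start:end]
        n = len(period)
        if n == 0:
            continue
        # Pitch information: one period of n samples corresponds to f0.
        f0_hz = sample_rate / n
        # Timbre information: a DFT over exactly one period, so bin k
        # falls on the k-th harmonic of f0 regardless of the pitch.
        spectrum = [
            abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(period))) / n
            for k in range(n // 2 + 1)
        ]
        frames.append((f0_hz, spectrum))
    return frames


# Synthetic voiced sound: a 100 Hz fundamental plus a weaker second harmonic.
sample_rate = 8000
period_len = 80  # 8000 / 80 = 100 Hz
signal = [
    math.sin(2 * math.pi * i / period_len)
    + 0.5 * math.sin(4 * math.pi * i / period_len)
    for i in range(4 * period_len)
]
pitch_marks = [0, 80, 160, 240, 320]  # assumed known for this example

frames = pitch_synchronous_frames(signal, pitch_marks, sample_rate)
f0_hz, spectrum = frames[0]
```

In this toy example every frame reports the same pitch and the same harmonic amplitudes. Changing `period_len` would change the reported pitch while leaving the harmonic amplitudes unchanged, which is the kind of pitch and timbre separation the description refers to.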

Applications:

  • Speech recognition (speech-to-text)
  • Speech synthesis (text-to-speech)
  • Speech coding
  • Generation of natural-sounding speech
  • Robotics

Advantages:

  • Compatible with tonal languages
  • Ability to synthesize natural-sounding speech
  • Can be applied to any speech database suitable for traditional parameterization

Lead Inventor:

C. Julian Chen, Ph.D.
