This technology is a pitch-synchronous parameterization method that allows for more natural-sounding speech in voice transformation and recognition applications.
Speech parameterization converts an audio signal into a mathematical representation from which relevant information can be extracted. Current parameterization methods, including mel-frequency cepstral coefficients (MFCCs) and linear predictive coding (LPC), are insensitive to variations in tone and pitch and often result in inaccurate or robotic-sounding speech. New parameterization methods are needed to generate more realistic-sounding speech, especially for languages in which pitch contour is essential to meaning.
This technology is a parameterization method that separates timbre information from pitch information to achieve a pitch-synchronous representation of speech. It can be used for both automatic speech recognition and speech synthesis. Because the method accounts for distinctive elements of human speech, such as pitch, when recognizing and transforming speech, it can generate different tones, which is particularly important for tonal languages such as Mandarin.
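The general idea of a pitch-synchronous representation can be illustrated with a minimal Python sketch. This is not the licensed method, whose details are not given in this summary. It assumes the pitch marks (the sample indices where each pitch period begins) are already known, uses a toy two-harmonic signal in place of real speech, and treats the harmonic amplitudes of each one-period frame as a stand-in for timbre. The names `make_period`, `pitch_synchronous_frames`, and `parameterize`, and the 8000 Hz sample rate, are hypothetical choices made for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example


def make_period(length, harmonic_amps):
    """Build one pitch period of a toy voiced sound from harmonic amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * n / length)
            for k, a in enumerate(harmonic_amps))
        for n in range(length)
    ]


def pitch_synchronous_frames(signal, marks):
    """Cut the signal into frames that each span exactly one pitch period."""
    return [signal[start:end] for start, end in zip(marks, marks[1:])]


def parameterize(frame, n_harmonics, sample_rate=SAMPLE_RATE):
    """Return (f0_hz, timbre) for a frame that spans one pitch period.

    Pitch comes from the frame length alone. Timbre is taken here to be
    the amplitudes of the first n_harmonics harmonics, read off a direct
    DFT of the frame (bin k of a one-period frame is harmonic k).
    """
    n = len(frame)
    f0_hz = sample_rate / n
    timbre = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        timbre.append(2 * abs(coeff) / n)
    return f0_hz, timbre


# Same harmonic recipe (timbre) at two different pitches: 100 Hz, then 200 Hz.
amps = [1.0, 0.5]
signal = make_period(80, amps) + make_period(40, amps)
marks = [0, 80, 120]  # assumed known; real speech would need pitch-mark detection

params = [parameterize(f, 2) for f in pitch_synchronous_frames(signal, marks)]
for f0_hz, timbre in params:
    print(round(f0_hz, 1), [round(a, 3) for a in timbre])
```

In this toy case the two frames yield different pitch values but the same harmonic amplitudes, which is the kind of separation between pitch and timbre that the description refers to. With pitch held as its own parameter, a synthesis stage could in principle change the tone by changing the period length while reusing the timbre parameters.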
IR CU16133
Licensing Contact: Dovina Qu