This technology is a pitch-synchronous signal processing method that captures the speech information most relevant to artificial intelligence applications.
Many artificial intelligence systems take raw speech signals as input, which must be converted into text before a neural network can process them. Traditional signal processing methods represent only some speech features, and only crudely, losing valuable properties along the way. Newer approaches feed raw signal waveforms directly into large neural networks, but these methods can be computationally prohibitive and are sensitive to noise and irregularities in the signal.
This technology is a pitch-synchronous speech processing method for artificial intelligence that separates waveforms into pitch periods and timbre vectors, which more closely mimic human hearing patterns. These serve as more complete and accurate representations of speech than those produced by traditional methods, while being more concise and less noisy than raw speech signals.
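The general idea of pitch-synchronous analysis can be illustrated with a minimal Python sketch. This is not the licensed method, whose details are not given here. It assumes a synthetic signal with constant pitch, uses plain autocorrelation as a stand-in pitch estimator, and uses per-period harmonic magnitudes as a rough stand-in for a timbre vector. All function names and parameters are hypothetical.

```python
import cmath
import math


def estimate_period(signal, min_lag, max_lag):
    """Return the lag (in samples) with the highest raw autocorrelation.

    Assumption: the pitch is constant over the whole signal.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def split_periods(signal, period):
    """Cut the signal into consecutive frames of exactly one pitch period."""
    return [signal[s:s + period] for s in range(0, len(signal) - period + 1, period)]


def harmonic_magnitudes(frame, n_harmonics):
    """Amplitudes of the first n_harmonics of a one-period frame.

    Because the frame spans exactly one period, each harmonic falls on
    a DFT bin, so no window function is needed. Illustrative stand-in
    for a timbre vector only.
    """
    n = len(frame)
    mags = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(2 * abs(coeff) / n)
    return mags


# Synthetic "voiced" signal: 100 Hz fundamental plus a weaker second
# harmonic, sampled at 8 kHz for 0.1 s (hypothetical test values).
SAMPLE_RATE = 8000
signal = [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    for t in range(800)
]

period = estimate_period(signal, min_lag=40, max_lag=160)
frames = split_periods(signal, period)
vectors = [harmonic_magnitudes(frame, 3) for frame in frames]

print("period in samples:", period)
print("number of pitch periods:", len(frames))
print("first vector:", [round(m, 3) for m in vectors[0]])
```

Each pitch period is reduced to a handful of numbers rather than a full block of raw samples, which illustrates why a pitch-synchronous representation can be more concise than the raw waveform. Real speech has varying pitch and unvoiced segments, which this sketch does not handle.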
IR CU22076
Licensing Contact: Dovina Qu