Columbia Technology Ventures

Next-generation speech-enabled browsing: Speech synthesis and voice control for natural and accessible website navigation

This technology, "Hyper Speech Markup Language", integrates speech-enabled web content with existing visual pages. It improves web accessibility across uses ranging from hands-free mobile devices to assistive technology by extending voice commands beyond conventional speech-to-text and text-to-speech approaches. Speech- and audio-based user interfaces built with it invite users to speak about page content using everyday phrases and questions, rather than required keywords or exact link titles. The resulting voice-user interfaces (VUIs) complement visual webpages: they are generated dynamically for each page and tailor speech recognition and speech synthesis to that page's content. The audio component can be invisible to users who do not use voice commands, and web developers can choose the level of speech-enabled design that suits their needs.

Speech synthesis and speech control provide complementary, specifically tailored web content with little overhead

A speech recognition grammar is embedded in webpages through this technology's "hyper-speech" links. No automatic speech recognition (ASR) programming is required to build flexible, sophisticated websites with its audio and speech integration. It requires essentially no bandwidth, because binary code is not transferred between the browser and the server, and it is compatible with popular browsers, with no additional software or plugins required. The scheme represents a breakthrough in assistive technology and also adds an interactive dimension for small screens and hands-free applications. Speech-enabled website design with this technology can provide an equally vibrant experience for users with impaired vision or dexterity.
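To make the idea of a per-page grammar concrete, the following is a minimal sketch, not the technology's actual markup or matching method, neither of which is described here. It assumes each "hyper-speech" link can be modeled as a target plus a few spoken phrases, and that a recognized utterance is routed to the link sharing the most content words. The names `SpeechLink`, `PAGE_LINKS`, `STOP_WORDS`, and `best_link`, along with the example paths and phrases, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeechLink:
    """Hypothetical stand-in for a "hyper-speech" link: a target plus spoken phrases."""
    href: str
    phrases: List[str]


# Filler words ignored when comparing an utterance with a link's phrases (assumed list).
STOP_WORDS = {"the", "a", "an", "to", "me", "about", "is", "what", "are",
              "i", "of", "please", "show", "tell"}


def content_words(text: str) -> set:
    """Lowercase the text, drop punctuation, and return its non-filler words."""
    return set(re.findall(r"[a-z0-9']+", text.lower())) - STOP_WORDS


def best_link(utterance: str, links: List[SpeechLink]) -> Optional[SpeechLink]:
    """Return the link whose phrases share the most content words with the utterance."""
    spoken = content_words(utterance)
    best, best_score = None, 0
    for link in links:
        for phrase in link.phrases:
            score = len(spoken & content_words(phrase))
            if score > best_score:
                best, best_score = link, score
    return best  # None when nothing on the page matches


# Illustrative grammar for a single page (assumed content).
PAGE_LINKS = [
    SpeechLink("/hours", ["opening hours", "when are you open"]),
    SpeechLink("/contact", ["contact us", "phone number", "email address"]),
]
```

Under these assumptions, `best_link("What are your opening hours?", PAGE_LINKS)` selects the `/hours` link even though the user never says the link's title, while an utterance unrelated to the page returns `None`. Because the phrase list travels with the page, the set of recognizable requests changes whenever the page does.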

Lead Inventor:

Michael Charney

Applications:

  • Assistive technology and accessible web browsing for visually impaired and mobility-impaired users
  • Multimodal user interaction combining visual and voice elements
  • Hands-free interaction with web-enabled devices such as TVs, entertainment systems, or even automobiles
  • Hands-free browsing on tablets and mobile devices
  • Speech channel that extends content beyond the limits of small mobile-device screens

Advantages:

  • Instantly compatible with most popular web browsers
  • No ASR programming required to build a fully voice-enabled website
  • Low transmission bandwidth
  • No external program or plugin needed to enable voice-user interface (VUI)
  • Completely transparent to visual content, if desired; easy to integrate into existing company intranets and consumer-facing websites

Patent Information:

Tech Ventures Reference: IR M01-011, IR 1489