Introducing Whisper-Medusa: our 50% faster open-source AI model

The best artificial intelligence (AI) tools are those that are accessible and accurate for all. At aiOla, we’ve been hard at work creating the best available automatic speech recognition model, and we’re excited to announce the release of our open-source AI model, Whisper-Medusa.

Whisper-Medusa outperforms OpenAI’s Whisper by operating 50% faster with no loss in performance. This gain in speed at the same level of accuracy comes from the way our model predicts tokens. A token is a unit of data that an algorithm processes. OpenAI’s Whisper model predicts one token at a time, whereas aiOla’s Whisper-Medusa predicts ten at a time, which speeds up speech prediction and generation runtime by 50%, especially for long-form audio. aiOla currently offers Whisper-Medusa as a 10-head model and plans to release a 20-head version with equivalent accuracy.
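To make the difference between one-token and ten-token prediction concrete, here is a minimal sketch that counts how many decoding iterations a transcript needs in each case. The function name and the 200-token transcript are hypothetical, and the sketch assumes every predicted token is kept. It counts iterations only and does not model the cost of each iteration, so it is not a measurement of the 50% speedup described above.

```python
import math


def decoding_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Return how many decoding iterations are needed to emit num_tokens.

    Hypothetical helper for illustration: assumes each iteration emits
    exactly tokens_per_step tokens and that all of them are kept.
    """
    if num_tokens < 0 or tokens_per_step < 1:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step >= 1")
    return math.ceil(num_tokens / tokens_per_step)


# A made-up 200-token transcript, decoded two ways.
one_at_a_time = decoding_steps(200, tokens_per_step=1)
ten_at_a_time = decoding_steps(200, tokens_per_step=10)
print(one_at_a_time, ten_at_a_time)
```

Fewer iterations matter most for long-form audio, where the transcript contains many tokens and the per-iteration overhead adds up.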

Whisper-Medusa’s weights and code are available on Hugging Face and GitHub. “Creating Whisper-Medusa was not an easy task, but its significance to the community is profound,” says Gill Hetz, VP of Research at aiOla.

Based on a multi-head attention architecture (hence the name), Whisper-Medusa is trained using weak supervision. In this process, the main components of OpenAI’s Whisper are initially frozen while additional parameters are trained. Whisper is used to transcribe audio datasets, and these transcriptions serve as labels for training Medusa’s additional token prediction modules.
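The training idea above can be sketched as a toy example: a frozen base model produces transcriptions, and those transcriptions are the only labels used to train the additional prediction modules. Everything here is an illustrative assumption, not aiOla’s implementation: `frozen_base_transcribe` is a stand-in for Whisper with made-up transcripts, and `ExtraHead` is a simple frequency counter rather than a neural network.

```python
from collections import Counter, defaultdict
from typing import List, Optional


def frozen_base_transcribe(audio_id: str) -> List[str]:
    """Stand-in for the frozen base model (hypothetical, fixed outputs)."""
    fake_transcripts = {
        "clip_a": ["check", "the", "left", "engine"],
        "clip_b": ["check", "the", "right", "engine"],
        "clip_c": ["check", "the", "left", "wing"],
    }
    return fake_transcripts[audio_id]


class ExtraHead:
    """Toy extra module that learns which token appears `offset` positions ahead."""

    def __init__(self, offset: int) -> None:
        self.offset = offset
        self.counts = defaultdict(Counter)

    def train(self, tokens: List[str]) -> None:
        # Count (current token -> token `offset` positions later) pairs.
        for i in range(len(tokens) - self.offset):
            self.counts[tokens[i]][tokens[i + self.offset]] += 1

    def predict(self, token: str) -> Optional[str]:
        seen = self.counts.get(token)
        return seen.most_common(1)[0][0] if seen else None


# Step 1: the frozen base model labels the audio; it is never updated.
pseudo_labels = [frozen_base_transcribe(c) for c in ("clip_a", "clip_b", "clip_c")]

# Step 2: only the extra heads are trained, using those transcriptions as labels.
heads = [ExtraHead(offset=k) for k in (1, 2)]
for head in heads:
    for tokens in pseudo_labels:
        head.train(tokens)

print(heads[0].predict("check"), heads[1].predict("check"))
```

Because the labels come from the base model rather than from human annotators, no manually transcribed data is needed in this setup, which is what makes the supervision "weak".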

The Business Advantage

While we’ve covered the technical specifications, let’s consider why this is such big news for businesses looking to accomplish more with speech technology. aiOla’s AI technology empowers frontline workers to fulfill critical operations with accuracy, speed, and insights. No matter what industry you work in, aiOla’s back-end system, called aiOla Jargonic, can automatically transform your paper-based and manual processes into digital workflows. It’s as easy as uploading a photo or file of your existing processes. Without disruptions or any adjustments, aiOla will digitize the workflow.

aiOla’s system understands your business-specific jargon in real time, with no prior retraining or coding necessary. Additionally, Whisper-Medusa improves performance and reduces latency (delays). Because much of the language businesses use to complete processes is nuanced, specific vocabulary, it takes bespoke technology to comprehend it. aiOla delivers a tailored solution for every business, in any industry.

Frontline workers use aiOla’s intuitive app, aiOla Interactive, to complete processes by voice or touch. With speech-enabled process completion, aiOla captures something you’d otherwise miss as workers complete their tasks: valuable data and insights. aiOla takes unstructured speech data and converts it into usable information so that your business can act proactively and make informed decisions, for example to optimize efficiency, cut costs, and improve resource allocation.

Not only does aiOla understand business-specific jargon, but it can also comprehend over 100 languages in any accent and in any acoustic environment. This opens the door for use cases across every industry, including aviation, food manufacturing, logistics and warehousing, healthcare, and more.

A Giant Leap Forward 

With aiOla’s Whisper-Medusa, businesses can take advantage of a speech recognition model that understands language faster, with 95%+ accuracy.

With both speed and accuracy at your fingertips, aiOla empowers frontline workers to accomplish more, in less time, with zero interruption to existing processes.

Want to get started using aiOla?
Navigate here to find the open-source files. 

Jolene Amit
Author
Jolene Amit is a distinguished B2B tech marketing professional with over 16 years of experience and a proven track record of driving growth and success in the technology sector. Currently serving as the Chief Marketing Officer at aiOla, Jolene brings a wealth of expertise and strategic vision to the company.