FOV Invests in ai|coustics

Why voice is an essential component of spatial computing

Dave Haynes
March 27, 2024

Nine years ago I wrote a blog post about Innovation in Sound, in which I said:

“We interact with sound in an increasing number of ways in our daily lives. I believe that technical advances in our ability to capture, process, manipulate, analyse and share it, are opening up countless opportunities to innovate and build new businesses that previously weren’t possible.”

If you replace ‘sound’ with ‘voice’, the same statement holds true. Add new methods of generative AI into the mix and we’re seeing a renaissance in voice-related startups.

One such startup is our newest portfolio company, ai|coustics. Their core mission is to make every digital interaction, whether on a conference call, consumer device or casual social media video, as clear as a broadcast from a professional studio.

Demo here!

So we’re thrilled to announce our participation in the company’s €1.6M seed round, alongside our friends at Connect Ventures, Inovia Capital and angels such as Ableton’s Jan Pohl and Graphcore’s Nigel Toon.

Look forward to several products launching from the company in the upcoming weeks:

➡️ An upgraded web application, making professional-grade speech quality accessible for content creators

➡️ Next-generation generative speech enhancement models beyond noise suppression

➡️ An API and SDK designed to integrate speech clarity into applications, devices and workflows

Why is This Important?

There are several reasons ai|coustics and ‘voice’ fit into our evolving thesis around investing in the next generation of computing, one that is more intelligent, immersive and spatial.

Voice is Everywhere.

Whether it’s watching TV, listening to podcasts, or sitting on conference calls…

Poor audio quality is painful.

We’ve all strained to make out the person on the other end of the line. But audio problems go deeper still, muddling clarity, hindering accessibility, and stifling engagement across our universally vocal digital landscape.

Voice is everywhere, so this is a big challenge.

ai|coustics is pioneering a solution that imagines every call, podcast or social media post sounding like it was produced in the acoustically perfect studio of a leading broadcaster. To take on low speech quality in digital communication, ai|coustics is developing Generative Audio AI technology that goes far beyond noise suppression and will improve speech clarity in a huge range of products.

Voice is The Next Interface.

While the market for clearer audio is huge, we’re also excited for future use cases (did somebody say ‘always-on augmented hearing’?). There are massive, emerging opportunities for startups in this space…

Speaking to our devices has become normalised.

And those devices are no longer just speakers and phones. They are car systems, smart glasses, wearable AI assistants.

Voice is becoming an increasingly important part of how we interact with the next generation of devices, and this behaviour has already been normalised for younger generations through Alexa and Siri.

The rise of Generative AI and LLMs is propelling voice to the forefront of human-computer interaction.

These advancements are enabling computers and their applications to understand and respond to our natural language with unprecedented sophistication. So voice will also become a native part of many more day-to-day experiences and apps, from language learning (see Edailabs) to customer support to NPCs in our favourite games (see Iconic).

Voice is already seamlessly integrated into our daily lives through companions like Alexa and Siri, and advancements in Generative AI and Large Language Models (LLMs) open the door for it to become the new interface of human-computer interaction, whether in language learning platforms, customer support systems, or even interactive NPCs in our favourite games.

With Jarvis- or Her-like interactions on the horizon, where we engage devices while walking around the house, opening the fridge or working out, devices will need to understand and respond to our natural language, and to process our voices in many different scenarios and situations.

These advancements will make voice and audio tech a foundational layer in the next generation of computing.

Backing European Tech Talent at The Forefront

Aside from the opportunity, we were blown away by the Berlin-based ai|coustics team.

Fabian Seipel is an audio engineer by training and co-founded the company in 2021 alongside Corvin Jaedicke, a lecturer in machine learning at the Technical University of Berlin.

We’re excited to see the company take its place alongside a number of other hugely exciting speech-orientated startups coming out of Europe, from ElevenLabs (now valued at >$1bn) to Speechly (a voice moderation startup acquired by Roblox).

We look forward to working with the company as they start their journey to bring high-quality speech to everyone!

Check out the ai|coustics demo here!