
What is the Significance of GPT-4o for Spatial Computing?

As Generative AI Funding Drops, GPT-4o Drives Spatial Computing Forward

Welcome to this month’s viewpoints, news and investment activity covering the Metaverse, XR and Spatial Computing - brought to you by FOV Ventures.

This Month:

  • What do Recent AI Advancements mean for XR?

  • Find out What our Portfolio is up to.

  • New FOV job-board roles.

Dave Haynes, Petri Rajahalme and Sointu Karjalainen (FOV Ventures)

This past month was a big one in the world of AI, with both OpenAI and Google shipping significant updates, several of which bear directly on our own focus areas.

At FOV, we’re investing in the future of computing, and believe we are seeing the sprouts of the next era of human-computer interaction.

While the pace of compute and AI development over the last 10 years has been lightning fast, spatial hardware has marched forward more steadily. We are now approaching an increasingly significant inflection point between AI and Spatial Computing.

AI’s Rapid Development Cycles

On the surface, the GPT-4o updates from OpenAI might appear trivial, but several of them are important improvements for spatial computing.

The biggest breakthrough by far is that GPT-4o is “natively multi-modal.” This means it can understand and respond through voice, text, and images, all in one interface.

Why Does This Matter for Spatial Computing?

Instead of typing out context for a problem you want solved, with a multi-modal assistant you can turn on your camera and show the model the problem in real time. Computers — and many other devices that have cameras — can now understand the world with increasing sophistication and levels of interaction.
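To make this concrete, here is a minimal sketch in Python of how a single request can carry both modalities at once. It uses the text-plus-image message format documented for OpenAI's Chat Completions API with GPT-4o; the helper function name and the example bytes are our own illustration, not part of any SDK.

```python
import base64

def build_multimodal_message(prompt: str, image_bytes: bytes) -> dict:
    """Combine a text prompt and a camera frame into a single user message,
    in the content format the OpenAI Chat Completions API accepts for GPT-4o."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }

# With the official SDK, one call then handles both modalities, e.g.:
#   client.chat.completions.create(model="gpt-4o",
#       messages=[build_multimodal_message("What's wrong here?", frame)])
msg = build_multimodal_message("What am I looking at?", b"\xff\xd8fake-jpeg")
```

The point is that the camera frame and the question travel in one message, so the model can ground its answer in what the device is actually seeing rather than in typed-out context.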

Products like Rabbit R1 and Humane AI Pin were arguably too early to show the way forward, but both demonstrate the appetite for compute beyond today’s 2D screens and keyboards.

We’re cautiously bullish on smart glasses over the next 2-5 years, but there could be other form factors too. If Meta, with their Ray-Bans, or Google, with their Project Astra AR glasses, can nail a comfortable form factor and a practical assistant, the adoption of multimodal assistants will create new user habits around interacting with computers and the world around us.

The AR glasses Google used during the Project Astra demo

Glasses are the optimal form factor for this future because they naturally integrate capabilities that other devices lack. Next-gen Meta Ray-Bans, and devices like them, could revolutionise UI/UX by understanding user intent from passive signals like location, gaze, and activity, eliminating traditional interfaces. The tech is already very good; they’ll just need to overcome the historic social stigma attached to wearing new devices like these (no-one wants to be a glass-hole!).

Meanwhile, regular mixed reality is still on its slow but inevitable march. Google just announced a new partnership with Magic Leap, Meta is opening up its OS to other players, and Apple revealed that half of the Fortune 100 are “using” Apple Vision Pros. While the consumer response has at times been mixed, enterprise success highlights the need to target the right customer segments with the right solution.

In the near term, AR glasses with AI chat capabilities could be something compelling for consumers.

As these technologies converge, and VR headsets become lighter, we can anticipate a future where lightweight AI-powered XR glasses become the norm.

What About Startups?

Whilst startups will need to nimbly navigate the open spaces left by the bigger players, there is still a lot of work to be done — and applications to be built — at the intersection of AI and spatial computing.

As AI models become commoditised and mobile chips improve, we’ll run many models locally on devices. The key to success will then be improvements in spatial intelligence, i.e. understanding the environment and context, which is critical for AI to interact effectively with the real world.

For example, last month we backed Ai-Coustics, whose speech-enhancement software is crucial for audio recognition on AI-enhanced devices, and Graswald, a platform that turns real-world objects into 3D models. Modelling the 3D world is essential for building spatial intelligence, allowing AI to effectively understand, navigate, and interact with real-world environments.

Ultimately, OpenAI’s, Meta’s and Google’s emphasis on AI being multimodal (versus a text-only LLM) underscores that the next paradigm of human interaction with technology will be built on the intersection of Spatial/AR/VR x AI - and we’re excited to continue backing it!

FOV News

We are out and about again this June!

At AWE it will also be great to see Iara Dias, recipient of our AWE USA DVRSTY Travel Voucher. Iara is working on ikkio - a mobile app designed to help the visually impaired and blind navigate their day-to-day more easily.

The app includes object detection, text recognition, text-to-speech, and voice commands, helping users understand their surroundings, read text, and interact with their environment - making everyday tasks simpler for the visually impaired and blind community.

FOV Ventures’ Environmental, Social, and Governance (ESG) Report 2023

As a pre-seed and seed fund, embedding ESG principles at the core of our operations, and those of our portfolio companies, is foundational to our approach.

The survey includes responses from 23 portfolio companies. Here’s a snapshot of the respondents and the key takeaways.

Upcoming Portfolio Announcements

In June we have a number of exciting additions to the FOV portfolio. To keep up to date as we announce them, follow us on LinkedIn.

Via LinkedIn, we also regularly share portfolio updates, industry news, and new FOV articles. Don't miss out!

FOV Portfolio News:

FOV Job Board: