gen‑ai.news

Microsoft readies new MAI voice and image models for Build 2026

Microsoft is reportedly preparing to unveil several new models at its Build 2026 developer conference, all grouped under the MAI branding the company has been building out in recent months. The models include MAI-Image-2.5, a generative image model, MAI-Transcribe-1.5 for speech-to-text tasks, and MAI-Voice-2, which is described as supporting multiple languages.

The MAI line represents Microsoft's effort to develop AI models in-house rather than relying entirely on third-party providers, including its close partner OpenAI. Earlier MAI models were introduced quietly through Azure AI Foundry and Microsoft's API offerings, positioning them as practical tools for enterprise developers rather than consumer-facing products.

A multilingual voice model is particularly notable given the competitive landscape in speech synthesis and real-time translation. Accurate, natural-sounding voice generation across languages remains a difficult problem, and enterprise demand for such capabilities in products like Teams, Copilot, and customer service tooling is significant. MAI-Transcribe-1.5 likewise fits into Microsoft's broader push to improve real-time and asynchronous transcription across its productivity suite.

Build 2026 is shaping up to be a dense event for AI announcements, and the MAI model family will likely be positioned as part of Microsoft's Azure AI platform strategy. Developers using Azure AI Foundry would be a primary audience, with these models potentially available through standard API access shortly after the event. Whether the image model competes directly with offerings like DALL-E or takes a different approach, such as focusing on editing or enterprise document workflows, will not be clear until the official unveiling.
