AI Native Weekly Newsletter: 06 June 2025
Contents
- Gemini 2.5 Pro Preview: Enhanced AI with Cost Control
- Codex in ChatGPT: Internet Access, Voice Dictation & More
- Luma AI Modify Video: Transform Scenes While Preserving Motion
- ElevenLabs v3: Revolutionary Emotional AI Speech (80% Off)
- FLUX.1 Kontext: 8x Faster AI Image Generation & Editing
- Opera Neon: First AI Browser That Automates Web Tasks
- Cursor 1.0 Launches: BugBot, Memory & Background Agent
- OpenAudio Releases S1: 4B-Parameter TTS Model Ranked #1
Gemini 2.5 Pro Preview: Enhanced AI with Cost Control
Google has released an upgraded Gemini 2.5 Pro preview with notable benchmark gains: a 24-point Elo jump on LMArena (holding #1 at 1470) and a 35-point jump on WebDevArena (leading at 1443). The update also brings more creative responses, better formatting, and thinking budgets that let developers trade off cost and latency. It is available in Google AI Studio, Vertex AI, and the Gemini app ahead of general availability.
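To put those Elo gains in perspective, here is a minimal sketch that converts a rating gap into an expected head-to-head win rate. It assumes the standard Elo formula with a 400-point scale; the leaderboards may compute ratings differently, so treat the numbers as a rough guide. The function name is ours, not part of any leaderboard tooling.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected win rate of the higher-rated model under the standard Elo formula.

    Assumption: a 400-point logistic scale, as in classic Elo. This is only an
    approximation of how arena leaderboards relate ratings to preferences.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


if __name__ == "__main__":
    # Gaps quoted in the item above: 24 points (LMArena), 35 points (WebDevArena).
    for gap in (24, 35):
        print(f"{gap}-point gap: {elo_expected_score(gap):.1%} expected win rate")
```

Under that assumption, a 24-point gap works out to the newer version being preferred in roughly 53% of head-to-head comparisons with the previous one, and a 35-point gap to roughly 55%.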
Codex in ChatGPT: Internet Access, Voice Dictation & More
Codex is now available to Plus users with generous initial usage limits. Key updates include internet access during task execution (off by default, available to Plus, Pro, and Team users), voice dictation for tasks, and the ability to update existing PRs when following up. Organizations using SSO no longer need a separate MFA setup. Internet access for Enterprise users is coming soon.
Luma AI Modify Video: Transform Scenes While Preserving Motion
Luma AI has launched Modify Video in Dream Machine Ray 2, allowing creators to transform environments, lighting, and textures while preserving character performance and motion. Features include motion capture, world swapping, and isolated element editing, with three preset modes. According to Luma, blind evaluations show it outperforming Runway V2V on motion retention and consistency. It is available now for clips of up to 10 seconds, so an existing shot can be reworked without starting over.
ElevenLabs v3: Revolutionary Emotional AI Speech (80% Off)
ElevenLabs has launched v3 in alpha with emotional audio tags (laughter, excitement, anger), a multi-speaker dialogue mode, and support for 70+ languages. Features include controllable prosody, immersive soundscapes, and natural-sounding conversations. UI users get an 80% discount through the end of June 2025, with API access coming soon for enterprise early adopters.
FLUX.1 Kontext: 8x Faster AI Image Generation & Editing
Black Forest Labs has unveiled FLUX.1 Kontext, a multimodal model that unifies text-to-image generation with in-context editing. The company says it delivers up to 8x faster inference than leading competitors while maintaining state-of-the-art quality, character consistency, and typography. It is available in pro and max variants through platforms including KreaAI, Freepik, Replicate, and FAL, with an open-weight dev version in private beta for research use.
Opera Neon: First AI Browser That Automates Web Tasks
Opera has unveiled Neon, which it bills as the first agentic browser, one that turns user intent into action through AI agents. Users can chat with the integrated AI, automate routine web tasks such as bookings and forms, and create games, websites, or code with cloud-powered AI engines. The premium browser processes tasks locally for privacy and supports multitasking. Early adopters can join the waitlist today at operaneon.com.
Cursor 1.0 Launches: BugBot, Memory & Background Agent
Cursor has released version 1.0, introducing BugBot for automated code review in GitHub PRs, Background Agent access for all users to run tasks in parallel, and project-specific Memories. The update adds Jupyter Notebook support, one-click MCP setup with OAuth, and enhanced visualization capabilities. Together, these features move Cursor beyond a code editor toward a broader AI development platform.
OpenAudio Releases S1: 4B-Parameter TTS Model Ranked #1
OpenAudio has unveiled S1, a text-to-speech model trained on 2M+ hours of audio that ranks #1 on the HuggingFace TTS-Arena. The model offers voice-actor-like control through emotion and tone markers, supports 13 languages, and costs $15 per million bytes. It comes in two variants: the flagship S1 (4B parameters) and the more efficient S1-mini (0.5B parameters). OpenAudio reports a word error rate (WER) of 0.008, which it describes as industry-best, along with strong instruction following for natural, expressive speech synthesis.
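Per-byte pricing is less familiar than per-character pricing, so here is a rough sketch of how a bill might be estimated. It assumes bytes are counted on the UTF-8 encoding of the input text, which the item above does not confirm; the constant and function name are hypothetical, for illustration only.

```python
# Price quoted in the item above; how bytes are counted is our assumption.
PRICE_PER_MILLION_BYTES_USD = 15.0


def estimate_tts_cost(text: str) -> float:
    """Estimate the cost in USD of synthesizing `text`.

    Assumption: billing is based on the UTF-8 byte length of the input,
    so non-ASCII characters count for more than one byte each.
    """
    n_bytes = len(text.encode("utf-8"))
    return n_bytes / 1_000_000 * PRICE_PER_MILLION_BYTES_USD


if __name__ == "__main__":
    script = "Hello, and welcome to this week's newsletter."
    print(f"{len(script.encode('utf-8'))} bytes -> ${estimate_tts_cost(script):.6f}")
```

If that assumption holds, text in scripts that need several bytes per character would cost more per character than plain English text.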