new.website Joins v0 to Accelerate AI-Driven Development
v0 has merged with website builder new.website to accelerate AI-driven software development. new.website provides all-in-one website solutions with built-in forms and SEO tools. The merger combines technical resources to help developers ship production-ready software more efficiently. Financial terms weren’t disclosed, but the engineering teams will be integrated.
Why No 'AlphaFold for Materials'? A Decade of AI Science Lessons
MIT’s Heather Kulik shares a decade of experience in AI for materials science. She notes that while AlphaFold revolutionized protein folding, materials discovery faces unique challenges like sparse data and long experimental cycles. Kulik advocates for materials-specific AI methodologies and emphasizes cross-disciplinary collaboration. The discussion covers recent advances and future directions.
Hugging Face Launches EVA Framework for Voice Agent Evaluation
Hugging Face has released EVA (Evaluation of Voice Agents), a standardized framework for assessing voice assistant performance. It provides metrics for fluency, task accuracy, and response latency. EVA supports comparative testing across multiple voice models and is currently English-focused, with multi-language support planned. The framework is open-sourced under the MIT license on GitHub.
OpenAI Releases Prompt-Based Teen Safety Policies
OpenAI has introduced prompt-based safety policies for developers using gpt-oss-safeguard. The policies help AI systems identify and filter age-specific risks like inappropriate content and privacy concerns. The policies are integrated into OpenAI’s API for easy implementation. This is part of OpenAI’s broader teen safety initiative, which includes age-based content classification.
Claude Adds Interview-Style Interaction Feature
Claude has introduced an interview-style interaction feature where the AI asks clarifying questions before providing answers. This mimics human conversational flow to better understand user needs. Tech outlet Ben’s Bites suggests this is part of Claude’s competition strategy against OpenClaw, focusing on more natural interactions. The feature is rolling out gradually with no full launch date announced.
GitHub Builds AI-Powered Issue Triage with Copilot SDK
A GitHub tutorial demonstrates using the Copilot SDK to build an AI-powered issue classification system in React Native. The system auto-generates issue summaries, with caching and graceful degradation. The modular design supports multiple issue types and integrates GitHub’s native APIs. Code examples are open-sourced for direct implementation. The tutorial showcases production patterns for the Copilot SDK.
OpenAI Codex Releases Version 0.117.0-alpha.14
OpenAI Codex has released version 0.117.0-alpha.14, featuring performance optimizations and bug fixes. The changelog remains minimal, following the project’s rapid 1-2 week alpha release cycle. Codex is OpenAI’s code generation model powering GitHub Copilot, supporting multiple programming languages. Developers can access the latest version through the OpenAI API.
AgenticGEO: Self-Evolving System for Generative Engine Optimization
An arXiv paper introduces AgenticGEO, a self-evolving system for Generative Engine Optimization (GEO). Where traditional search optimization targets ranking position, optimization for generative engines targets whether content is included in the generated answer. The system continuously learns and self-adjusts to optimize content generation strategies. The research demonstrates effectiveness in specific search scenarios but lacks deployment data. It offers a novel approach to generative search optimization.
Meta Superintelligence Labs Hires Dreamer to Advance Personal AI
Meta Superintelligence Labs has hired Dreamer, just 11 days after their Latent Space podcast aired. Dreamer will advance personal superintelligence research at MSL, combining previous work with new resources to potentially break through in AI capabilities and efficiency.
OpenAI Foundation to Invest $1B in Disease Cures and AI Resilience
The OpenAI Foundation plans to invest at least $1 billion in curing diseases, economic opportunity, AI resilience, and community programs. The funding aims to apply AI technology to solve global challenges and ensure AI systems remain safe and reliable.
ProMAS: Proactive Error Forecasting for Multi-Agent Systems
Researchers introduced ProMAS, a method using Markov transition dynamics to proactively forecast errors in multi-agent systems. By analyzing state transitions between agents, it identifies potential failure points in advance, improving system stability. The method is applicable to high-reliability collaborative tasks like autonomous driving and robotics.
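To make the idea of forecasting from Markov transition dynamics concrete, here is a minimal sketch. It is not the ProMAS method itself. The state names, the transition probabilities, and the function error_probability_within are all hypothetical, chosen only to show how the chance of reaching an absorbing error state within a few steps can be read off a transition table.

```python
# Toy illustration only: hypothetical states and probabilities, not taken from ProMAS.
TRANSITIONS = {
    "plan": {"act": 0.9, "error": 0.1},
    "act": {"plan": 0.5, "done": 0.4, "error": 0.1},
    "done": {"done": 1.0},    # absorbing success state
    "error": {"error": 1.0},  # absorbing failure state
}


def error_probability_within(transitions, start, error_state, steps):
    """Return the probability of having reached error_state within `steps` transitions."""
    dist = {start: 1.0}  # probability mass over current states
    for _ in range(steps):
        nxt = {}
        for state, p in dist.items():
            # Push the mass in each state along its outgoing transitions.
            for target, q in transitions[state].items():
                nxt[target] = nxt.get(target, 0.0) + p * q
        dist = nxt
    return dist.get(error_state, 0.0)


if __name__ == "__main__":
    for k in (1, 2, 5):
        risk = error_probability_within(TRANSITIONS, "plan", "error", k)
        print(f"risk of error within {k} steps: {risk:.3f}")
```

A monitor built on this idea could raise an alert whenever the forecast risk for the current state crosses a threshold, before any failure has actually occurred.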
ChatGPT Introduces Immersive Shopping with Agentic Commerce
ChatGPT launched a new shopping experience powered by the Agentic Commerce Protocol, enabling product discovery, side-by-side comparisons, and purchases directly within conversations. The interface is more visual, integrating multiple merchants to provide consumers with a one-stop shopping solution.
Domain-Specific Tree of Thought with Plug-and-Play Predictors
Researchers introduced a new Tree of Thoughts method using plug-and-play predictors for domain-specific reasoning. It solves the trade-off between exploration depth and computational efficiency in traditional ToT frameworks, maintaining reasoning quality while reducing computational costs for complex domain-specific tasks.
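As a rough illustration of how a pluggable predictor can bound the cost of a tree search, the sketch below runs a beam-limited search in which a scoring function decides which partial "thoughts" survive each level. It is an assumption-laden toy, not the paper's algorithm: tree_search, expand, predictor, and the number-sum task are all hypothetical stand-ins.

```python
def tree_search(root, expand, predictor, beam_width, depth):
    """Beam-limited tree search: keep only the top-scoring nodes at each level."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for node in frontier for child in expand(node)]
        if not candidates:
            break
        # The predictor is the plug-in point: any callable that scores a node.
        candidates.sort(key=predictor, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=predictor)


# Hypothetical toy domain: build a tuple of digits whose sum is close to TARGET.
TARGET = 7


def expand(node):
    return [node + (digit,) for digit in (1, 2, 3)]


def predictor(node):
    return -abs(TARGET - sum(node))  # higher is better; 0 means the target is hit


if __name__ == "__main__":
    best = tree_search((), expand, predictor, beam_width=2, depth=3)
    print(best, sum(best))
```

With a beam width of 2, each level scores at most six candidates instead of the full 3^depth tree, which is the kind of exploration-versus-cost trade-off the summary describes.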
FactorSmith: Agentic Simulation via Markov Decomposition
Researchers introduced FactorSmith, a method that generates agent simulations through Markov Decision Process decomposition. It addresses the reasoning limitations of large language models when processing large, interconnected codebases, enabling simulation and testing of complex systems from natural language specifications.
LLM Introspection Reliability Questioned in New Study
New research evaluates LLM introspection capabilities, finding current assessment methods are flawed. The Me, Myself, and $\pi$ benchmark reveals LLMs show inconsistent self-evaluation performance, especially on complex reasoning tasks. The study highlights limitations in existing LLM metacognition mechanisms.
Multi-Agent AI Fails Under Real-World Communication Stress
AgentComm-Bench tests multi-agent AI cooperation under latency, packet loss, and bandwidth constraints. Systems that excel in ideal conditions fail dramatically in simulated real-world scenarios, potentially causing robot teams or autonomous vehicle convoys to collapse in complex environments.
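The kind of stress described above can be approximated with a very small channel model. The sketch below is not the AgentComm-Bench harness. The function simulate_channel and its parameters are hypothetical, and it models only random packet loss and a fixed latency checked against a deadline.

```python
import random


def simulate_channel(messages, loss_rate, latency_ms, deadline_ms, seed=0):
    """Return the messages that arrive intact and before the deadline."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    delivered = []
    for msg in messages:
        if rng.random() < loss_rate:
            continue  # packet dropped in transit
        if latency_ms > deadline_ms:
            continue  # arrives too late to be useful to the receiver
        delivered.append(msg)
    return delivered


if __name__ == "__main__":
    plan = [f"waypoint-{i}" for i in range(10)]
    for loss in (0.0, 0.3, 0.6):
        got = simulate_channel(plan, loss_rate=loss, latency_ms=20, deadline_ms=50)
        print(f"loss={loss:.1f}: {len(got)}/{len(plan)} messages delivered")
```

Feeding a coordination policy only the delivered messages is one simple way to check whether behavior that looks solid on a perfect link degrades once messages go missing or arrive late.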