2026.05.02 DAILY REPORT

LLMs Achieve Autonomous Optical Discovery

15 items·2026.05.02

DAILY BRIEF

01 LLMs Achieve Autonomous Optical Discovery
02 AutoSP Enables Long-Context LLM Training via Compiler Parallelism
03 Web2BigTable: Bi-Level Multi-Agent System for Web-Scale Search
04 Health Coaching Agents: Dual-Stream Memory Detects Clinical Discrepancies
05 TRUST Framework for Decentralized AI Services
06 Code Agents Break Free, Claude Dominates Creativity
07 AI Water Use Below Public Perception
08 Uber Spends Entire 2026 AI Budget on Claude Code in Four Months
09 Adam Launches AI CAD Tool for Engineers
10 Loopsy: Cross-Machine Communication for Terminals and AI Agents
11 Risk-Sensitive Bandits: Memory Retrieval for LLM Coding Agents
12 Claude Code Adds Gateway Model Support
13 Vercel Sandbox Now Connects to External Postgres
14 Spotify Adds 'Verified' Badges for Human Artists
15 OpenAI Codex Release 0.129.0-alpha.3

01 / RESEARCH 2026.05.01 12:00

LLMs Achieve Autonomous Optical Discovery

An arXiv paper presents what it describes as the first LLM agent system to achieve end-to-end autonomous scientific discovery on real optical platforms. The system continuously revises its questions, methods, and claims, mirroring the way human researchers work, and applies this loop to optical experiments. The authors present the result as evidence that LLMs could take over work traditionally led by human researchers in high-value scientific domains.

SOURCE

arXiv cs.AI

02 2026.05.01 12:00

AutoSP Enables Long-Context LLM Training via Compiler Parallelism

The arXiv paper AutoSP proposes a compiler-based sequence parallelism method for long-context LLM training. It targets sequences of 100k to 1M+ tokens and aims to overcome the limitations of existing training libraries at that scale, making long-document training more efficient.

SOURCE

arXiv cs.LG (ML)

03 2026.05.01 12:00

Web2BigTable: Bi-Level Multi-Agent System for Web-Scale Search

Cornell researchers developed Web2BigTable, a bi-level multi-agent LLM system that simultaneously handles deep reasoning on single targets and structured aggregation across multiple heterogeneous sources. It addresses two critical challenges in web search: deep reasoning and broad information extraction. The system outperformed existing methods by an average of 37% across 20 test tasks. This technology can enhance search engines, knowledge graph construction, and large-scale data analysis.

SOURCE

arXiv cs.AI

04 2026.05.01 12:00

Health Coaching Agents: Dual-Stream Memory Detects Clinical Discrepancies

Researchers developed a dual-stream memory and reconciliation architecture to detect clinical discrepancies in health coaching agents. The system addresses the challenge of reconciling two imperfect information sources in long-term healthcare management: patient electronic records and agent memory. The architecture includes two memory streams: raw fact storage and context-aware retrieval. Experiments showed a 42% reduction in clinical errors on medical datasets, significantly improving decision accuracy. This technology can enhance long-term health monitoring systems.
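The reconciliation idea can be pictured with a minimal sketch. It assumes both sources are flattened into field/value pairs and uses a plain field-by-field comparison; the names `find_discrepancies`, `ehr_record`, and `agent_memory`, along with the sample fields, are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def find_discrepancies(
    ehr_record: Dict[str, str], agent_memory: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (field, ehr_value, memory_value) for shared fields that disagree.

    Hypothetical helper: values are compared case-insensitively after
    trimming whitespace. Fields present in only one source are ignored.
    """
    conflicts = []
    for field in sorted(ehr_record.keys() & agent_memory.keys()):
        ehr_value = ehr_record[field]
        memory_value = agent_memory[field]
        if ehr_value.strip().lower() != memory_value.strip().lower():
            conflicts.append((field, ehr_value, memory_value))
    return conflicts


# Illustrative data only, not from any real dataset.
ehr_record = {"medication": "Metformin 500 mg", "allergy": "penicillin"}
agent_memory = {
    "medication": "metformin 1000 mg",
    "allergy": "Penicillin",
    "goal": "walk daily",
}
print(find_discrepancies(ehr_record, agent_memory))
```

A real system would need clinical normalization (units, drug synonyms, timestamps) before comparing; this sketch only shows the shape of the check.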

SOURCE

arXiv cs.LG (ML)

05 2026.05.01 12:00

TRUST Framework for Decentralized AI Services

An arXiv paper introduces the TRUST v0.1 framework for decentralized AI services. The framework addresses reliability verification for Large Reasoning Models and Multi-Agent Systems in high-stakes domains, using a distributed architecture to avoid single points of failure, attack vulnerabilities, and bias risks.

SOURCE

arXiv cs.AI

06 / INSIGHTS 2026.05.01 12:53

Code Agents Break Free, Claude Dominates Creativity

AI coding agents are beginning to exceed their original design constraints, while Claude maintains its lead in creative work. The current quiet period in tech news has prompted reflection on AI assistant development: code generation tools are expanding autonomously, while creative tasks remain dominated by Claude.

SOURCE

Latent Space

07 2026.05.02 01:18

AI Water Use Below Public Perception

Research shows that AI’s actual water consumption is significantly below public perception. A California Water Blog analysis finds that media focus on data center water use overlooks industries that consume more, which feeds misconceptions about AI’s environmental impact. The analysis provides a data foundation for a more objective assessment of that impact.

SOURCE

HN AI Picks

08 / NEWS 2026.05.02 00:08

Uber Spends Entire 2026 AI Budget on Claude Code in Four Months

Uber exhausted its entire $100M 2026 AI budget within four months by fully deploying Claude Code across its development teams. The company integrated Anthropic’s coding assistant to automatically debug and fix code errors, freezing other AI projects. Claude Code specializes in programming assistance and can detect and repair code defects. This move shows enterprises are rapidly adopting AI in software development, though relying on a single tool poses potential technical risks.

SOURCE

HN AI Picks

09 / RELEASES 2026.05.02 01:43

Adam Launches AI CAD Tool for Engineers

The Adam team has launched an AI CAD tool for professional mechanical engineers. Unlike standard text-to-3D tools, Adam provides transparent workflows and editable STL outputs, addressing engineers’ trust issues with ‘black box’ generation tools. The team has previously presented its text-to-CAD experiments on HN twice.

SOURCE

HN AI Picks

10 / TOOLS 2026.05.01 18:25

Loopsy: Cross-Machine Communication for Terminals and AI Agents

A developer released Loopsy, a tool enabling communication between terminals and AI agents across different machines. Initially designed for file transfer between MacBooks, it now supports command execution and AI agent collaboration. Users can coordinate multiple devices over a local network, such as running a coding agent on one machine while handling other tasks. Loopsy features customizable protocols and is suitable for various development scenarios, improving resource utilization and efficiency.

SOURCE

HN AI Picks

11 / RESEARCH 2026.05.01 12:00

Risk-Sensitive Bandits: Memory Retrieval for LLM Coding Agents

Researchers proposed a risk-sensitive contextual bandit algorithm to optimize memory retrieval in LLM-based coding agents. The solution addresses when to retrieve information from external memory, as current agents over-retrieve irrelevant data. The algorithm uses risk-aware mechanisms to only retrieve memory when highly relevant to current failures. Experiments showed a 31% improvement in fix success rates on software engineering tasks with reduced computational overhead. This technology can enhance code debugging tools and intelligent development environments.
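To make the "retrieve only when it is worth the risk" idea concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a simple mean-variance criterion, discrete context labels, and no exploration strategy. The class names, the `risk_aversion` parameter, and the context label used below are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RewardStats:
    """Running mean and variance of observed rewards (Welford's method)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, reward: float) -> None:
        self.n += 1
        delta = reward - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (reward - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 until two rewards are seen.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


class RiskSensitiveGate:
    """Hypothetical gate choosing between 'retrieve' and 'skip' per context."""

    def __init__(self, risk_aversion: float = 1.0) -> None:
        self.risk_aversion = risk_aversion
        self.stats = defaultdict(RewardStats)  # keyed by (context, action)

    def update(self, context: str, action: str, reward: float) -> None:
        self.stats[(context, action)].update(reward)

    def score(self, context: str, action: str) -> float:
        # Mean-variance objective: penalize actions with volatile payoffs.
        s = self.stats[(context, action)]
        return s.mean - self.risk_aversion * s.variance

    def should_retrieve(self, context: str) -> bool:
        return self.score(context, "retrieve") > self.score(context, "skip")
```

With `risk_aversion` set to 0 the gate reduces to comparing average rewards; larger values make it skip retrieval whose payoff has been inconsistent in that context.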

SOURCE

arXiv cs.CL (NLP)

12 / RELEASES 2026.05.01 11:11

Claude Code Adds Gateway Model Support

Claude Code v2.1.126 adds support for Anthropic-compatible gateway models via the /v1/models endpoint. It also introduces a project purge command that deletes all Claude Code state data, including transcripts, tasks, file history, and config entries.

SOURCE

Claude Code Releases

13 2026.05.01 10:00

Vercel Sandbox Now Connects to External Postgres

Vercel Sandbox now supports connecting to external hosted Postgres databases including Neon, Supabase, AWS RDS, Nile, and Prisma Postgres. Developers can enable connections by adding database hosts to their Sandbox’s allowed domains. This update resolves firewall connectivity issues when SNI filtering is enabled.

SOURCE

Vercel Blog

14 2026.05.02 00:42

Spotify Adds 'Verified' Badges for Human Artists

Spotify has launched a new feature that adds ‘Verified’ badges for human artists to distinguish their work from AI-generated content. The move aims to address user confusion about AI-created content and to confirm artist authenticity. The feature is now live on Spotify, letting users identify purely human-created music through the verified badges.

SOURCE

HN AI Picks

15 2026.05.02 01:16

OpenAI Codex Release 0.129.0-alpha.3

OpenAI Codex has released version 0.129.0-alpha.3, following the 0.129.0-alpha.2 release. This update continues the Codex series’ iteration rhythm, bringing new preview features to developers.

SOURCE

OpenAI Codex Releases
