2026.05.29 DAILY REPORT

Endava Builds Agentic Org with Codex, Cuts Analysis to Hours

20 items·2026.05.29

DAILY BRIEF

01 Endava Builds Agentic Org with Codex, Cuts Analysis to Hours
02 Claude Opus 4.8 Launches on Vercel AI Gateway for Complex Coding
03 GitHub Releases Outdoor-Themed Developer Merch Collection
04 Walden Yan & Cole Murray on the Age of Async Agents
05 Cognition Raises $2.6B in Series D at $26B Valuation
06 Altman and Amodei Walk Back AI Job Apocalypse Predictions
07 Corporate AI ROI Decline Raises Concerns
08 Timeline for AI Automating Cognitive Labor
09 Amazon Scraps AI Leaderboard to Stop Usage Score Chasing
10 Ben's Bites Launches New Software Benchmark
11 Claude Code 1.0.31 Released with Dynamic Workflows
12 Laguna M.1/XS.2: Long-Horizon MoE Models for Agentic Coding
13 LaneRoPE: Boosting Parallel LLM Reasoning Accuracy
14 LCO: Constraint Optimization for Safer AI Agents
15 EvoSpec: Dynamic Speculative Decoding Acceleration
16 Simple State Space Model Excels at Multivariate Time Series Classification
17 RAG-Coding: Boosts LLM Medical Coding Accuracy with External Knowledge
18 60-Second Game Tests AI Agent Permission Fatigue
19 llm-anthropic Update Adds Claude Opus 4.8 Support
20 OpenAI Codex Releases 0.136.0-alpha.1

01 / INSIGHTS 2026.05.28 20:00

Endava Builds Agentic Org with Codex, Cuts Analysis to Hours

IT services firm Endava uses OpenAI’s Codex to build an agentic organization, accelerating software delivery and reducing requirements analysis from weeks to hours. This case study demonstrates how enterprises integrate AI agents into development workflows for more efficient collaboration and decision-making.

SOURCE

OpenAI News

02 / RELEASES 2026.05.28 15:00

Claude Opus 4.8 Launches on Vercel AI Gateway for Complex Coding

Anthropic’s Claude Opus 4.8 is now available on Vercel AI Gateway. Built for long-horizon agentic execution, it handles complex multi-step coding tasks like refactors that previously required human correction. The model also produces clearer prose for knowledge work. The update is live for developers.

SOURCE

Vercel Blog

03 / NEWS 2026.05.29 02:18

GitHub Releases Outdoor-Themed Developer Merch Collection

GitHub’s official shop has released the ‘ESC’ collection of outdoor-themed merchandise, encouraging developers to step away from their desks. The series includes apparel and accessories designed for better creative environments. Products are now available on GitHub Shop, blending developer identity with outdoor lifestyle.

SOURCE

GitHub Blog

04 / INSIGHTS 2026.05.29 02:41

Walden Yan & Cole Murray on the Age of Async Agents

The Latent Space podcast features Cognition’s Walden Yan and OpenInspect’s Cole Murray discussing the era of async agents. Key topics include 80% of Devin commits being handled asynchronously, spec-to-PR workflows, full VM usage, agent memory, and PMs shipping code. The conversation focuses on the evolution of agentic architecture and its real-world applications.

SOURCE

Latent Space

05 / NEWS 2026.05.28 15:26

Cognition Raises $2.6B in Series D at $26B Valuation

AI startup Cognition secured $2.6B in Series D funding, valuing the company at $26B pre-money. The round reflects strong market confidence in AI-powered coding tools. Cognition, known for its AI coding assistant, believes coding represents an uncapped TAM. The funding will fuel R&D and team expansion, potentially intensifying competition in AI coding tools.

SOURCE

Latent Space

06 2026.05.29 03:43

Altman and Amodei Walk Back AI Job Apocalypse Predictions

OpenAI’s Altman and Anthropic’s Amodei have softened their predictions about AI displacing jobs. Altman now estimates 10-15 years rather than 5, while Amodei acknowledges full replacement may never happen. Their shift reflects more realistic industry assessments of AI’s actual capabilities, helping to alleviate public anxiety about job displacement.

SOURCE

HN AI Picks

07 2026.05.28 18:39

Corporate AI ROI Decline Raises Concerns

Axios reports corporate AI ROI is declining, with 60% of CIOs stating AI projects underperform due to cost overruns and inflated expectations. Enterprises are reassessing AI investment strategies, shifting to smaller, measurable pilot projects. This trend may impact AI vendors’ sales models as companies demand clearer ROI demonstration.

SOURCE

HN AI Picks

08 / INSIGHTS 2026.05.28 22:21

Timeline for AI Automating Cognitive Labor

FutureSearch released an AGI timeline tracker predicting AI will take years to fully automate cognitive labor. The report notes AI excels at repetitive tasks but struggles with complex decision-making and creativity, emphasizing human-AI collaboration over replacement. It advises businesses to invest in upskilling employees for the AI era.

SOURCE

HN AI Picks

09 / NEWS 2026.05.29 05:14

Amazon Scraps AI Leaderboard to Stop Usage Score Chasing

Amazon has removed its internal AI leaderboard to prevent employees from chasing usage scores. The policy shift reflects the company’s rethinking of AI tool evaluation, focusing on practical application over raw metrics. Reports indicate a new assessment system will prioritize quality over quantity of AI usage.

SOURCE

HN AI Picks

10 / TOOLS 2026.05.28 21:03

Ben's Bites Launches New Software Benchmark

Tech newsletter Ben’s Bites has launched a new software benchmark evaluating development tools and AI services. The ranking tests performance through real-world usage scenarios, providing developers with data-driven references for tool selection. Key metrics include response speed, feature completeness, and cost efficiency.

SOURCE

Ben's Bites

11 / RELEASES 2026.05.29 02:00

Claude Code 1.0.31 Released with Dynamic Workflows

Claude Code has been updated to v1.0.31, introducing dynamic workflows. The feature lets Claude coordinate tens to hundreds of agents in the background for complex tasks. The update defaults to high-effort mode (/effort xhigh) and adds the /workflow command. This enhances handling of large-scale complex tasks.

SOURCE

Claude Code Releases

12 / RESEARCH 2026.05.28 12:00

Laguna M.1/XS.2: Long-Horizon MoE Models for Agentic Coding

A new arXiv paper introduces Laguna M.1 and XS.2, two MoE models designed for long-horizon agentic coding. M.1 has 225.8B total parameters (23.4B activated per token), while XS.2 has 33.4B total (3B activated). Both models optimize parameter efficiency for long tasks, enabling developers to build cost-effective code generation assistants with reduced inference overhead.
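To put the Laguna parameter counts in perspective, the short sketch below computes the share of parameters activated per token for each model, using only the figures quoted in the summary. The helper name `active_fraction` and the dictionary layout are illustrative choices, not anything from the paper. The result is roughly 10% for M.1 and 9% for XS.2.

```python
# Parameter counts in billions, copied from the summary above.
# The structure and names here are illustrative, not from the paper.
MODELS = {
    "Laguna M.1": {"total_b": 225.8, "active_b": 23.4},
    "Laguna XS.2": {"total_b": 33.4, "active_b": 3.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a MoE model's parameters activated per token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_fraction(**params)
        print(f"{name}: {share:.1%} of parameters active per token")
```

Per-token compute in a sparse MoE model tends to track the activated count rather than the total, which is one plausible reading of the "reduced inference overhead" claim. Memory footprint still depends on the total parameter count.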

SOURCE

arXiv cs.AI

13 2026.05.28 12:00

LaneRoPE: Boosting Parallel LLM Reasoning Accuracy

A new arXiv paper presents LaneRoPE, a method addressing positional encoding issues in parallel LLM reasoning. Traditional best-of-N methods lose accuracy because positional encoding becomes inconsistent during parallel generation. LaneRoPE maintains consistency through dynamic positional encoding, significantly improving multi-round reasoning accuracy with the same compute resources.

SOURCE

arXiv cs.AI

14 2026.05.28 12:00

LCO: Constraint Optimization for Safer AI Agents

A new arXiv paper introduces LCO, a method targeting in-context reward hacking in LLM agents, where an agent exploits reward loopholes to optimize a proxy goal instead of the real objective. LCO uses dynamic constraints to keep behavior human-aligned, significantly reducing harmful actions in real-world tests. The method aims to improve AI agent reliability in critical domains such as autonomous driving.

SOURCE

arXiv cs.CL (NLP)

15 2026.05.28 12:00

EvoSpec: Dynamic Speculative Decoding Acceleration

A new arXiv paper presents EvoSpec, which addresses the vocabulary bottleneck in the output layer during LLM inference. Traditional static pruning struggles with vocabulary needs that change during generation. EvoSpec adapts vocabulary and parameters in real time, reducing computational overhead while maintaining accuracy. Tests show a 40% speedup, making it suitable for high-throughput generation scenarios.

SOURCE

arXiv cs.CL (NLP)

16 2026.05.28 12:00

Simple State Space Model Excels at Multivariate Time Series Classification

Researchers propose a simple state space model that outperforms more complex methods in multivariate time series classification. Using a structured state space model (SSM) design, it achieves performance comparable to Mamba architectures at significantly lower computational cost. This lightweight model shows practical value for time-series data in healthcare and finance, offering a new direction for efficient AI applications.

SOURCE

arXiv cs.LG (ML)

17 2026.05.28 12:00

RAG-Coding: Boosts LLM Medical Coding Accuracy with External Knowledge

Stanford’s RAG-Coding system automates ICD-10-CM medical coding using four LLM agents that ground decisions in structured external knowledge sources like official coding tables. This approach reduces errors by 30% compared to single-model systems in real medical data tests, providing a more reliable solution for healthcare AI applications.

SOURCE

arXiv cs.CL (NLP)

18 / TOOLS 2026.05.28 21:02

60-Second Game Tests AI Agent Permission Fatigue

A developer released ‘Continue? Y/N’, a 60-second game simulating user decision fatigue when responding to AI agent permission requests. Players must rapidly answer multiple prompts, reflecting real-world user frustration with frequent permission popups. The game sparks reflection on AI interaction design, prompting developers to simplify permission workflows.

SOURCE

HN AI Picks

19 / RELEASES 2026.05.29 07:54

llm-anthropic Update Adds Claude Opus 4.8 Support

The llm-anthropic library released version 0.25.1, adding support for Claude Opus 4.8. The update introduces a fast mode option (-o fast 1) for enterprise users who have the feature enabled, and removes the default 8,192-token output cap in favor of each model’s maximum.

SOURCE

Simon Willison

20 2026.05.29 08:38

OpenAI Codex Releases 0.136.0-alpha.1

OpenAI Codex has released version 0.136.0-alpha.1. The update includes Python language support (v0.1.0b2) and improves code generation quality and response speed. This test branch provides feedback data for official version iterations. Developers can access test permissions through official channels.

SOURCE

OpenAI Codex Releases
