New study proposes motivational architecture for conversational AGI
arXiv paper ‘Motivational Architecture for Conversational AGI’ argues traditional cognitive AI motivation design fails for conversational agents. It proposes rethinking motivation frameworks for linguistic interaction loops where the environment is a user’s evolving mental state.
Graph-guided ultra-low-bit quantization reduces LLM hidden costs
arXiv paper proposes graph-guided ultra-low-bit quantization to solve hidden scaling overhead in LLM post-training quantization. Traditional methods introduce extra costs via rigid weight assumptions. The new approach optimizes quantization using graph structures, reducing hidden costs in 2-4 bit quantization while maintaining accuracy.
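As a rough illustration of why metadata can matter at 2-4 bits, the sketch below computes effective storage per weight for generic group-wise quantization. It assumes the "hidden scaling overhead" refers to per-group scale factors stored alongside the quantized weights, which the summary does not confirm. The function name, the group sizes, and the 16-bit scale width are hypothetical choices for illustration, not details from the paper.

```python
def effective_bits_per_weight(
    weight_bits: int,
    group_size: int,
    scale_bits: int = 16,
    zero_point_bits: int = 0,
) -> float:
    """Average storage per weight for generic group-wise quantization.

    Assumption (not from the paper): each group of `group_size` weights
    shares one scale (and optionally one zero point) that is stored
    next to the low-bit weights.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    metadata_bits = scale_bits + zero_point_bits
    return weight_bits + metadata_bits / group_size


if __name__ == "__main__":
    # Hypothetical settings: the same metadata costs relatively more at 2 bits.
    for bits, group in [(2, 64), (4, 64), (2, 32)]:
        total = effective_bits_per_weight(bits, group)
        overhead = (total - bits) / bits
        print(f"{bits}-bit, group {group}: {total:.3f} bits/weight "
              f"({overhead:.1%} overhead)")
```

Under these assumptions, a nominal 2-bit format with one 16-bit scale per 64 weights costs 2.25 bits per weight, so the relative overhead grows as the bit width shrinks.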
Action-state communication boosts multi-agent efficiency
arXiv paper ‘What Should Agents Say?’ argues traditional multi-agent systems’ unconstrained natural language communication is inefficient. It proposes an action-state communication framework to structure agent interactions, enabling more efficient collaboration and reducing semantic misunderstandings.
LLM Judges Easily Manipulated: Post-Decision Interaction Skews Results
New research reveals a critical flaw in LLM-as-judge evaluation systems. Contrary to the assumption that judgments are stable, experiments show that post-decision interaction with the model can significantly alter evaluation results. This ‘post-decision interaction attack’ can lead to skewed benchmarks and unreliable assessments. The team demonstrates this vulnerability and suggests defensive measures. Developers relying on LLM evaluations should be aware of this risk.
GITCO Fixes Time Series Model Forecasting Bias
Researchers have developed GITCO to address forecasting bias in Time Series Foundation Models (TSFMs). The technique solves ‘context poisoning,’ where anomalous patches disproportionately attract model attention, degrading zero-shot forecasts. By using gated mechanisms at inference time to filter anomalous context, GITCO significantly improves prediction accuracy. Experiments show this method enhances TSFMs’ robustness in noisy data environments.
AI Fails Professional Tests: Evaluation Gap Revealed
Despite strong benchmark results, recent research shows AI systems haven’t achieved economically meaningful deployment in professional fields. The study identifies a critical flaw in current evaluation standards—tests fail to reflect AI’s capabilities in real-world work environments. The paper proposes the ‘Last Exam’ concept, advocating for more realistic evaluation frameworks that mirror actual professional scenarios.
Meta confirms Instagram hack via AI chatbot vulnerability
Meta confirmed thousands of Instagram accounts were hacked due to an AI chatbot vulnerability. Attackers bypassed security by manipulating user interactions with the bot. Meta has patched the flaw and mandated password resets for affected users.
UK police banned from using AI in court statements
UK police have been ordered to immediately halt the use of AI tools in court statements. The directive stems from concerns about the accuracy of AI-generated evidence and about legal liability, and it could affect ongoing cases. The move reflects the judicial system’s heightened vigilance on AI risks.
US House draft bill aims to preempt state AI regulations
US House lawmakers introduced a draft bill to preempt state AI regulations, centralizing AI oversight under the federal government. The controversial move faces criticism for limiting state autonomy. If passed, it would be America’s first federal AI law; a vote is expected next year.
Meta again delays developer release of new AI model
Meta has again delayed the release of its new AI model to developers, with no new timeline set. Originally slated for early this year, the model has been postponed multiple times. Delays may stem from technical challenges or competitive pressure from OpenAI’s GPT-5 progress, raising concerns about Meta’s AI execution.
HN Programmers Voice Collective Backlash Against AI Code Quality
Over the past six months, Hacker News has seen daily posts from programmers criticizing AI for ‘writing bad code,’ ‘introducing bugs,’ and ‘creating technical debt.’ This collective backlash isn’t isolated—it reflects widespread dissatisfaction in the developer community with current AI coding tools. Despite AI’s speed in generating code, concerns about code quality and maintainability in real projects are growing.
micropython-wasm 0.1a2 adds CLI tool
Developer Simon Willison released micropython-wasm 0.1a2, adding a CLI tool for running Python code in an in-browser WebAssembly sandbox. The alpha package enables secure MicroPython execution in browsers, suitable for code demos and online experiments.
MicroPython+WASM enables browser Python sandboxing
Developer Simon Willison released micropython-wasm, enabling secure Python execution in an in-browser WebAssembly sandbox. Combining WASM security isolation with MicroPython’s lightweight interpreter, the tool allows Python code to run without backend servers for education and demos.
AI Market Stable as RSI Shows Balanced Trade
The AI market remained calm today as the RSI (Relative Strength Index) demonstrated stable performance without major fluctuations or breakthroughs. As a technical indicator measuring market buying and selling power, the RSI is currently near the 50 midline, showing balanced market forces and cautious investor sentiment. Data shows the RSI fluctuated between 48-52 today, a significant narrowing from the 45-55 range of previous days, with trading activity decreasing by approximately 15%. Analysts suggest this quiet state may be building energy for future technical breakthroughs, advising investors to closely monitor relevant developments.
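For readers unfamiliar with the indicator, here is a minimal sketch of how an RSI value is computed, using the standard formula RSI = 100 - 100 / (1 + RS), where RS is the average gain divided by the average loss over the lookback period. It uses plain averages rather than Wilder's smoothing, and the function name, the sample prices, and the choice to return 50 for a completely flat series are illustrative assumptions, not details from the report.

```python
from typing import Sequence


def simple_rsi(closes: Sequence[float], period: int = 14) -> float:
    """RSI over the last `period` price changes, using plain averages.

    Assumption: simple (unsmoothed) averages of gains and losses;
    charting tools often apply Wilder's smoothing instead.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")

    window = closes[-(period + 1):]
    changes = [later - earlier for earlier, later in zip(window, window[1:])]

    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period

    if avg_loss == 0:
        # No losses: RSI saturates at 100; a flat series is treated as neutral.
        return 100.0 if avg_gain > 0 else 50.0

    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


if __name__ == "__main__":
    # Hypothetical prices whose gains and losses cancel out exactly.
    balanced = [100.0, 101.0, 100.0, 101.0, 100.0]
    print(simple_rsi(balanced, period=4))
```

When gains and losses are equal in size, RS is 1 and the RSI sits at the 50 midline, which is the "balanced" reading described above.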