
AI Research Papers

Latest papers from arXiv — updated daily across cs.AI, cs.LG, cs.CL, and cs.CV.

200 papers — updated daily from arXiv

LLMs & Language | May 20, 2026 | 2605.19010
AgentNLQ: A General-Purpose Agent for Natural Language to SQL

Olena Bogdanov, Yeunji Jung, Chandra Dhir +5 more

arXiv:2605.19010v1 Announce Type: new Abstract: Natural language to SQL (NL2SQL) conversion is an important problem for researchers and enterprises due to the ubiquitous importance of relational databases in…

Vision & Multimodal | May 20, 2026 | 2605.19042
Interference-Aware Multi-Task Unlearning

Ying-Hua Huang, Rui Fang, Hsi-Wen Chen +1 more

arXiv:2605.19042v1 Announce Type: new Abstract: Machine unlearning aims to remove the contribution of designated training data from a trained model while preserving performance on the remaining data. Existing work…

LLMs & Language | May 20, 2026 | 2605.19156
How Far Are We From True Auto-Research?

Zhengxin Zhang, Ning Wang, Sainyam Galhotra +1 more

arXiv:2605.19156v1 Announce Type: new Abstract: Recent auto-research systems can produce complete papers, but feasibility is not the same as quality, and the field still lacks a systematic study of how good…

Agents & Reasoning | May 20, 2026 | 2605.19192
Hallucination as Exploit: Evidence-Carrying Multimodal Agents

Guijia Zhang, Hao Zheng, Harry Yang

arXiv:2605.19192v1 Announce Type: new Abstract: Multimodal agents use screenshots, documents, and webpages to choose tool calls. When a false visual claim triggers a click, email, extraction, or transfer, hallucination …

LLMs & Language | May 20, 2026 | 2605.19337
Agentic Trading: When LLM Agents Meet Financial Markets

Yihan Xia, Panpan You, Taotao Wang +4 more

arXiv:2605.19337v1 Announce Type: new Abstract: A growing body of work explores how Large Language Models (LLMs) can be embedded in trading systems as agents that perceive market information, retrieve context, reason…

Agents & Reasoning | May 20, 2026 | 2605.19376
Generative Recursive Reasoning

Junyeob Baek, Mingyu Jo, Minsu Kim +3 more

arXiv:2605.19376v1 Announce Type: new Abstract: How should future neural reasoning systems implement extended computation? Recursive Reasoning Models (RRMs) offer a promising alternative to autoregressive sequence…

LLMs & Language | May 20, 2026 | 2605.19382
PRISM: A Benchmark for Programmatic Spatial-Temporal Reasoning

Qiran Zhang, Yuheng Wang, Runde Yang +9 more

arXiv:2605.19382v1 Announce Type: new Abstract: Programmatic video generation through code offers geometric precision and temporal coherence beyond pixel-level diffusion models, yet rigorously evaluating whether…

RL & Optimization | May 20, 2026 | 2605.19457
Generative Auto-Bidding with Unified Modeling and Exploration

Mingming Zhang, Feiqing Zhuang, Na Li +7 more

arXiv:2605.19457v1 Announce Type: new Abstract: Automated bidding is central to modern digital advertising. Early rule-based methods lacked adaptability, while subsequent Reinforcement Learning approaches modeled…

Agents & Reasoning | May 20, 2026 | 2605.19461
Beyond Mode Collapse: Distribution Matching for Diverse Reasoning

Xiaozhe Li, Yang Li, Xinyu Fang +10 more

arXiv:2605.19461v1 Announce Type: new Abstract: On-policy reinforcement learning methods like GRPO suffer from mode collapse: they exhibit reduced solution diversity, concentrating probability mass on a single solution …

LLMs & Language | May 20, 2026 | 2605.19518
BLINKG: A Benchmark for LLM-Integrated Knowledge Graph Generation

Carla Castedo, Enrique Iglesias, Manuel Lama +3 more

arXiv:2605.19518v1 Announce Type: new Abstract: Generating Knowledge Graphs (KGs) remains one of the most time-consuming and labor-intensive tasks for knowledge engineers, as they need to identify semantic equivalences …

General ML | May 20, 2026 | 2605.19521
Efficient Elicitation of Collective Disagreements

Mohamed Ouaguenouni, Felipe Garrido-Lucero, Umberto Grandi +2 more

arXiv:2605.19521v1 Announce Type: new Abstract: We analyze the structure of the disagreement among a population of voters over a set of alternatives. Surveys typically ask either for pairwise comparisons, simple and…

General ML | May 20, 2026 | 2605.19671
Transforming Constraint Programs to Input for Local Search

Jo Devriendt, Patrick De Causmaecker, Marc Denecker

arXiv:2605.19671v1 Announce Type: new Abstract: Applying local search algorithms to combinatorial optimization problems is not an easy feat. Typically, human intervention is required to compile the constraints to input …

LLMs & Language | May 20, 2026 | 2605.19748
Memory-Augmented Reinforcement Learning Agent for CAD Generation

Yin Xiaolong, Liu Yu, Shen Jiahang +4 more

arXiv:2605.19748v1 Announce Type: new Abstract: Automatic generation of computer-aided design (CAD) models is a core technology for enabling intelligence in advanced manufacturing. Existing generation methods based on…

General ML | May 20, 2026 | 2605.19758
CogScale: Scalable Benchmark for Sequence Processing

Yannis Bendi-Ouis (Mnemosyne), Romain de Coudenhove (ENS-PSL), Xavier Hinaut (Mnemosyne)

arXiv:2605.19758v1 Announce Type: new Abstract: The ability to maintain and manipulate information over time is a fundamental aspect of living beings and Artificial Intelligence. While modern models have achieved…

RL & Optimization | May 20, 2026 | 2605.19768
Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs

Pierre Boudart (SIERRA), Pierre Gaillard (Thoth), Alessandro Rudi (PSL +2 more

arXiv:2605.19768v1 Announce Type: new Abstract: We study reinforcement learning for episodic Markov Decision Processes (MDPs) whose transitions are modelled by a multinomial logistic (MNL) model. Existing algorithms…

Agents & Reasoning | May 20, 2026 | 2605.19769
OpenComputer: Verifiable Software Worlds for Computer-Use Agents

Jinbiao Wei, Qianran Ma, Yilun Zhao +4 more

arXiv:2605.19769v1 Announce Type: new Abstract: We present OpenComputer, a verifier-grounded framework for constructing verifiable software worlds for computer-use agents. OpenComputer integrates four components: (1)…

General ML | May 20, 2026 | 2605.19781
From SGD to Muon: Adaptive Optimization via Schatten-p Norms

Thomas Massena (IRIT, DTIPG - SNCF, UT3) +2 more

arXiv:2605.19781v1 Announce Type: new Abstract: Modern optimizers, like Muon, impose matrix-wise geometry constraints on their updates. These matrix-wise constraints can be unified under Linear Minimization Oracle…

LLMs & Language | May 20, 2026 | 2605.19943
Probabilistic Tiny Recursive Model

Amin Sghaier, Ali Parviz, Alexia Jolicoeur-Martineau

arXiv:2605.19943v1 Announce Type: new Abstract: Tiny Recursive Models (TRM) solve complex reasoning tasks with a fraction of the parameters of modern large language models (LLMs) by iteratively refining a latent state…

General ML | May 20, 2026 | 2605.20098
Neurosymbolic Learning for Inference-Time Argumentation

Gabriel Freedman, Adam Dejl, Adam Gould +4 more

arXiv:2605.20098v1 Announce Type: new Abstract: Claim verification is an important problem in high-stakes settings, including health and finance. When information underpinning claims is incomplete or conflicting,…

Agents & Reasoning | May 20, 2026 | 2605.18760
DOTRAG: Retrieval-Time Reasoning Along Paths

Larnell Moore, Naihao Deng, Rada Mihalcea +1 more

arXiv:2605.18760v1 Announce Type: cross Abstract: Graph Retrieval-Augmented Generation (GraphRAG) is dominated by a retrieve-then-reason paradigm, where context is retrieved using heuristics and then reasoned over.…

LLMs & Language | May 20, 2026 | 2605.18781
Can LLMs Emulate Human Belief Dynamics?

Adiba Mahbub Proma, Neeley Pate, James N. Druckman +3 more

arXiv:2605.18781v1 Announce Type: cross Abstract: Can LLMs simulate how humans form and change beliefs in social networks? We put this to the test by replicating an established study on belief dynamics, evaluating 12…

LLMs & Language | May 20, 2026 | 2605.18789
Features have life history. And we should care

Philipp Stecher, Sandro Radovanović, Vlasta Sikimić +1 more

arXiv:2605.18789v1 Announce Type: cross Abstract: Features in language models have life history: they emerge, persist, and die during training, yet the importance of that history remains largely unexplored. We find…

LLMs & Language | May 20, 2026 | 2605.18797
Simply Stabilizing the Loop via Fully Looped Transformer

Rao Fu, Zixuan Yang, Jiankun Zhang +4 more

arXiv:2605.18797v1 Announce Type: cross Abstract: Scaling model performance typically requires increasing model size. Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer…

LLMs & Language | May 20, 2026 | 2605.18800
Theory-optimal Quantization Based on Flatness

Xiusheng Huang, Zhe Li, Xuanwu Yin +5 more

arXiv:2605.18800v1 Announce Type: cross Abstract: Post-training quantization has emerged as a widely adopted technique for compressing and accelerating the inference of Large Language Models (LLMs). The primary…

Agents & Reasoning | May 20, 2026 | 2605.18803
PROWL: Prioritized Regret-Driven Optimization for World Model Learning

Ahmet H. Güzel, Jenny Seidenschwarz, Benjamin Graham +4 more

arXiv:2605.18803v1 Announce Type: cross Abstract: Modern action-conditioned video world models achieve strong short-horizon visual realism, yet remain unreliable on rare, interaction-critical transitions that dominate…

LLMs & Language | May 20, 2026 | 2605.18807
Block-Based Double Decoders

Asher Labovich, Benjamin Bradley, Vanessa Alexander +1 more

arXiv:2605.18807v1 Announce Type: cross Abstract: Encoder-decoder models offer substantial inference-time savings over decoder-only models, but their pretraining objectives suffer from sparse supervision and dynamic…

Agents & Reasoning | May 20, 2026 | 2605.18809
Metric-Gradient Projection for Stable Multi-Agent Policy Learning

Zuyuan Zhang, Sizhe Tang, Mahdi Imani +1 more

arXiv:2605.18809v1 Announce Type: cross Abstract: General-sum multi-agent learning is often governed by a stacked update field in which each agent's policy update changes the optimization landscape faced by the others. …

LLMs & Language | May 20, 2026 | 2605.18813
Composition of Memory Experts for Diffusion World Models

Sebastian Stapf, Pablo Acuaviva Huertos, Aram Davtyan +1 more

arXiv:2605.18813v1 Announce Type: cross Abstract: World models aim to predict plausible futures consistent with past observations, a capability central to planning and decision-making in reinforcement learning. Yet,…

LLMs & Language | May 20, 2026 | 2605.18826
The Routing and Filtering Structure of Attention

Shafayeth Jamil, Rehan Kapadia

arXiv:2605.18826v1 Announce Type: cross Abstract: The attention interaction matrix $QK^{\top}$ contains two entangled computations: a skew-symmetric component that redistributes information between positions (routing)…

LLMs & Language | May 20, 2026 | 2605.18848
Exact Linear Attention

Weinuo Ou

arXiv:2605.18848v1 Announce Type: cross Abstract: This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention by leveraging the exact…

LLMs & Language | May 20, 2026 | 2605.18864
SAGE: Shaping Anchors for Guided Exploration in RLVR of LLMs

Chanuk Lee, Minki Kang, Sung Ju Hwang

arXiv:2605.18864v1 Announce Type: cross Abstract: Recent studies observe that reinforcement learning with verifiable rewards (RLVR) reliably improves pass@1 on reasoning tasks, yet often fails to yield comparable gains …

LLMs & Language | May 20, 2026 | 2605.18869
MO-CAPO: Multi-Objective Cost-Aware Prompt Optimization

Jan Büssing, Moritz Schlager, Timo Heiß +2 more

arXiv:2605.18869v1 Announce Type: cross Abstract: Large language models (LLMs) achieve strong performance across a wide range of tasks but are highly sensitive to prompt design, motivating the need for automatic prompt …

General ML | May 20, 2026 | 2605.18889
Soft Learning

Mohammed Aledhari, Ali Aledhari, Fatimah Aledhari +1 more

arXiv:2605.18889v1 Announce Type: cross Abstract: Modern machine learning forces practitioners to choose between powerful but expensive deep networks and fast but limited classical algorithms. Here we introduce Soft…

LLMs & Language | May 20, 2026 | 2605.18904
Dynamic Model Merging Made Slim

Guodong Du, Wanyu Lin

arXiv:2605.18904v1 Announce Type: cross Abstract: Model merging enables the reuse of fine-tuned models without joint training or access to original data. Dynamic merging further improves flexibility by selectively…

General ML | May 20, 2026 | 2605.18907
Lightweight and Fast Backdoor Model Detection

Yinbo Yu, Jing Fang, Xuewen Zhang +4 more

arXiv:2605.18907v1 Announce Type: cross Abstract: Deep neural networks (DNN), despite their remarkable performance, are highly vulnerable to backdoor attacks. Existing defenses mainly rely on activation anomaly…

General ML | May 20, 2026 | 2605.18974
Harnessing Self-Supervised Features for Art Classification

Federico Melis, Davide Bilardello, Emanuele Prato +2 more

arXiv:2605.18974v1 Announce Type: cross Abstract: Classifying artworks presents a significant challenge due to the complex interplay of fine-grained details and abstract features that condition the style or genre of an …

Agents & Reasoning | May 20, 2026 | 2605.18991
Agent Security is a Systems Problem

Mihai Christodorescu, Earlence Fernandes, Ashish Hooda +11 more

arXiv:2605.18991v1 Announce Type: cross Abstract: We take the position that agent security must be approached as a systems problem: the AI model powering the agent must be treated as an untrusted component, and…

LLMs & Language | May 20, 2026 | 2605.18993
Distilling Linearized Behavior for Effective Task Arithmetic

Thomas Sommariva, Francesca Morandi, Simone Calderara +1 more

arXiv:2605.18993v1 Announce Type: cross Abstract: Task vector composition has emerged as a promising paradigm for editing pre-trained models, enabling model merging through addition and unlearning through subtraction.…

Efficiency & Systems | May 20, 2026 | 2605.19049
KVBuffer: IO-aware Serving for Linear Attention

Longwei Zou, Lin Zhong

arXiv:2605.19049v1 Announce Type: cross Abstract: Linear attention has recently gained significant attention for long-context inference due to its constant decoding cost with respect to context length. However,…

General ML | May 20, 2026 | 2605.19073
Riemannian Networks over Full-Rank Correlation Matrices

Ziheng Chen, Xiaojun Wu, Bernhard Schölkopf +1 more

arXiv:2605.19073v1 Announce Type: cross Abstract: Representations on the Symmetric Positive Definite (SPD) manifold have garnered significant attention across different applications. In contrast, the manifold of…

LLMs & Language | May 20, 2026 | 2605.19141
GRASP: Deterministic argument ranking in interaction graphs

Diganta Misra, Antonio Orvieto, Rediet Abebe +1 more

arXiv:2605.19141v1 Announce Type: cross Abstract: Large language models are increasingly deployed as automated judges to evaluate the strength of arguments. As this role expands, their legitimacy depends on…

General ML | May 20, 2026 | 2605.19150
Flash PD-SSM: Memory-Optimized Structured Sparse State-Space Models

Aleksandar Terzić, Francesco Carzaniga, Nicolas Menet +4 more

arXiv:2605.19150v1 Announce Type: cross Abstract: State-space models (SSMs) face a fundamental trade-off between efficiency and expressivity that is mainly dictated by the structure of the model's transition matrix.…
