While Large Language Model (LLM) agents show promise in automated trading, they still face critical limitations. Prominent multi-agent frameworks are often inefficient, produce inconsistent signals, and lack the end-to-end optimization required to learn a coherent strategy from market feedback. To address these limitations, we introduce AlphaQuanter, a single-agent framework that uses reinforcement learning (RL) to learn a dynamic policy over a transparent, tool-augmented decision workflow. This design lets a single agent autonomously orchestrate tools and proactively acquire information on demand, yielding an auditable reasoning process. Extensive experiments demonstrate that AlphaQuanter achieves state-of-the-art performance on key financial metrics. Moreover, its interpretable reasoning reveals sophisticated strategies, offering novel and valuable insights for human traders.
Key Observations
The training dynamics reveal that while both models learn, the 7B model enters a sophisticated policy refinement phase, whereas the 3B model converges prematurely to a simplistic strategy.
Validation performance confirms the 7B model's superior generalization: it not only improves returns but also learns to manage downside risk effectively, as shown by a decreasing maximum drawdown (MDD), a crucial capability the 3B model fails to acquire.
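For readers unfamiliar with the MDD metric, the following is a minimal sketch of the standard maximum-drawdown calculation: the largest peak-to-trough decline of an equity curve, expressed as a fraction of the peak. The function name `max_drawdown` and the sample curve are illustrative assumptions, not taken from the AlphaQuanter codebase or its results.

```python
def max_drawdown(equity):
    """Return the maximum drawdown of an equity curve as a fraction in [0, 1].

    `equity` is a non-empty sequence of positive portfolio values over time.
    """
    peak = equity[0]
    mdd = 0.0
    for value in equity:
        # Track the highest value seen so far.
        peak = max(peak, value)
        # Drawdown at this step is the relative drop from that running peak.
        mdd = max(mdd, (peak - value) / peak)
    return mdd


# Hypothetical equity curve, for illustration only.
curve = [100.0, 120.0, 90.0, 110.0, 130.0, 104.0]
print(max_drawdown(curve))  # 0.25: the drop from 120 to 90
```

A lower MDD means the strategy lost less from its running peak at its worst point, which is why a decreasing MDD is read as improved downside-risk management.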
Evolution of tool-selection strategies for AlphaQuanter-3B and -7B during training, where heatmap intensity indicates percentile-based reliance on four data sources ([M], [S], [X], [F]).
Key Observations
Our ablation study validates the critical contribution of each reward component and shows that the agent's overall performance and strategic behavior are highly sensitive to the decision threshold (θ).
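To give a feel for why a decision threshold can shift behavior so strongly, here is a toy sketch. It assumes, purely for illustration, that θ is a symmetric cutoff on a scalar signal: values above θ map to BUY, values below −θ map to SELL, and everything in between maps to HOLD. This summary does not specify how θ enters AlphaQuanter's workflow or reward, so the rule `decide`, the helper `hold_count`, and the scores below are all hypothetical.

```python
def decide(score, theta):
    """Map a scalar signal to an action using a symmetric threshold (assumed rule)."""
    if score > theta:
        return "BUY"
    if score < -theta:
        return "SELL"
    return "HOLD"


def hold_count(scores, theta):
    """Count how many signals fall inside the no-trade band [-theta, theta]."""
    return sum(1 for s in scores if decide(s, theta) == "HOLD")


# Hypothetical signals, for illustration only.
scores = [-0.8, -0.3, -0.1, 0.05, 0.2, 0.6]
for theta in (0.0, 0.25, 0.5):
    print(theta, hold_count(scores, theta))
```

Under this assumed rule, widening the band from θ = 0.0 to θ = 0.5 moves the toy agent from trading on every signal to holding on most of them, which is the kind of behavioral shift a threshold ablation would probe.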
@misc{deng2025alphaquanterendtoendtoolorchestratedagentic,
title={AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading},
author={Zheye Deng and Jiashu Wang},
year={2025},
eprint={2510.14264},
archivePrefix={arXiv},
primaryClass={cs.CE},
url={https://arxiv.org/abs/2510.14264},
}