Expert Systems with Applications, 2026 (SCI-Expanded, Scopus)
Financial markets contain statistically detectable patterns whose economic exploitability remains uncertain. This study introduces the Learnability Threshold: the boundary beyond which detectable patterns cannot yield positive net-of-cost returns for AI agents. It compares a rule-based heuristic with deep reinforcement learning (DRL) agents based on Proximal Policy Optimization (PPO), trained both tabula rasa and via imitation, in simulated markets with transaction costs. To ensure robustness, aligned-path evaluations are conducted and richer observation spaces are tested. The results show that the DRL agents consistently fail to exploit long-memory dynamics, converging to inactivity or loss-making behavior, whereas the heuristic delivers stable risk-adjusted returns. The findings formally distinguish statistical detectability from economic exploitability and reposition DRL as a diagnostic decision-support tool for identifying unexploitable market regimes rather than a standalone profit engine.
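The gap between a detectable pattern and a positive net-of-cost return can be illustrated with a minimal Python sketch. This is not the paper's formal definition of the Learnability Threshold. The function names, the proportional cost model (a fee charged on each unit of position change), and the toy return series are all assumptions made for illustration.

```python
def net_of_cost_returns(returns, positions, cost_per_unit):
    """Per-step P&L after costs (hypothetical cost model).

    positions[t] is the position held while returns[t] is realised;
    cost_per_unit is charged on the absolute change in position.
    """
    net = []
    prev = 0.0  # assume the agent starts flat
    for r, p in zip(returns, positions):
        turnover = abs(p - prev)
        net.append(p * r - cost_per_unit * turnover)
        prev = p
    return net


def is_exploitable(returns, positions, cost_per_unit):
    """True if the strategy's total net-of-cost return is positive."""
    return sum(net_of_cost_returns(returns, positions, cost_per_unit)) > 0


def break_even_cost(returns, positions):
    """Cost level at which total net return is zero (None if no trading)."""
    gross = sum(p * r for r, p in zip(returns, positions))
    prev, turnover = 0.0, 0.0
    for p in positions:
        turnover += abs(p - prev)
        prev = p
    return gross / turnover if turnover > 0 else None


# Toy example: a perfectly predictable alternating return series and a
# strategy that always holds the correct sign (illustrative data only).
toy_returns = [0.01, -0.01, 0.01, -0.01]
toy_positions = [1.0, -1.0, 1.0, -1.0]
```

In this toy series the pattern is fully detectable and the gross return is positive, yet `is_exploitable` flips from true to false once `cost_per_unit` exceeds the value returned by `break_even_cost`, because the strategy's turnover grows faster than its gross gains. The same separation between detectability and exploitability is what the study examines for long-memory dynamics.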