In this discussion, we focus on how to break the glass ceiling imposed by classical machine learning techniques in finance. It appears that the greatest limitation on the value we can extract from statistical models lies not in the models themselves, nor in the data or the complexity of the algorithms, but in the methodology we use to apply them. In other words, the true bottleneck may be how we employ the model, not the model’s intrinsic capability.
Applying machine learning and other modern statistical techniques to algorithmic trading is uniquely challenging. Many of the problems our community faces are specific to financial markets and, because of that, they are rarely discussed in broader machine learning circles. As a result, classical supervised learning offers very little practical guidance on the issues that matter to us. One of the most overlooked of these issues is that, when modeling financial data, we do not have a fixed target. This might not seem problematic at first, but it is.
To illustrate, consider how these models are applied in medicine. Medicine is one of the domains in which many supervised learning techniques were first developed and applied, and our community is "borrowing" these techniques. In medicine, the target variable is well-defined. A doctor might want to classify a patient as either having cancer or not: a binary classification problem with a clear and immutable label. The doctor’s objective never changes, and the target is grounded in physical reality. Moreover, medical models operate within natural constraints (biological, ethical, or procedural) that give the learning problem a consistent structure.
In contrast, the financial domain lacks such structure. As algorithmic traders, we have no fixed definition of the target. We can model the market in terms of annual returns, daily returns, 15-minute returns, annual price appreciation, maximum drawdowns, volatility, or even the relative movement between assets. There are, in fact, infinitely many ways to define what the “target” means in a trading context. Although these targets are all derived from the same underlying data, some are far more difficult to forecast than others.
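To make the point concrete, here is a minimal Python sketch of how a single price series can produce several different targets. It is an illustration under stated assumptions, not the method used in the article: the function names (`forward_return`, `forward_volatility`, `forward_max_drawdown`), the horizons, and the closing prices are all hypothetical and were chosen only for this example.

```python
from statistics import pstdev


def forward_return(prices, horizon):
    """Simple return from bar t to bar t + horizon; None where the future is unknown."""
    n = len(prices)
    return [
        prices[t + horizon] / prices[t] - 1.0 if t + horizon < n else None
        for t in range(n)
    ]


def forward_volatility(prices, horizon):
    """Population std. dev. of the one-bar returns over the next `horizon` bars."""
    n = len(prices)
    one_bar = [prices[t + 1] / prices[t] - 1.0 for t in range(n - 1)]
    return [
        pstdev(one_bar[t:t + horizon]) if t + horizon < n else None
        for t in range(n)
    ]


def forward_max_drawdown(prices, horizon):
    """Largest peak-to-trough decline inside the window [t, t + horizon]."""
    n = len(prices)
    out = []
    for t in range(n):
        if t + horizon >= n:
            out.append(None)  # not enough future bars to label this row
            continue
        peak, worst = prices[t], 0.0
        for p in prices[t:t + horizon + 1]:
            peak = max(peak, p)
            worst = min(worst, p / peak - 1.0)
        out.append(worst)
    return out


# Hypothetical closing prices, made up purely for illustration.
closes = [100.0, 102.0, 101.0, 105.0, 103.0, 108.0]

# Four different "targets", all derived from the same six prices.
targets = {
    "return_1": forward_return(closes, 1),
    "return_3": forward_return(closes, 3),
    "volatility_3": forward_volatility(closes, 3),
    "max_drawdown_3": forward_max_drawdown(closes, 3),
}

for name, values in targets.items():
    print(name, values)
```

Each list in `targets` is a candidate label column for the same rows of inputs. A supervised model trained on one of them is solving a different problem from a model trained on another, even though the underlying data never changed.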
Check out the new article: Reimagining Classic Strategies (Part 17): Modelling Technical Indicators.
Author: Gamuchirai Zororo Ndawana