Adaptive Deep Learning in Quant Finance with Qlib’s PyTorch AdaRNN
Introduction
AdaRNN is a specialized PyTorch model designed to adaptively learn from non-stationary financial time series—where market distributions evolve over time. Originally proposed in the paper AdaRNN: Adaptive Learning and Forecasting for Time Series, it leverages both GRU layers and transfer-loss techniques to mitigate the effects of distributional shift. This article demonstrates how AdaRNN can be applied within Microsoft’s Qlib—an open-source, AI-oriented platform for quantitative finance.
Background: AdaRNN Methodology
- Temporal Covariate Shift (TCS): market factor distributions can drift drastically away from the historical data a model was trained on. AdaRNN addresses TCS by adaptively aligning representations over time.
- Two-Phase Design:
  - Temporal Distribution Characterization: splits the training history into maximally dissimilar periods to better capture distribution information in time-series data.
  - Temporal Distribution Matching: bridges the gap between older and newer data via distribution alignment losses (e.g., MMD, CORAL, cosine), as sketched below.
AdaRNN’s adaptability makes it relevant for financial forecasting, air-quality prediction, and activity recognition—any scenario where non-stationary data complicates model training.
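To make the matching phase concrete, here is a minimal PyTorch sketch of a linear-kernel MMD loss between hidden states drawn from two market segments. This is an illustrative stand-in for the alignment losses AdaRNN supports, not the paper's exact implementation:

import torch

def mmd_linear(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear-kernel MMD between two (batch, hidden) feature matrices.

    Minimizing this pulls the two feature distributions together.
    """
    delta = x.mean(dim=0) - y.mean(dim=0)  # difference of batch means
    return delta.dot(delta)                # squared L2 norm of the gap

# Example: align GRU features from an older and a newer regime (random stand-ins).
old_regime = torch.randn(128, 64)
new_regime = torch.randn(128, 64)
alignment_loss = mmd_linear(old_regime, new_regime)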
Qlib Integration
Below is an excerpt from a Qlib-friendly YAML configuration. By running one command, Qlib will:
- Initialize a US market environment (^GSPC as benchmark).
- Load and transform data (e.g., Alpha360) with specialized normalization and label dropping.
- Train a custom or placeholder PyTorch “AdaRNN” model.
- Evaluate predictions via correlation metrics and backtesting.
- Generate logs for advanced debugging and iteration.
Command
qrun workflow_config_adarnn_Alpha360.yaml
YAML Snippet
qlib_init:
    provider_uri: "/Users/vadimnicolai/Public/work/qlib-cookbook/.data/us_data"
    region: us
    kernels: 1
market: &market sp500
benchmark: &benchmark ^GSPC
data_handler_config: &data_handler_config
    start_time: 2008-01-01
    end_time: 2020-08-01
    fit_start_time: 2008-01-01
    fit_end_time: 2014-12-31
    instruments: *market
    infer_processors:
        - class: RobustZScoreNorm
          kwargs:
              fields_group: feature
              clip_outlier: true
        - class: Fillna
          kwargs:
              fields_group: feature
    learn_processors:
        - class: DropnaLabel
        - class: CSRankNorm
          kwargs:
              fields_group: label
    label: ["Ref($close, -2) / Ref($close, -1) - 1"]
port_analysis_config: &port_analysis_config
    strategy:
        class: TopkDropoutStrategy
        module_path: qlib.contrib.strategy
        kwargs:
            signal: <PRED>
            topk: 50
            n_drop: 5
    backtest:
        start_time: 2017-01-01
        end_time: 2020-08-01
        account: 100000000
        benchmark: *benchmark
        exchange_kwargs:
            limit_threshold: 0.095
            deal_price: close
            open_cost: 0.0005
            close_cost: 0.0015
            min_cost: 5
task:
    model:
        # Demonstration of AdaRNN or a placeholder PyTorch model
        class: DNNModelPytorch
        module_path: qlib.contrib.model.pytorch_nn
        kwargs:
            batch_size: 1024
            max_steps: 4000
            loss: mse
            lr: 0.002
            optimizer: adam
            GPU: 0
            pt_model_kwargs:
                input_dim: 360
    dataset:
        class: DatasetH
        module_path: qlib.data.dataset
        kwargs:
            handler:
                class: Alpha360
                module_path: qlib.contrib.data.handler
                kwargs: *data_handler_config
            segments:
                train: [2008-01-01, 2014-12-31]
                valid: [2015-01-01, 2016-12-31]
                test: [2017-01-01, 2020-08-01]
    record:
        - class: SignalRecord
          module_path: qlib.workflow.record_temp
          kwargs:
              model: <MODEL>
              dataset: <DATASET>
        - class: SigAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs:
              ana_long_short: False
              ann_scaler: 252
        - class: PortAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs:
              config: *port_analysis_config
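The same workflow can also be driven programmatically. The sketch below mirrors the YAML via Qlib's init_instance_by_config and swaps in the ADARNN class from qlib.contrib.model.pytorch_adarnn (the class behind the logs shown next); hyperparameters follow those logs, and the data path is the one from the config:

import qlib
from qlib.utils import init_instance_by_config

# Point Qlib at the same US data bundle used in the YAML config.
qlib.init(
    provider_uri="/Users/vadimnicolai/Public/work/qlib-cookbook/.data/us_data",
    region="us",
)

# Dataset: Alpha360 handler plus the train/valid/test segments from the YAML.
dataset = init_instance_by_config({
    "class": "DatasetH",
    "module_path": "qlib.data.dataset",
    "kwargs": {
        "handler": {
            "class": "Alpha360",
            "module_path": "qlib.contrib.data.handler",
            "kwargs": {
                "start_time": "2008-01-01",
                "end_time": "2020-08-01",
                "fit_start_time": "2008-01-01",
                "fit_end_time": "2014-12-31",
                "instruments": "sp500",
            },
        },
        "segments": {
            "train": ("2008-01-01", "2014-12-31"),
            "valid": ("2015-01-01", "2016-12-31"),
            "test": ("2017-01-01", "2020-08-01"),
        },
    },
})

# Model: Qlib's ADARNN, configured to match the log output below.
model = init_instance_by_config({
    "class": "ADARNN",
    "module_path": "qlib.contrib.model.pytorch_adarnn",
    "kwargs": {"d_feat": 6, "hidden_size": 64, "num_layers": 2, "GPU": 0},
})

model.fit(dataset)                     # early-stops on the valid segment
predictions = model.predict(dataset)   # scores the test segment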
Example Logs & Observations
When you run:
qrun workflow_config_adarnn_Alpha360.yaml
You may see logs such as:
[50515:MainThread](2024-12-23 19:23:44,889) INFO - qlib.ADARNN - ADARNN pytorch version...
d_feat : 6
hidden_size : 64
num_layers : 2
dropout : 0.0
n_epochs : 200
lr : 0.001
metric : loss
batch_size : 800
early_stop : 20
optimizer : adam
loss_type : mse
...
Epoch0: training...
ic/train: 0.016603, mse/train: 0.010481, ...
ic/valid: 0.007023, mse/valid: 0.013398, ...
Epoch1: training...
ic/train: 0.017488, ...
ic/valid: 0.007711, ...
Key Points:
- d_feat=6 indicates the model uses 6 features per time step (Alpha360 can supply ~360; some examples use 6 for a simpler demonstration).
- AdaRNN attempts to adapt to distribution shifts with a specialized gating mechanism, plus distance-based alignment (e.g., MMD, cosine).
- Low or moderate ic (information coefficient) values in early epochs typically mean your features or hyperparameters need further tuning; a sketch of the IC computation follows this list.
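For reference, Qlib's ic/train and ic/valid figures are cross-sectional correlations between predictions and realized labels. A pandas sketch of that computation, assuming a (datetime, instrument) MultiIndex as in Qlib's prediction output:

import pandas as pd

def daily_ic(pred: pd.Series, label: pd.Series) -> pd.Series:
    """Per-day information coefficient: the cross-sectional Pearson
    correlation between predictions and labels for each trading day."""
    df = pd.concat({"pred": pred, "label": label}, axis=1).dropna()
    return df.groupby(level="datetime").apply(
        lambda day: day["pred"].corr(day["label"])
    )

# Aggregate into a single score, e.g. daily_ic(pred, label).mean()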
Practical Notes
- Setting GPU: 0 targets the first CUDA device and falls back to CPU when CUDA is unavailable, which is suitable for debugging.
- If distribution shift is severe, consider 'adv', 'mmd_rbf', or 'cosine' loss types in the AdaRNN code base to better handle changing market regimes (see the sketch after this list).
- Check memory usage and concurrency settings (kernels: 1) if you encounter long training times or segmentation faults.
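A minimal sketch of swapping the alignment distance, assuming the TransferLoss helper that ships with Qlib's pytorch_adarnn module (verify the exact signature against your installed version):

import torch
from qlib.contrib.model.pytorch_adarnn import TransferLoss

# Hidden states from an older and a newer market segment (random stand-ins).
hidden_old = torch.randn(32, 64)
hidden_new = torch.randn(32, 64)

# loss_type options in the AdaRNN code base include "cosine", "mmd_rbf",
# and "adv"; input_dim must match the hidden feature width.
criterion = TransferLoss(loss_type="mmd_rbf", input_dim=64)
alignment_loss = criterion.compute(hidden_old, hidden_new)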
Data & Requirements
In AdaRNN’s original repository, the authors note:
- The reference experiments use an air-quality dataset, but the technique extends well to finance.
- Python >= 3.7 and PyTorch ~1.6 are recommended for best results.
- Ensure your Qlib environment has matching dependencies (e.g., requirements.txt pinned versions) to avoid conflicts.
Conclusion
AdaRNN’s adaptive architecture is particularly suited to non-stationary financial data, bridging older regimes to modern ones via distribution matching. Within Qlib:
- You unify data ingestion, factor engineering, neural training, and evaluation into one YAML.
- You easily measure correlation (IC, Rank IC) and simulate real trading costs via PortAnaRecord.
- You can adapt AdaRNN’s gating, hidden layers, or distribution distances for maximum alpha discovery in shifting markets.
Next Steps:
- Try different distribution losses (MMD, CORAL, 'adv') in AdaRNN to see which best handles your market regime changes.
- Combine fundamental, sentiment, or alternative data to strengthen “temporal distribution characterization.”
- Evaluate the final portfolio PnL and IR thoroughly in Qlib’s logs, adjusting hyperparameters (dw, pre_epoch, hidden_size) for improved adaptation.
References
- Du, Yuntao, et al. “AdaRNN: Adaptive Learning and Forecasting for Time Series.” Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM), 2021.
- AdaRNN GitHub Repo
- Qlib on GitHub
Pro Tip: If you see low IC or negative alpha, consider rolling or incremental re-training. Fine-tuning AdaRNN’s distribution alignment parameters can be crucial for dealing with abrupt financial market changes.
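A hedged sketch of what rolling re-training can look like: regenerate the dataset segments per window and refit, keeping each window's out-of-sample predictions. The window sizes below are illustrative, not a recommendation:

import pandas as pd

# Roll forward one year at a time: 4 years of training, 1 year of testing.
for train_start in pd.date_range("2008-01-01", "2015-01-01", freq="YS"):
    train_end = train_start + pd.DateOffset(years=4) - pd.DateOffset(days=1)
    test_start = train_end + pd.DateOffset(days=1)
    test_end = test_start + pd.DateOffset(years=1) - pd.DateOffset(days=1)
    segments = {
        "train": (str(train_start.date()), str(train_end.date())),
        "test": (str(test_start.date()), str(test_end.date())),
    }
    # Rebuild the dataset with these segments and refit (see the earlier
    # programmatic sketch), then stitch together each window's predictions.
    print(segments)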