The Reversed Time
Double Machine Learning (DML) separates nuisance estimation from causal parameter estimation using cross-fitting: split the data in half, estimate the nuisance function on one half, evaluate the causal parameter on the other. This prevents overfitting from contaminating the causal estimate. In cross-sectional data, where observations are exchangeable, the split is arbitrary — any random partition works.
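The mechanics can be sketched on synthetic cross-sectional data. This is a minimal illustration, not the estimator from any particular paper: it assumes a partially linear model Y = θ·D + g(X) + ε with a single confounder, uses a Nadaraya-Watson kernel smoother as the nuisance learner, and recovers θ by residual-on-residual regression across two random folds.

```python
import numpy as np

def nw_fit_predict(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson kernel regression: fit on one fold, predict on the other."""
    # Gaussian kernel weights between every evaluation point and every training point
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return w @ y_train / w.sum(axis=1)

def dml_cross_fit(y, d, x, bandwidth=0.1, seed=0):
    """Two-fold DML for the partially linear model Y = theta*D + g(X) + eps."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))          # random split: fine for exchangeable data
    folds = np.array_split(idx, 2)
    num, den = 0.0, 0.0
    for k in (0, 1):
        tr, ev = folds[1 - k], folds[k]
        # Residualize Y and D on one fold using nuisances fit on the OTHER fold
        y_res = y[ev] - nw_fit_predict(x[tr], y[tr], x[ev], bandwidth)
        d_res = d[ev] - nw_fit_predict(x[tr], d[tr], x[ev], bandwidth)
        num += d_res @ y_res
        den += d_res @ d_res
    return num / den                        # residual-on-residual estimate of theta

# Synthetic confounded data: true theta = 2, nonlinear confounding through sin(3x)
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 2000)
d = np.sin(3 * x) + rng.normal(0, 0.5, 2000)
y = 2.0 * d + np.sin(3 * x) + rng.normal(0, 0.5, 2000)
theta = dml_cross_fit(y, d, x)
print(round(theta, 2))
```

A naive regression of `y` on `d` here would be biased upward by the shared dependence on `x`; the cross-fitted residualization removes it.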
Time series are not exchangeable. The present depends on the past. A random split scatters dependent observations across folds, destroying the temporal structure that makes the data informative. The standard DML framework doesn't apply.
Reverse cross-fitting exploits time reversibility: a stationary Gaussian process (and more generally, any time-reversible stationary process) run backward has the same joint distribution as the process run forward. This means you can train the nuisance model on the reversed series and evaluate it on the forward series, or vice versa, without introducing the temporal contamination that makes naive splitting fail.
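The idea can be demonstrated on a stationary Gaussian AR(1), which is time-reversible. This sketch is illustrative, with parameters chosen for the demo: a kernel estimate of the one-step conditional mean is trained entirely on transitions from the reversed series, then evaluated on the forward series, where its residuals should behave like the (white) innovations.

```python
import numpy as np

def nw_predict(x_train, y_train, x_eval, bandwidth=0.4):
    """Gaussian-kernel (Nadaraya-Watson) estimate of E[y | x]."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return w @ y_train / w.sum(axis=1)

rng = np.random.default_rng(0)
n, phi = 1500, 0.7
# Stationary Gaussian AR(1): reversible, so y[::-1] has the same law as y
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

# Training transitions taken from the REVERSED series
y_rev = y[::-1]
lag_tr, tgt_tr = y_rev[:-1], y_rev[1:]

# Evaluation transitions from the FORWARD series
lag_ev, tgt_ev = y[:-1], y[1:]

# The backward-trained nuisance predicts forward one-step means without refitting,
# because reversibility makes the backward and forward transition laws identical
mu_hat = nw_predict(lag_tr, tgt_tr, lag_ev)
resid = tgt_ev - mu_hat
corr = np.corrcoef(resid[1:], resid[:-1])[0, 1]
print(round(corr, 2))  # residual autocorrelation: approximately white
```

The training and evaluation sets contain the same observations, but the estimator only ever sees backward transitions during training; the forward evaluation is the "other half" of the split.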
The technique identifies a “Goldilocks zone” of tuning parameters: the bandwidth of the nuisance estimator must be large enough to average over noise but small enough to track the signal. Within this zone, reverse cross-fitting eliminates the small-sample bias that plagues DML in macroeconomic applications where the sample size is a few hundred quarters, not a few hundred thousand observations.
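The bandwidth trade-off behind the Goldilocks zone can be seen directly in a small-sample simulation. The numbers below are illustrative assumptions, not values from the source: a few hundred observations, a sinusoidal signal, and three bandwidths that undersmooth, sit in the zone, and oversmooth respectively.

```python
import numpy as np

def nw_predict(x_train, y_train, x_eval, bandwidth):
    """Gaussian-kernel regression estimate of E[y | x]."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return w @ y_train / w.sum(axis=1)

rng = np.random.default_rng(2)
n = 400                                   # small sample, as in macro applications
x = rng.uniform(-1, 1, n)
g = np.sin(4 * x)                         # the signal the nuisance must track
y = g + rng.normal(0, 1.0, n)             # noisy observations of the signal

# Hold out half the sample; score each bandwidth against the true signal
x_tr, y_tr, x_ev, g_ev = x[:200], y[:200], x[200:], g[200:]
mses = {}
for h in (0.02, 0.15, 2.0):               # too narrow, Goldilocks, too wide
    mses[h] = np.mean((nw_predict(x_tr, y_tr, x_ev, h) - g_ev) ** 2)
    print(f"h={h}: mse={mses[h]:.3f}")
```

The narrow bandwidth fits noise (high variance), the wide one averages the signal away (high bias); only the middle one tracks the signal while averaging over noise, which is the zone in which the bias correction works.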
The through-claim: the time-reversibility of stationary processes is not a mathematical curiosity but a computational resource. Running time backward creates the independence between training and evaluation sets that cross-fitting normally achieves by spatial partitioning. The reversed series is a different “split” of the same data — different enough to prevent overfitting, similar enough to preserve the distributional structure. Time reversal does the work that randomization can’t.