Exercise - Forecasting SPY#

In this exercise, we build forecast regressions for SPY returns using multiple valuation signals. We then convert these forecasts into investment strategies and evaluate their performance.

A key focus is the distinction between in-sample and out-of-sample forecasting. In-sample results are biased because the model has seen the data it is predicting. Out-of-sample results use only information available at the time of the forecast, giving a more realistic assessment.

Data#

Use the data in data/spy_forecasting_data.xlsx.

  • spy returns - monthly total returns for SPY

  • risk-free rate - monthly risk-free rate (T3M)

  • signals - valuation signals: DP, EP, CAPE, BP, T10YR, CREDIT, VIX

The frequency is monthly (FREQ = 12).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.linear_model import LinearRegression

from cmds.portfolio import performanceMetrics, tailMetrics, get_ols_metrics
FILEPATH = '../data/spy_forecasting_data.xlsx'
FREQ = 12

SHEET_SPY = 'spy returns'
rets = pd.read_excel(FILEPATH, sheet_name=SHEET_SPY)
rets.set_index('date', inplace=True)

SHEET_RF = 'risk-free rate'
rf = pd.read_excel(FILEPATH, sheet_name=SHEET_RF)
rf.set_index('date', inplace=True)

SHEET_SIGNALS = 'signals'
sigs = pd.read_excel(FILEPATH, sheet_name=SHEET_SIGNALS)
sigs.set_index('date', inplace=True)

display(sigs.tail(3).style.format('{:,.2f}').format_index('{:%Y-%m-%d}'))

Timing Convention#

We estimate a forecast regression:

\[r_{t+1} = \alpha + \boldsymbol{\beta}' \boldsymbol{x}_t + \epsilon_{t+1}\]

The signals at time \(t\) predict returns at \(t+1\). To implement this in pandas:

  • Shift the signals forward one period with .shift(1); the signal observed at time \(t\) then carries the timestamp \(t+1\), creating the lagged version.

  • Align the lagged signals with the returns; pandas date-matching then pairs \(\boldsymbol{x}_t\) with \(r_{t+1}\).

Date each forecast by the period being forecasted, not the period in which it was computed.
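The lag-and-align step can be sketched on toy data (the series here are hypothetical stand-ins for the SPY returns and one signal; only the .shift(1) / alignment mechanics matter):

```python
import pandas as pd

# Toy stand-ins for the SPY returns and a single signal (DP).
idx = pd.date_range('2020-01-01', periods=5, freq='MS')
spy = pd.Series([0.01, -0.02, 0.03, 0.00, 0.01], index=idx, name='SPY')
dp = pd.Series([3.1, 3.0, 3.2, 3.3, 3.1], index=idx, name='DP')

# .shift(1) moves each signal value to the next date, so the signal observed
# at t carries the timestamp t+1 -- the date of the return it predicts.
dp_lag = dp.shift(1).dropna()

# Date-based alignment now pairs x_t with r_{t+1} automatically.
aligned = pd.concat([spy, dp_lag], axis=1).dropna()
```

The first row of `aligned` pairs the return in the second month with the signal observed in the first, which is exactly the timing the regression requires.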

1. In-Sample Forecast Regressions#

1.1#

Create a lagged version of the signals and align it with SPY returns.

Build in-sample forecasts using the following models:

  • Mean - the expanding mean of SPY returns (shifted one period)

  • DP - univariate regression on Dividend-Price ratio

  • EP - univariate regression on Earnings-Price ratio

  • ALL - multivariate regression on all available signals

Plot the forecasted returns over time for all models.
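One way the four in-sample forecasts might be assembled is sketched below. The data here is synthetic and the names `sigs_lag` and `spy` are assumptions standing in for your aligned signals and returns; the pattern of an expanding-mean benchmark plus fitted regressions is the point:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the aligned lagged signals and SPY returns.
rng = np.random.default_rng(0)
idx = pd.date_range('2000-01-01', periods=120, freq='MS')
sigs_lag = pd.DataFrame(rng.normal(size=(120, 3)), index=idx,
                        columns=['DP', 'EP', 'CAPE'])
spy = pd.Series(0.05 * sigs_lag['DP'] + rng.normal(scale=0.04, size=120),
                index=idx, name='SPY')

forecasts = pd.DataFrame(index=idx)
# Expanding-mean benchmark, shifted so only past returns enter each forecast.
forecasts['Mean'] = spy.expanding().mean().shift(1)

# Univariate regressions (DP, EP) and the multivariate 'ALL' model,
# each fit on the full sample -- hence "in-sample".
for cols, name in [(['DP'], 'DP'), (['EP'], 'EP'),
                   (list(sigs_lag.columns), 'ALL')]:
    model = LinearRegression().fit(sigs_lag[cols], spy)
    forecasts[name] = model.predict(sigs_lag[cols])
```

With your actual data, `forecasts.plot()` then gives the requested time-series plot of all four models.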

1.2#

Build investment strategies from the forecasts using:

\[w_t = 100 \cdot \hat{r}_{t+1}\]

The strategy return is:

\[r^{\text{strategy}}_{t+1} = w_t \cdot r^{\text{SPY}}_{t+1}\]

Also include a Passive strategy that simply holds SPY (\(w_t = 1\)).

Report:

  • Annualized performance metrics (mean, vol, Sharpe) for all strategies

  • Correlation matrix of strategy returns

  • Cumulative return plot (log scale)
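Because each forecast shares the timestamp of the return it predicts, the strategy return is a simple elementwise product. A minimal sketch on toy numbers (the forecast values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy forecasts and realized returns, dated by the period being forecasted.
idx = pd.date_range('2020-01-01', periods=4, freq='MS')
spy = pd.Series([0.02, -0.01, 0.03, 0.01], index=idx, name='SPY')
fcst = pd.DataFrame({'DP': [0.010, 0.005, -0.002, 0.008]}, index=idx)

# w_t = 100 * forecast, so the strategy return is 100 * forecast * realized.
strats = 100 * fcst.multiply(spy, axis=0)
strats['Passive'] = spy  # w_t = 1

# Annualized performance metrics.
FREQ = 12
ann_mean = strats.mean() * FREQ
ann_vol = strats.std() * np.sqrt(FREQ)
sharpe = ann_mean / ann_vol
```

The correlation matrix is `strats.corr()`, and `(1 + strats).cumprod().plot(logy=True)` gives the cumulative return plot on a log scale.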

1.3#

Interpret the results. Which signal appears most useful? How do the active strategies compare to Passive in terms of Sharpe ratio? Are the active strategies highly correlated with each other?

2. Out-of-Sample Forecasting#

The in-sample results are biased. Now redo the forecast regressions with an expanding-window approach, so that each forecast uses only data available at the time it is made.

At each date \(t\), estimate the regression using data from the start of the sample through \(t\), then form the forecast for \(t+1\).

Use a burn-in period of 5 years (BURN_YRS = 5) before generating the first forecast.
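The expanding-window loop might be sketched as follows. The data is synthetic and the variable names (`sigs`, `sigs_lag`, `spy`) are assumptions matching the notation above; with real data the fit would go inside the loop exactly as shown:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic monthly stand-ins for the signals and SPY returns.
rng = np.random.default_rng(1)
idx = pd.date_range('2000-01-01', periods=120, freq='MS')
sigs = pd.DataFrame(rng.normal(size=(120, 2)), index=idx, columns=['DP', 'EP'])
spy = pd.Series(rng.normal(scale=0.04, size=120), index=idx, name='SPY')

sigs_lag = sigs.shift(1).dropna()
spy_al = spy.loc[sigs_lag.index]

FREQ, BURN_YRS = 12, 5
oos = pd.Series(index=idx, dtype=float)
for t in idx[FREQ * BURN_YRS:]:
    # Fit only on data available through t (label slicing is inclusive) ...
    model = LinearRegression().fit(sigs_lag.loc[:t], spy_al.loc[:t])
    # ... then predict with the unlagged signal at t, which forecasts t+1.
    oos.loc[t] = model.predict(sigs.loc[[t]])[0]

# Re-date each forecast by the period it forecasts.
oos = oos.shift(1).dropna()
```

Note the final `.shift(1)`: inside the loop each forecast is stored at the date it was made, so one shift re-dates it to the period being forecasted, consistent with the timing convention.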

2.1#

Implement the out-of-sample expanding-window forecasts for the same models: Mean, DP, EP, and ALL.

Plot the OOS forecasts over time.

2.2#

Build the investment strategies from the OOS forecasts (using \(w_t = 100 \cdot \hat{r}_{t+1}\)) and report the same performance metrics as in Section 1.

Report:

  • OOS \(R^2\) for each model, using the expanding mean as the null forecast:

\[R^2_{\text{OOS}} = 1 - \frac{\text{MSE}_{\text{forecast}}}{\text{MSE}_{\text{null}}}\]
  • Annualized performance metrics

  • Cumulative returns (log scale)

  • Correlation matrix
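The OOS \(R^2\) computation is mechanical once the forecasts, the null (expanding-mean) forecasts, and the realized returns are aligned. A sketch on made-up numbers:

```python
import pandas as pd

# Toy realized returns, a model forecast, and the expanding-mean null.
realized = pd.Series([0.02, -0.01, 0.03, 0.00, 0.01])
forecast = pd.Series([0.015, -0.005, 0.020, 0.004, 0.008])
null = realized.expanding().mean().shift(1)

# Evaluate only where both forecasts exist (the null is undefined at the start).
mask = null.notna()
mse_fcst = ((realized - forecast)[mask] ** 2).mean()
mse_null = ((realized - null)[mask] ** 2).mean()
r2_oos = 1 - mse_fcst / mse_null
```

A positive `r2_oos` means the model beats the expanding-mean benchmark out of sample; in practice a negative value is common for return forecasts.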

2.3#

Compare the in-sample and out-of-sample results. Do the signals that looked good in-sample still look good out-of-sample? Is the OOS \(R^2\) positive for any model?

3. Extension: Machine Learning Forecasts#

Compare the OLS approach with machine learning methods. Use all signals as features and compare:

  • OLS - linear regression (same as ALL above)

  • Random Forest - with max_depth=5, n_estimators=100

  • NN - neural network (MLPRegressor with hidden_layer_sizes=(50,))

For the neural network, standardize the features and target before training (using StandardScaler), and inverse-transform the predictions back to the original scale.
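The standardize-fit-invert pattern for the neural network might look like this (synthetic features and target; `max_iter=500` and `random_state=0` are illustrative choices, not part of the exercise specification):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Synthetic stand-ins for the lagged signals (7 features) and SPY returns.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 7))
y = 0.01 * X[:, 0] + rng.normal(scale=0.04, size=200)

# Standardize both features and target before training ...
xsc, ysc = StandardScaler(), StandardScaler()
Xs = xsc.fit_transform(X)
ys = ysc.fit_transform(y.reshape(-1, 1)).ravel()

nn = MLPRegressor(hidden_layer_sizes=(50,), max_iter=500, random_state=0)
nn.fit(Xs, ys)

# ... then inverse-transform the predictions back to the return scale.
pred = ysc.inverse_transform(nn.predict(Xs).reshape(-1, 1)).ravel()
```

Standardization matters here because monthly returns are tiny relative to the network's default weight scale; without it the MLP can converge poorly or not at all.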

3.1#

First, estimate and evaluate the ML models in-sample alongside OLS and the expanding mean.

Report the performance metrics and correlation of the resulting strategies. Plot the cumulative returns.

Note that the ML models may overfit. How does in-sample \(R^2\) compare between OLS and the ML methods?

3.2#

Now estimate the ML models out-of-sample using the same expanding-window approach as Section 2.

Report the same metrics. Do the ML methods outperform OLS out-of-sample? Does the in-sample advantage survive?

Hints#

  • Use .shift() and .dropna() to create the lagged signals.

  • For the expanding mean baseline: rets[['SPY']].expanding().mean().shift(1).dropna()

  • In the OOS loop, at each time \(t\), fit the model on sigs_lag.loc[:t] and spy.loc[:t], then predict using sigs.loc[t] (the unlagged signal at \(t\), which forecasts \(t+1\)).

  • After the loop, shift the OOS forecasts forward one period so timestamps reflect the date being forecasted.

  • For the neural network OOS, fit a new StandardScaler at each step using only data through \(t\).

  • sklearn.ensemble.RandomForestRegressor and sklearn.neural_network.MLPRegressor are the relevant imports.

  • The OOS ML loop will be slow, since each model is refit at every date. This is expected.