Exercise - Forecasting with Linear Factor Pricing Models#
Thanks to Tobias Rodriguez del Pozo
Let’s use the “AQR” model in (3) to forecast excess returns. We will do this at each point in time to build a point-in-time series of forecasts, and then see how well they perform.
The model does not give us any info about forecasting the factors themselves. Accordingly, calculate the “expanding” mean of the four factors. We will use these as our point-in-time factor premia.
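A minimal sketch of the expanding-mean step, using synthetic data. The column names `MKT`, `HML`, `RMW`, `UMD` are assumed labels for the four factors, and the numbers are random draws rather than the course data.

```python
import numpy as np
import pandas as pd

# Synthetic monthly factor excess returns, for illustration only.
# The column names are assumed labels for the four factors.
rng = np.random.default_rng(0)
dates = pd.period_range("2000-01", periods=8, freq="M")
factors = pd.DataFrame(
    rng.normal(0.005, 0.03, size=(8, 4)),
    index=dates,
    columns=["MKT", "HML", "RMW", "UMD"],
)

# Row t holds the mean of every observation from the start through t,
# so each row uses only information available at that date.
premia = factors.expanding().mean()
print(premia.tail(3))
```

The last row of `premia` equals the full-sample mean, while earlier rows use only the data observed up to that month.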
For each of the n securities, estimate (4) over a rolling window of 60 months. Make sure to estimate these rolling regressions WITH an intercept, but we only need to save the beta estimates.
For every security, \(i\), and at every month, \(t\) (after the first 60), calculate (3) using the point-in-time factor premia and betas calculated in the prior two steps. This is your forecast made at the end of period \(t\) for \(r^i_{t+1}\). You are using end-of-time \(t\) info in the estimation, so it is a forecast for \(t+1\). To align it with our data, shift the dataframe of forecasts ahead one period, so that every value is pushed one month later. (The Feb value is now a March value.) Thus, your forecast timestamp now refers to the time being forecasted, rather than the time it was made.
This gives you a series of forecasts \(\widehat{\tilde{r}^i_{t}}\).
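The rolling-beta and forecast steps can be sketched for a single security as follows. This is an illustration under assumptions: the helper names `rolling_betas` and `factor_forecast` are hypothetical, the data are synthetic, and (3) is assumed to be the sum over factors of beta times factor premium with no intercept. The loop uses plain least squares from numpy; the `RollingOLS` class mentioned in the hints is an alternative way to get the same betas.

```python
import numpy as np
import pandas as pd

def rolling_betas(y, X, window):
    """Slopes from OLS of y on X (intercept included) over each trailing window."""
    betas = pd.DataFrame(np.nan, index=X.index, columns=X.columns)
    Z = np.column_stack([np.ones(len(X)), X.to_numpy()])  # column 0 is the intercept
    y_values = y.to_numpy()
    for end in range(window, len(X) + 1):
        rows = slice(end - window, end)
        coef, *_ = np.linalg.lstsq(Z[rows], y_values[rows], rcond=None)
        betas.iloc[end - 1] = coef[1:]  # discard the intercept, keep the betas
    return betas

def factor_forecast(y, X, window=60):
    """Forecast of y, timestamped by the month being forecast."""
    premia = X.expanding().mean()
    betas = rolling_betas(y, X, window)
    # Assumed form of (3): sum over factors of beta times premium, no intercept.
    # min_count keeps rows without a full set of betas as NaN instead of 0.
    made_at_t = (betas * premia).sum(axis=1, min_count=X.shape[1])
    return made_at_t.shift(1)  # the value made at t now sits at t+1

# Synthetic data for one security (illustration only; factor labels assumed).
rng = np.random.default_rng(1)
dates = pd.period_range("2000-01", periods=72, freq="M")
X = pd.DataFrame(
    rng.normal(0.005, 0.03, size=(72, 4)),
    index=dates,
    columns=["MKT", "HML", "RMW", "UMD"],
)
noise = pd.Series(rng.normal(0.0, 0.01, size=72), index=dates)
y = 0.9 * X["MKT"] + 0.3 * X["HML"] + noise

forecast = factor_forecast(y, X, window=60)
print(forecast.dropna())
```

With a 60-month window the first 60 entries of `forecast` are `NaN`: the first regression ends in month 60, and the shift moves its forecast to month 61. Applying the function column by column gives the forecasts for all n securities.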
To decide whether these forecasts are good, we need a comparison. Use the point-in-time sample average estimates of \(\tilde{r}_t\). Calculate the expanding mean and, once again, be sure to shift it one period into the future so that the timestamps refer to the period being forecast. This gives us the benchmark forecast: \(\bar{\tilde{r}}_t\).
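As a small sketch of the benchmark, assuming a dataframe `returns` of security excess returns (synthetic here, with made-up column names `A` and `B`):

```python
import numpy as np
import pandas as pd

# Synthetic excess returns for two hypothetical securities.
rng = np.random.default_rng(2)
dates = pd.period_range("2000-01", periods=6, freq="M")
returns = pd.DataFrame(
    rng.normal(0.006, 0.05, size=(6, 2)), index=dates, columns=["A", "B"]
)

# Expanding mean through month t, then shifted so that it is
# timestamped t+1, the month it is meant to forecast.
benchmark = returns.expanding().mean().shift(1)
print(benchmark)
```

The first row is `NaN` after the shift, since no data exist before the first month.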
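Compare our Linear Factor Pricing forecasts with the naive forecasts using Out-of-Sample (OOS) R-squared, computed separately for each security \(i\):
\[R^2_{\text{OOS}} = 1 - \frac{\text{MSE}\left(\tilde{r}^i_t - \widehat{\tilde{r}^i_{t}}\right)}{\text{MSE}\left(\tilde{r}^i_t - \bar{\tilde{r}}^i_t\right)}\]
where MSE stands for Mean Squared Error.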
Warning!#
This calculation will be wrong if your forecasts have NaN values where the benchmark does not. For this reason, it is important to eliminate any date where either series has a NaN value. If you are careful about this issue, then both MSEs are averages over the same set of dates, and you can write the OOS R-squared as a ratio of SSE.
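One way to guard against this issue is to align the three series and drop incomplete dates before computing anything. The sketch below assumes the OOS R-squared is one minus the ratio of the model's SSE to the benchmark's SSE; the function name `oos_r2` is hypothetical.

```python
import numpy as np
import pandas as pd

def oos_r2(actual, forecast, benchmark):
    """OOS R-squared using only the dates where all three series are observed."""
    df = pd.concat(
        {"actual": actual, "forecast": forecast, "benchmark": benchmark}, axis=1
    ).dropna()
    sse_model = ((df["actual"] - df["forecast"]) ** 2).sum()
    sse_bench = ((df["actual"] - df["benchmark"]) ** 2).sum()
    # Both sums cover the same dates, so the SSE ratio equals the MSE ratio.
    return 1.0 - sse_model / sse_bench

# Toy example: the forecast is missing on the first date, the benchmark is not.
actual = pd.Series([1.0, 2.0, 3.0, 4.0])
forecast = pd.Series([np.nan, 2.0, 3.0, 5.0])
benchmark = pd.Series([0.0, 1.0, 2.0, 3.0])
print(oos_r2(actual, forecast, benchmark))
```

A positive value means the model forecast had a smaller squared error than the benchmark over the common dates; a negative value means it did worse.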
Data#
Use the data found in data/factor_pricing_data_weekly.xlsx.
Factors: Monthly excess return data for the overall equity market, \(\tilde{r}^{\text{MKT}}\).
The column header for the market factor is MKT rather than MKT-RF, but it is indeed already in excess return form.
The sheet also contains data on five additional factors.
All factor data is already provided as excess returns.
1.#
Report the OOS r-squared for each of the n security forecasts.
2.#
Does the Linear Factor Pricing Model (LFPM) do a good job of forecasting monthly returns? For which asset does it perform best? And worst?
3.#
Re-do the exercise using a window of 36 months, and again using 96 months. Does either of these windows work better?
4.#
Re-do the exercise using the FF 5-Factor Model instead of the AQR model. Re-do it with the CAPM. Does either of these models improve the forecasts?
Hints#
You may find the following pandas command helpful: .expanding().mean()
You may wish to use: from statsmodels.regression.rolling import RollingOLS
This will take longer to compute: we are estimating a multifactor regression at every month in time and for every security. So we are running roughly T × N regressions.
See .shift() in pandas.
For instance, if you use the rolling regressions, your initial forecast values will be NaN. But your expanding mean calculation for the baseline will not have any NaN. Thus, it is important to require a minimum number of observations in the expanding mean. Or you can more explicitly enforce that both dataframes have NaN in the same time periods.
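Both ways of aligning the missing values can be sketched as below. The `forecasts` dataframe here is only a stand-in with `NaN` in its first 60 rows, mimicking what a 60-month rolling regression followed by a one-period shift would produce; the data and column names are synthetic.

```python
import numpy as np
import pandas as pd

window = 60
rng = np.random.default_rng(3)
dates = pd.period_range("2000-01", periods=72, freq="M")
returns = pd.DataFrame(
    rng.normal(0.006, 0.05, size=(72, 2)), index=dates, columns=["A", "B"]
)

# Stand-in for the model forecasts: missing for the first `window` months.
forecasts = pd.DataFrame(0.0, index=dates, columns=returns.columns)
forecasts.iloc[:window] = np.nan

# Option 1: require a minimum number of observations in the expanding mean.
benchmark = returns.expanding(min_periods=window).mean().shift(1)

# Option 2: compute the plain expanding mean, then blank out every
# date on which the model forecast is missing.
benchmark_masked = returns.expanding().mean().shift(1).where(forecasts.notna())

print(benchmark.isna().sum(), benchmark_masked.isna().sum())
```

Either way, the benchmark and the forecasts end up with `NaN` on the same dates, so the two error series are compared over an identical sample.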