GMO Forecasting#

Case: Grantham, Mayo, and Van Otterloo, 2012: Estimating the Equity Risk Premium [9-211-051].

\[ \newcommand{\E}{\mathbb{E}} \newcommand{\cond}{\, |\, } \newcommand{\var}{\text{var}} \newcommand{\cov}{\text{cov}} \newcommand{\corr}{\text{corr}} \newcommand{\std}{\text{std}} \newcommand{\covmat}{\boldsymbol{\Sigma}} \newcommand{\cdf}{\Phi} \newcommand{\Normal}{\mathcal{N}} \newcommand{\lp}{\mathbb{L}} \newcommand{\dlim}{\overset{D}{\to \;}} \newcommand{\plim}{\overset{P}{\to \;}} \newcommand{\iid}{i.i.d.} \newcommand{\free}{f} \newcommand{\ex}[1]{\tilde{#1}} \newcommand{\R}[1][]{R^{#1}} \newcommand{\Rf}{\R[\free]} \newcommand{\Rx}[1][]{\ex{R}^{#1}} \renewcommand{\r}[1][]{r^{\scriptscriptstyle {#1}}} \newcommand{\rf}{\r[\free]} \newcommand{\rx}[1][]{\ex{r}^{\scriptscriptstyle {#1}}} \newcommand{\rlog}[1][]{{\texttt{r}^{#1}}} \newcommand{\rflog}{\rlog[\free]} \newcommand{\rvec}[1][]{\boldsymbol{\r[#1]}} \newcommand{\rxvec}[1][]{\boldsymbol{\rx[#1]}} \newcommand{\pay}[1][]{\Gamma^{\scriptscriptstyle {#1}}} \renewcommand{\P}{\mathcal{P}} \newcommand{\ind}[1]{_{[#1]}} \newcommand{\notind}[1]{\ind{-#1}} \newcommand{\coord}{\boldsymbol{\iota}} \newcommand{\obs}{n} \newcommand{\Nobs}{N} \newcommand{\lag}{h} \newcommand{\Nlag}{H} \newcommand{\indx}{i} \newcommand{\indxalt}{j} \newcommand{\I}{\mathcal{I}} \newcommand{\one}{\textbf{1}} \newcommand{\zeros}{\textbf{0}} \newcommand{\x}{\textbf{x}} \newcommand{\z}{\textbf{z}} \newcommand{\y}{\textbf{y}} \newcommand{\w}{\textbf{w}} \newcommand{\X}{\textbf{X}} \newcommand{\Z}{\textbf{Z}} \newcommand{\Y}{\textbf{Y}} \newcommand{\W}{\textbf{W}} \newcommand{\alphavec}{\boldsymbol{\alpha}} \newcommand{\betavec}{\boldsymbol{\beta}} \newcommand{\epsilonvec}{\boldsymbol{\epsilon}} \newcommand{\sigmavec}{\boldsymbol{\sigma}} \newcommand{\mux}{\ex{\mu}} \newcommand{\muvec}{\boldsymbol{\mu}} \newcommand{\muxvec}{\boldsymbol{\ex{\mu}}} \newcommand{\muP}{\mu^p} \newcommand{\Sigmamat}{\boldsymbol{\Sigma}} \newcommand{\wt}{\boldsymbol{\omega}} \newcommand{\wtx}{\boldsymbol{w}} \newcommand{\wtxstar}{\wtx^*} \newcommand{\wtTan}{\wt^{\tan}} \newcommand{\wtxTan}{\wtx^{\tan}} \newcommand{\wtGMV}{\wt^{\gmv}} \newcommand{\mv}{\scriptscriptstyle {\subset}} \newcommand{\MV}{$\ex{\text{MV}}\ $} \newcommand{\MVscale}{\delta} \newcommand{\ifac}{k} \newcommand{\Nfacs}{k} \newcommand{\fbeta}[1][]{\beta^{\scriptscriptstyle {#1}}} \newcommand{\fbetavec}[1][]{\betavec^{\scriptscriptstyle {#1}}} \newcommand{\radon}{q} \newcommand{\mkt}{m} \newcommand{\act}{a} \renewcommand{\tan}{\texttt{t}} \newcommand{\gmv}{\texttt{v}} \newcommand{\size}{s} \newcommand{\val}{v} \newcommand{\up}{u} \newcommand{\mom}{\text{mom}} \newcommand{\orthog}{\alpha} \newcommand{\fac}{z} \newcommand{\facs}{\boldsymbol{z}} \newcommand{\facmac}{f} \newcommand{\facsmac}{\boldsymbol{f}} \newcommand{\prem}{\lambda} \newcommand{\premvec}{\boldsymbol{\prem}} \newcommand{\pos}{i} \newcommand{\hedge}{j} \newcommand{\etavec}{\boldsymbol{\eta}} \newcommand{\Q}{\textbf{Q}} \newcommand{\eig}{\psi} \newcommand{\Eig}{\Psi} \newcommand{\eigv}{\textbf{q}} \newcommand{\Eigv}{Q} \newcommand{\PCwt}{\textbf{q}} \newcommand{\PCfac}{x} \newcommand{\VaR}{\text{VaR}} \newcommand{\ES}{\text{ES}} \newcommand{\Rrate}{\,\r} \newcommand{\Rratevec}{\,\rvec} \newcommand{\RVaR}{\, \r[\text{VaR}]} \newcommand{\RES}{\,\r[\text{ES}]} \newcommand{\rVaR}{\, \r[\text{VaR}]} \newcommand{\rES}{\, \r[\text{ES}]} \newcommand{\thresh}{\pi} \newcommand{\quantile}{\texttt{z}_{\thresh}} \newcommand{\gain}{\Delta V} \newcommand{\loss}{L} \newcommand{\lossvec}{\textbf{L}} \newcommand{\cdfnorm}{\Phi} \newcommand{\cdflosst}{F^\ell_\tau} 
\newcommand{\cdfgaint}{F^g_\tau} \newcommand{\cdfretst}{F^{\r}_\tau} \newcommand{\invcdflosst}{F_\tau^{\ell(-1)}} \newcommand{\invcdfgaint}{F_\tau^{g[-1]}} \newcommand{\invcdfretst}{F^{\r(-1)}_\tau} \newcommand{\mawt}{\theta} \newcommand{\DP}{\text{DP}} \newcommand{\n}{{(n)}} \newcommand{\0}{(0)} \newcommand{\1}{{(1)}} \newcommand{\2}{{(2)}} \newcommand{\3}{{(3)}} \newcommand{\4}{{(4)}} \newcommand{\5}{{(5)}} \newcommand{\bonds}{B} \newcommand{\fx}{S} \newcommand{\fxlog}{\texttt{s}} \newcommand{\meuro}{\text{\euro}} \newcommand{\usd}{\$} \newcommand{\for}{F} \newcommand{\forlog}{\texttt{f}} \]

1 READING: GMO#

This section is not graded, and you do not need to submit your answers. But you are expected to consider these issues and be ready to discuss them.

  1. GMO’s approach.

    • Why does GMO believe they can more easily predict long‑run than short‑run asset‑class performance?

GMO believes that “in the short run, the market is a voting machine, but in the long run, the market is a weighing machine.” Specifically, they think that in the long run asset prices should converge to their fundamental (“steady state”) values, but that in the short run there may be significant deviations from those values due to noise. They also believe in a long-run equity risk premium, i.e. that equities should outperform bonds in the long run as compensation for their “inconvenient return path” (they tend to lose value when you least want them to).

  • What predicting variables does the case mention are used by GMO? Does this fit with the goal of long‑run forecasts?

The case mentions dividend yield, P/E (price/earnings) multiple expansion and contraction, sales growth, and profit margin as the predicting variables used by GMO. GMO believes that profit margins and the P/E multiple should be stable in the long run, so that long-run returns are principally driven by sales growth and the dividend yield. They also used the Gordon Growth Model as a basis for their forecasts.

This does fit the goal of long-run forecasts: using fundamental drivers rather than price alone helps filter out market noise that affects short-term prices but not long-term values.
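For reference, the Gordon Growth Model mentioned above values equity as a growing perpetuity of dividends, which makes the long-run drivers explicit (a sketch of the logic, not the case’s exact formula):

\[ P_0 = \frac{D_1}{r - g} \quad \Longrightarrow \quad r = \frac{D_1}{P_0} + g \]

So if P/E multiples and profit margins are stable in the long run, the expected return is approximately the dividend yield plus (sales) growth, matching the drivers named above.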

  • How has this approach led to contrarian positions?

Because they think that P/E multiples and dividend yields revert to stable fundamental values, GMO takes an underweight position when equity markets have elevated multiples or are frothy (lots of buyers).

They were also typically conservative, but took large risks when the “fat pitch” presented itself. They believed heavily in a value-oriented approach to asset allocation and kept plenty of dry powder (cash) to deploy when opportunities arose.

  • How does this approach raise business risk and managerial career risk?

The first, business risk, results from a fund needing to retain capital long enough to see its thesis realized. As a contrarian with a long-term perspective, GMO is likely to suffer severe underperformance while it waits for its thesis to play out. Many investors are not patient enough and may withdraw their money, leaving GMO unable to function at all. In particular, GMO was bearish from 1997 to 2000 and lost 60% of its assets to withdrawals.

Second, there is career risk. Many investment professionals are driven by concern for their own position, which is largely determined by short(er)-term performance. A manager may be reluctant to stand out when being wrong alone, rather than wrong with the crowd, gets them fired.

  2. The market environment.

    • We often estimate the market risk premium by looking at a large sample of historic data. What reasons does the case give to be skeptical that the market risk premium will be as high in the future as it has been over the past 50 years?

GMO had a bearish outlook on stocks, with stocks forecast to outperform bonds by only 1.6% over the next 7 years (as of 2011, Exhibit 10). The case mentions that even after 2008, P/E ratios stood at 19.9, well above the long-run average of 16. GMO was “skeptical that US firms could sustain the record profit margins they had delivered since 2009” and was also pessimistic about future earnings growth. However, over the longer run, GMO was confident that stocks would continue to earn a healthy risk premium; Inker thought that “reports of the death of equities had been greatly exaggerated.”

  • In 2007, GMO forecasts real excess equity returns will be negative. What are the biggest drivers of their pessimistic conditional forecast relative to the unconditional forecast? (See Exhibit 9.)

From Exhibit 9, they expected the P/E multiple to contract by 2.8% over the next 7 years and profit margins to contract by 3.9%. These were the two biggest drivers of their pessimistic forecast.

For the unconditional (steady state) forecast, they expected no change in P/E or profit margins relative to their historic steady states (16 and 6%, respectively).
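As a back-of-the-envelope check (my own arithmetic, using only the Exhibit 9 components quoted above and treating the dividend-yield and sales-growth terms as roughly common to both forecasts), the conditional forecast sits below the unconditional one by about the sum of the two valuation drags:

\[ r^{\text{cond}} - r^{\text{uncond}} \;\approx\; \underbrace{(-2.8\%)}_{\text{P/E contraction}} + \underbrace{(-3.9\%)}_{\text{margin contraction}} \;=\; -6.7\% \]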

  • In the 2011 forecast, what components has GMO revised most relative to 2007? Now how does their conditional forecast compare to the unconditional? (See Exhibit 10.)

In 2011, they expected P/E ratios not to change (0.0%), a big revision from the -2.8% in 2007. The P/E ratio in 2011 was 15, and they actually revised their unconditional (steady-state) P/E down to 15 (from 16). They also expected a slightly smaller profit-margin contraction of 3.7%, versus 3.9% in 2007, and revised expected sales growth per share up to 2.9% from 2.4%. This left their overall forecast at a 1.6% excess return over bonds, up significantly from -3.9% in 2007.

  3. Consider the asset‑class forecasts in Exhibit 1.

    • Which asset class did GMO estimate to have a negative 10‑year return over 2002–2011? They only expected the S&P 500 to have a negative return of ~-1% per year over the next 10 years.

    • Which asset classes substantially outperformed GMO’s estimate over that time period? Foreign government bonds and emerging market equities. For foreign government bonds, GMO expected a return of ~3% but they returned ~6%; for emerging market equities, GMO expected ~9.5% but they returned ~11.5%. There is also an argument for US large-cap equities: GMO expected about -1% but they actually returned about 0.5%.

    • Which asset classes substantially underperformed GMO’s estimate over that time period? US Treasury bills: GMO forecast ~2% but they returned about -0.5%. Also US REITs: GMO forecast a little over 8% but they returned ~6.5%.

  4. Fund performance.

    • In which asset class was GMWAX most heavily allocated throughout the majority of 1997–2011?

US Fixed Income by far, followed by international equities and then US equities.

  • Comment on the performance of GMWAX versus its benchmark. (No calculation needed; simply comment on the comparison in the exhibits.)

GMWAX was very successful relative to its benchmark, with double the return at only 70% of the volatility, so its Sharpe ratio was more than double the benchmark’s. Worth noting, though, that in my opinion this still isn’t great in absolute terms: a Sharpe ratio of 0.46 is mediocre.
# Imports and such
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm


plt.style.use("bmh")
plt.rcParams["figure.figsize"] = (8, 5)


def calc_return_metrics(data, as_df=False, adj=12):
    """
    Calculate return metrics for a DataFrame of assets.

    Args:
        data (pd.DataFrame): DataFrame of asset returns.
        as_df (bool, optional): Return a DF or a dict. Defaults to False (return a dict).
        adj (int, optional): Annualization. Defaults to 12.

    Returns:
        Union[dict, DataFrame]: Dict or DataFrame of return metrics.
    """
    summary = dict()
    summary["Annualized Return"] = data.mean() * adj
    summary["Annualized Volatility"] = data.std() * np.sqrt(adj)
    summary["Annualized Sharpe Ratio"] = (
        summary["Annualized Return"] / summary["Annualized Volatility"]
    )
    summary["Annualized Sortino Ratio"] = summary["Annualized Return"] / (
        data[data < 0].std() * np.sqrt(adj)
    )
    return pd.DataFrame(summary, index=data.columns) if as_df else summary


def calc_risk_metrics(data, as_df=False, var=0.05):
    """
    Calculate risk metrics for a DataFrame of assets.

    Args:
        data (pd.DataFrame): DataFrame of asset returns.
        as_df (bool, optional): Return a DF or a dict. Defaults to False.
        var (float, optional): VaR level. Defaults to 0.05.

    Returns:
        Union[dict, DataFrame]: Dict or DataFrame of risk metrics.
    """
    summary = dict()
    summary["Skewness"] = data.skew()
    summary["Excess Kurtosis"] = data.kurtosis()
    summary[f"VaR ({var})"] = data.quantile(var, axis=0)
    summary[f"CVaR ({var})"] = data[data <= data.quantile(var, axis=0)].mean()
    summary["Min"] = data.min()
    summary["Max"] = data.max()

    # Drawdown stats from a hypothetical $1,000 wealth index
    wealth_index = 1000 * (1 + data).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks) / previous_peaks

    summary["Max Drawdown"] = drawdowns.min()

    summary["Bottom"] = drawdowns.idxmin()
    # Note: this is the date the running maximum first reaches its overall high,
    # not necessarily the peak preceding the max-drawdown trough
    summary["Peak"] = previous_peaks.idxmax()

    # Recovery: first date after the trough at which wealth regains its prior peak
    recovery_date = []
    for col in wealth_index.columns:
        prev_max = previous_peaks[col][: drawdowns[col].idxmin()].max()
        recovery_wealth = wealth_index[col][drawdowns[col].idxmin() :].to_frame()
        recovery_date.append(
            recovery_wealth[recovery_wealth[col] >= prev_max].index.min()
        )
    summary["Recovery"] = ["-" if pd.isnull(i) else i for i in recovery_date]

    summary["Duration (days)"] = [
        (i - j).days if i != "-" else "-"
        for i, j in zip(summary["Recovery"], summary["Bottom"])
    ]

    return pd.DataFrame(summary, index=data.columns) if as_df else summary


def calc_performance_metrics(data, adj=12, var=0.05):
    """
    Aggregating function for calculating performance metrics. Returns both
    risk and performance metrics.

    Args:
        data (pd.DataFrame): DataFrame of asset returns.
        adj (int, optional): Annualization. Defaults to 12.
        var (float, optional): VaR level. Defaults to 0.05.

    Returns:
        DataFrame: DataFrame of performance metrics.
    """
    summary = {
        **calc_return_metrics(data=data, adj=adj),
        **calc_risk_metrics(data=data, var=var),
    }
    summary["Calmar Ratio"] = summary["Annualized Return"] / abs(
        summary["Max Drawdown"]
    )
    return pd.DataFrame(summary, index=data.columns)


rets = pd.read_excel(
    "gmo_data.xlsx", sheet_name="total returns", index_col="date", parse_dates=True
)
rfr = (
    pd.read_excel(
        "gmo_data.xlsx", sheet_name="risk-free rate", index_col="date", parse_dates=True
    )
    / 12
)

# Calculate excess returns
retsx = rets.subtract(rfr["TBill 3M"], axis=0)

2 Analyzing GMO#

This section utilizes data in the file gmo_data.xlsx. Convert total returns to excess returns using the risk‑free rate.

  1. Performance (GMWAX). Compute mean, volatility, and Sharpe ratio for GMWAX over three samples:

    • inception → 2011

    • 2012 → present

    • inception → present
      Has the mean, vol, and Sharpe changed much since the case?

  2. Tail risk (GMWAX). For all three samples, analyze extreme scenarios:

    • minimum return

    • 5th percentile (VaR‑5th)

    • maximum drawdown (compute on total returns, not excess returns)
      (a) Does GMWAX have high or low tail‑risk as seen by these stats?
      (b) Does that vary much across the two subsamples?

retsx_s1 = retsx.loc[:"2011"]
retsx_s2 = retsx.loc["2012":]

# For each sample + overall calculate performance metrics and display nicely
metrics_s1 = calc_performance_metrics(retsx_s1[["GMWAX", "GMGEX"]]).T.rename(
    columns=lambda x: f"{x} (Start-2011)"
)

metrics_s2 = calc_performance_metrics(retsx_s2[["GMWAX", "GMGEX"]]).T.rename(
    columns=lambda x: f"{x} (2012-Present)"
)

metrics_overall = calc_performance_metrics(retsx[["GMWAX", "GMGEX"]]).T.rename(
    columns=lambda x: f"{x} (Start-Present)"
)

# Repeat this exercise to extract the max drawdown on total returns.
rets_s1 = rets.loc[:"2011"]
rets_s2 = rets.loc["2012":]

metrics_s1_dd = (
    calc_performance_metrics(rets_s1[["GMWAX", "GMGEX"]])
    .T.rename(columns=lambda x: f"{x} (Start-2011)")
    .loc[["Max Drawdown", "Bottom", "Peak", "Recovery", "Duration (days)"], :]
)
metrics_s2_dd = (
    calc_performance_metrics(rets_s2[["GMWAX", "GMGEX"]])
    .T.rename(columns=lambda x: f"{x} (2012-Present)")
    .loc[["Max Drawdown", "Bottom", "Peak", "Recovery", "Duration (days)"], :]
)
metrics_overall_dd = (
    calc_performance_metrics(rets[["GMWAX", "GMGEX"]])
    .T.rename(columns=lambda x: f"{x} (Start-Present)")
    .loc[["Max Drawdown", "Bottom", "Peak", "Recovery", "Duration (days)"], :]
)

metrics = pd.concat([metrics_s1, metrics_s2, metrics_overall], axis=1)
metrics.loc[["Max Drawdown", "Bottom", "Peak", "Recovery", "Duration (days)"], :] = (
    pd.concat([metrics_s1_dd, metrics_s2_dd, metrics_overall_dd], axis=1)
)

metrics
GMWAX (Start-2011) GMGEX (Start-2011) GMWAX (2012-Present) GMGEX (2012-Present) GMWAX (Start-Present) GMGEX (Start-Present)
Annualized Return 0.046422 -0.003823 0.043423 0.001311 0.045043 -0.001463
Annualized Volatility 0.110499 0.147253 0.094949 0.235554 0.10349 0.192622
Annualized Sharpe Ratio 0.42011 -0.025963 0.457326 0.005566 0.43524 -0.007595
Annualized Sortino Ratio 0.52979 -0.035968 0.658023 0.004641 0.573957 -0.007213
Skewness -0.891709 -0.509564 -0.507077 -6.028372 -0.758222 -5.131245
Excess Kurtosis 3.058298 0.672829 1.945528 57.473216 2.771186 58.272763
VaR (0.05) -0.044003 -0.082292 -0.040854 -0.068027 -0.041368 -0.076213
CVaR (0.05) -0.074072 -0.09856 -0.058858 -0.162657 -0.068849 -0.130719
Min -0.14915 -0.151592 -0.115018 -0.658863 -0.14915 -0.658863
Max 0.081877 0.096042 0.074458 0.124668 0.081877 0.124668
Max Drawdown -0.293614 -0.55563 -0.216795 -0.737364 -0.293614 -0.761812
Bottom 2009-02-27 00:00:00 2009-02-27 00:00:00 2022-09-30 00:00:00 2016-11-30 00:00:00 2009-02-27 00:00:00 2016-11-30 00:00:00
Peak 2011-04-29 00:00:00 2007-10-31 00:00:00 2024-09-30 00:00:00 2014-06-30 00:00:00 2024-09-30 00:00:00 2007-10-31 00:00:00
Recovery 2010-10-29 00:00:00 - 2024-02-29 00:00:00 - 2010-10-29 00:00:00 -
Duration (days) 609 - 517 - 609 -
Calmar Ratio 0.151448 -0.006779 0.192465 0.001776 0.14695 -0.001905

Has the mean/vol/Sharpe changed much since the case?

For GMWAX, not really. The mean went up a bit, the volatility went down a bit, and so the Sharpe ratio went up a bit, but not really by much.

(a) Does GMWAX have high or low tail‑risk as seen by these stats?

Seems like pretty low tail risk: a 5% VaR of only about -4%, a max drawdown of ~30% (not great, but contained), and a minimum monthly return of -11% to -15%.

(b) Does that vary much across the two subsamples?

Yes, the tail-risk metrics are better in 2012–present than in inception–2011: the max drawdown is only 21% compared to 29%, the VaR is -4.1% compared to -4.4%, and the minimum return is -11.5% compared to -14.9%.
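To visualize this, a minimal sketch (my own addition, assuming the rets total-return DataFrame loaded above) of the GMWAX drawdown path, mirroring the wealth-index logic in calc_risk_metrics:

# GMWAX drawdown path computed on total returns
wealth = (1 + rets["GMWAX"]).cumprod()
drawdown = wealth / wealth.cummax() - 1
drawdown.plot(title="GMWAX drawdown (total returns)")
plt.show()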

  3. Market exposure (GMWAX). For all three samples, regress excess returns of GMWAX on excess returns of SPY:

    • report estimated alpha, beta, and \(R^2\)

    • is GMWAX a low‑beta strategy? has that changed since the case?

    • does GMWAX provide alpha? has that changed across subsamples?

summary = {
    "SPY Beta": [],
    "SPY Alpha": [],
    "R2": [],
}

for sample in [retsx_s1, retsx_s2, retsx]:
    X = sm.add_constant(sample["SPY"])
    for fund in ["GMWAX", "GMGEX"]:
        y = sample[fund]
        model = sm.OLS(y, X).fit()
        summary["SPY Beta"].append(model.params["SPY"])
        summary["SPY Alpha"].append(model.params["const"] * 12)  # Annualized
        summary["R2"].append(model.rsquared)

summary_df = pd.DataFrame(
    summary,
    index=[
        "GMWAX (Start-2011)",
        "GMGEX (Start-2011)",
        "GMWAX (2012-Present)",
        "GMGEX (2012-Present)",
        "GMWAX (Start-Present)",
        "GMGEX (Start-Present)",
    ],
    columns=["SPY Beta", "SPY Alpha", "R2"],
)
summary_df
SPY Beta SPY Alpha R2
GMWAX (Start-2011) 0.542128 0.027000 0.648686
GMGEX (Start-2011) 0.764237 -0.031201 0.725898
GMWAX (2012-Present) 0.581793 -0.033960 0.748747
GMGEX (2012-Present) 0.838118 -0.110164 0.252468
GMWAX (Start-Present) 0.552608 0.000558 0.680167
GMGEX (Start-Present) 0.786683 -0.064790 0.397891

Is GMWAX a low‑beta strategy? has that changed since the case?

I think it depends. Given the context of the case (value-oriented, long-term investors), I would say yes, they are low beta. Viewed as a primarily long-only equity allocator, a beta between 0.54 and 0.58 is reasonably low.

However, compared to a long-short equity fund this beta is quite high; a market-neutral hedge fund would target a beta extremely close to 0.

Their beta has barely changed since the case and is pretty constant.
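To check this visually, here is a quick sketch (my own addition, assuming the retsx DataFrame defined above) of a rolling beta; the 36-month window is an arbitrary choice:

# Rolling 36-month beta of GMWAX on SPY: rolling covariance over rolling variance
rolling_beta = (
    retsx["GMWAX"].rolling(36).cov(retsx["SPY"]) / retsx["SPY"].rolling(36).var()
)
rolling_beta.plot(title="GMWAX: rolling 36-month SPY beta")
plt.show()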

Does GMWAX provide alpha? has that changed across subsamples?

It depends. From inception to 2011 they provided 2.7% annualized alpha, which is good. However, from 2012 to present they actually provided negative 3.4% annualized alpha, which is bad. Over the full sample the alpha is essentially zero (0.0006 annualized, about 6 basis points).
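A natural follow-up (my own addition, not required by the assignment) is whether these alphas are statistically distinguishable from zero; the t-statistic on the regression intercept gives a quick check:

# Annualized alpha and its t-stat per sample (assumes retsx, retsx_s1, retsx_s2)
for name, sample in [("Start-2011", retsx_s1), ("2012-Present", retsx_s2), ("Full", retsx)]:
    m = sm.OLS(sample["GMWAX"], sm.add_constant(sample["SPY"]), missing="drop").fit()
    print(f"{name}: alpha = {m.params['const'] * 12:.4f}, t-stat = {m.tvalues['const']:.2f}")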

  4. Compare to GMGEX. Repeat items 1–3 for GMGEX. What are key differences between the two strategies?

Items 1–3 for GMGEX are above. My conclusion is that GMGEX just sucks: it has negative excess returns across the full sample, a much higher market beta, and a far worse maximum drawdown (76%!). The key difference between the strategies is the benchmark: GMGEX aims to beat the MSCI All Country World Index alone, whereas GMWAX is benchmarked to a blend of 65% MSCI All Country World Index and 35% Bloomberg US Aggregate Index.

3 Forecast Regressions#

This section utilizes data in gmo_analysis_data.xlsx.

  1. Lagged regression. Consider the regression with predictors lagged one period:

\[ r^{SPY}_{t} \;=\; \alpha^{SPY,X} \;+\; \big(\beta^{SPY,X}\big)^\prime X_{t-1} \;+\; \epsilon^{SPY,X}_{t} \tag{1} \]

Estimate (1) and report the \(R^2\), as well as the OLS estimates for \(\alpha\) and \(\beta\). Do this for:

  • \(X\) as a single regressor, the dividend–price ratio (\(DP\))

  • \(X\) as a single regressor, the earnings–price ratio (\(EP\))

  • \(X\) with three regressors: \(DP\), \(EP\), and the 10‑year yield
    For each, report the \(R^2\).

Note: For this section, there is some ambiguity as to whether we should use excess or total returns. Here, I will use excess returns because they are most pertinent to forecasting: if I feel confident that I can predict the risk-free rate, then I should strip it out and focus on modelling the less predictable excess returns on their own.

signals = pd.read_excel(
    "gmo_analysis_data.xlsx", sheet_name="signals", index_col="date", parse_dates=True
).loc[:"2024-10-31"]

3.1 D/P Forecast#

y = retsx["SPY"]
x = sm.add_constant(signals["SPX D/P"]).shift(1)

model_dp = sm.OLS(y, x, missing="drop").fit()

model_dp.summary()
OLS Regression Results
Dep. Variable: SPY R-squared: 0.014
Model: OLS Adj. R-squared: 0.011
Method: Least Squares F-statistic: 4.727
Date: Mon, 17 Nov 2025 Prob (F-statistic): 0.0304
Time: 05:04:12 Log-Likelihood: 567.58
No. Observations: 334 AIC: -1131.
Df Residuals: 332 BIC: -1124.
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const -0.0171 0.011 -1.519 0.130 -0.039 0.005
SPX D/P 1.3187 0.607 2.174 0.030 0.126 2.512
Omnibus: 27.416 Durbin-Watson: 1.957
Prob(Omnibus): 0.000 Jarque-Bera (JB): 35.086
Skew: -0.627 Prob(JB): 2.40e-08
Kurtosis: 3.975 Cond. No. 250.


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

3.2 E/P Forecast#

y = retsx["SPY"]
x = sm.add_constant(signals["SPX E/P"]).shift()

model_ep = sm.OLS(y, x, missing="drop").fit()

model_ep.summary()
OLS Regression Results
Dep. Variable: SPY R-squared: 0.007
Model: OLS Adj. R-squared: 0.004
Method: Least Squares F-statistic: 2.418
Date: Mon, 17 Nov 2025 Prob (F-statistic): 0.121
Time: 05:04:12 Log-Likelihood: 566.44
No. Observations: 334 AIC: -1129.
Df Residuals: 332 BIC: -1121.
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const -0.0095 0.011 -0.881 0.379 -0.031 0.012
SPX E/P 0.2991 0.192 1.555 0.121 -0.079 0.678
Omnibus: 24.617 Durbin-Watson: 1.960
Prob(Omnibus): 0.000 Jarque-Bera (JB): 30.294
Skew: -0.594 Prob(JB): 2.64e-07
Kurtosis: 3.874 Cond. No. 79.2


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

3.3 All Three Forecast (DP, EP, 10Y)#

y = retsx["SPY"]
x = sm.add_constant(signals).shift()

model_all = sm.OLS(y, x, missing="drop").fit()

model_all.summary()
OLS Regression Results
Dep. Variable: SPY R-squared: 0.017
Model: OLS Adj. R-squared: 0.008
Method: Least Squares F-statistic: 1.944
Date: Mon, 17 Nov 2025 Prob (F-statistic): 0.122
Time: 05:04:12 Log-Likelihood: 568.15
No. Observations: 334 AIC: -1128.
Df Residuals: 330 BIC: -1113.
Df Model: 3
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const -0.0069 0.016 -0.417 0.677 -0.039 0.026
SPX D/P 0.6594 0.937 0.704 0.482 -1.183 2.502
SPX E/P 0.1630 0.265 0.616 0.538 -0.358 0.684
T-Note 10YR -0.2034 0.198 -1.026 0.306 -0.593 0.187
Omnibus: 26.366 Durbin-Watson: 1.967
Prob(Omnibus): 0.000 Jarque-Bera (JB): 33.195
Skew: -0.616 Prob(JB): 6.19e-08
Kurtosis: 3.932 Cond. No. 396.


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

  2. Trading strategy from forecasts. For each of the three regressions:

    • Build the forecasted SPY return: \(\hat r^{SPY}_{t+1}\) (forecast made using \(X_t\) to predict \(r^{SPY}_{t+1}\)).

    • Set the scale (portfolio weight) to \(w_t = 100 \,\hat r^{SPY}_{t+1}\).

    • Strategy return: \(r^x_{t+1} = w_t\, r^{SPY}_{t+1}\).
      For each strategy, compute:

    • mean, volatility, Sharpe

    • max drawdown

    • market alpha

    • market beta

    • market information ratio

# Build the return predictions, and align them with the right period
all_prediction = model_all.predict(sm.add_constant(signals)).shift()
dp_prediction = model_dp.predict(sm.add_constant(signals["SPX D/P"])).shift()
ep_prediction = model_ep.predict(sm.add_constant(signals["SPX E/P"])).shift()

# Build the strategy weighting in SPY
all_weight = 100 * all_prediction
dp_weight = 100 * dp_prediction
ep_weight = 100 * ep_prediction

# Compute the strategy returns
all_strat = all_weight * retsx["SPY"]
dp_strat = dp_weight * retsx["SPY"]
ep_strat = ep_weight * retsx["SPY"]

# Univariate risks
strat_metrics = calc_performance_metrics(
    pd.DataFrame({"All Three": all_strat, "D/P Only": dp_strat, "E/P Only": ep_strat})
).T
strat_metrics.loc[
    [
        "Annualized Return",
        "Annualized Volatility",
        "Annualized Sharpe Ratio",
        "Max Drawdown",
    ],
    :,
]
All Three D/P Only E/P Only
Annualized Return 0.097024 0.089093 0.072883
Annualized Volatility 0.163234 0.165365 0.133875
Annualized Sharpe Ratio 0.594384 0.538765 0.544407
Max Drawdown -0.665389 -0.724389 -0.588043
summary = {
    "Beta": [],
    "Alpha (Annualized)": [],
    "Information Ratio (Annualized)": [],
}

for strat in [all_strat, dp_strat, ep_strat]:
    X = sm.add_constant(retsx["SPY"])
    y = strat
    model = sm.OLS(y, X, missing="drop").fit()
    summary["Beta"].append(model.params["SPY"])
    summary["Alpha (Annualized)"].append(model.params["const"] * 12)
    residuals = model.resid
    ir = (model.params["const"] * 12) / (residuals.std() * np.sqrt(12))
    summary["Information Ratio (Annualized)"].append(ir)

summary_df = pd.DataFrame(
    summary,
    index=["All Three", "D/P Only", "E/P Only"],
)
summary_df
Beta Alpha (Annualized) Information Ratio (Annualized)
All Three 0.769932 0.034096 0.305119
D/P Only 0.787617 0.024719 0.220838
E/P Only 0.748304 0.011722 0.173812

  3. Risk characteristics.

    • For the strategies above, the market, and GMO, compute monthly VaR at \(\pi = 0.05\) (use the historical quantile).

    • The case mentions stocks under‑performed short‑term bonds from 2000–2011. Does the dynamic portfolio above under‑perform the risk‑free rate over this time?

    • Based on the regression estimates, in how many periods do we estimate a negative risk premium?

    • Do you believe the dynamic strategy takes on extra risk?

pd.concat([metrics, strat_metrics], axis=1).loc[["Annualized Return", "VaR (0.05)"], :]
GMWAX (Start-2011) GMGEX (Start-2011) GMWAX (2012-Present) GMGEX (2012-Present) GMWAX (Start-Present) GMGEX (Start-Present) All Three D/P Only E/P Only
Annualized Return 0.046422 -0.003823 0.043423 0.001311 0.045043 -0.001463 0.097024 0.089093 0.072883
VaR (0.05) -0.044003 -0.082292 -0.040854 -0.068027 -0.041368 -0.076213 -0.051822 -0.049193 -0.048823
# Dynamic portfolio
port = pd.DataFrame(
    {"All Three": all_strat, "D/P Only": dp_strat, "E/P Only": ep_strat}
).loc["2000":"2011"]
port.mean() * 12  # Annualized return over 2000-2011
All Three    0.060268
D/P Only     0.051386
E/P Only     0.026740
dtype: float64

The case mentions stocks under‑performed short‑term bonds from 2000–2011. Does the dynamic portfolio above under‑perform the risk‑free rate over this time?

No. Note that we computed this all in excess returns, so if the dynamic portfolio had underperformed the risk-free rate, it would have negative excess returns over this period. However, the dynamic portfolio has positive excess returns over this period.

Based on the regression estimates, in how many periods do we estimate a negative risk premium?

# Negative risk premium is when our forecasted return is negative.
neg_prediction = pd.DataFrame(
    {
        "All": [
            (all_prediction.dropna() < 0).sum() / len(all_prediction.dropna()),
            (all_prediction.dropna() < 0).sum(),
        ],
        "D/P": [
            (dp_prediction.dropna() < 0).sum() / len(dp_prediction.dropna()),
            (dp_prediction.dropna() < 0).sum(),
        ],
        "E/P": [
            (ep_prediction.dropna() < 0).sum() / len(ep_prediction.dropna()),
            (ep_prediction.dropna() < 0).sum(),
        ],
    },
    index=["Negative Risk Premium (%)", "Negative Risk Premium Count"],
)
neg_prediction
All D/P E/P
Negative Risk Premium (fraction) 0.137725 0.101796 0.002994
Negative Risk Premium Count 46.000000 34.000000 1.000000

In all three models we rarely forecast a negative risk premium. This helps explain the high market betas above: the strategies are almost always long the market.
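To make “almost always long” concrete, a quick sketch (my own addition, using the *_weight series computed above) plots the forecast-implied SPY weights:

# Forecast-implied SPY weights, w_t = 100 * predicted excess return
weights = pd.DataFrame(
    {"All Three": all_weight, "D/P Only": dp_weight, "E/P Only": ep_weight}
)
weights.plot(title="Forecast-implied SPY weight over time")
plt.axhline(0, color="k", lw=1)
plt.show()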

4 Out‑of‑Sample Forecasting#

This section utilizes data in gmo_analysis_data.xlsx. Focus on using both \(DP\) and \(EP\) as signals in (1). Compute out‑of‑sample (\(OOS\)) statistics:

Procedure (rolling OOS):

  • Start at \(t=60\).

  • Estimate (1) using data through time \(t\).

  • Using the estimated parameters and \(x_t\), compute the forecast for \(t+1\):

\[ \hat r^{SPY}_{t+1} \;=\; \hat \alpha^{SPY,X}_t \;+\; \big(\hat \beta^{SPY,X}_t\big)^\prime x_t \]

  • Forecast error: \(e^{forecast}_{t+1} = r^{SPY}_{t+1} - \hat r^{SPY}_{t+1}\).

  • Move to \(t=61\) and iterate.

Also compute the null forecast and errors:

\[ \bar r^{SPY}_{t+1} = \frac{1}{t}\sum_{i=1}^t r^{SPY}_i, \qquad e^{null}_{t+1} = r^{SPY}_{t+1} - \bar r^{SPY}_{t+1}. \]

  1. Report the out‑of‑sample \(R^2\):

\[ R^2_{OOS} \;\equiv\; 1 - \frac{\sum_{i=61}^T \big(e^{forecast}_i\big)^2}{\sum_{i=61}^T \big(e^{null}_i\big)^2} \]

Did this forecasting strategy produce a positive \(R^2_{OOS}\)?

from statsmodels.regression.rolling import RollingOLS


def oos_forecast(signals, asset, t=60, rolling=False, roll_exp=False, intercept=True):
    """
    Computes out-of-sample forecasts using expanding or rolling regressions

    signals: DataFrame containing the signals (regressors) to be used in each regression
    asset: DataFrame containing the values (returns) of the asset being predicted
    t: The minimum number of periods
    rolling: False if expanding, else enter an integer window
    roll_exp: If using rolling, indicate whether to use expanding up to the minimum periods
    intercept: Boolean indicating the inclusion of an intercept in the regressions
    """

    n = len(signals)

    if intercept:
        signals = sm.add_constant(signals)

    if t > n:
        raise ValueError("Min. periods (t) greater than number of data points")

    output = pd.DataFrame(index=signals.index, columns=["Actual", "Predicted", "Null"])

    # If expanding
    if not rolling:
        for i in range(t, n):
            y = asset.iloc[:i]
            x = signals.iloc[:i].shift()

            if intercept:
                null_pred = y.mean()

            else:
                null_pred = 0

            model = sm.OLS(y, x, missing="drop").fit()

            pred_x = signals.iloc[[i - 1]]
            # np.asarray handles .predict returning either an ndarray or a Series
            pred = np.asarray(model.predict(pred_x))[0]

            # Label-based assignment; chained `output.iloc[i][col] = ...` is
            # unreliable and silently fails under pandas copy-on-write
            output.loc[output.index[i], "Actual"] = asset.iloc[i]
            output.loc[output.index[i], "Predicted"] = pred
            output.loc[output.index[i], "Null"] = null_pred

    # If rolling
    else:
        if rolling > n:
            raise ValueError("Rolling window greater than number of data points")

        y = asset
        x = signals.shift()

        if intercept:
            if roll_exp:
                null_pred = y.rolling(window=rolling, min_periods=0).mean().shift()
            else:
                null_pred = y.rolling(window=rolling).mean().shift()

        else:
            null_pred = 0

        # With expanding=True, RollingOLS fills the initial estimates using an
        # expanding window until `window` observations are available, then rolls
        model = RollingOLS(y, x, window=rolling, expanding=roll_exp).fit()

        output["Actual"] = asset
        output["Predicted"] = (model.params * signals).dropna().sum(axis=1).shift()
        output["Null"] = null_pred

    return output


def oos_r_squared(data):
    """
    Computes the out-of-sample r squared
    data: DataFrame containing actual, model-predicted, and null-predicted values
    """

    model_error = data["Actual"] - data["Predicted"]
    null_error = data["Actual"] - data["Null"]

    r2_oos = 1 - (model_error**2).sum() / (null_error**2).sum()

    return r2_oos

oos_ep_dp = oos_forecast(signals[["SPX D/P", "SPX E/P"]], retsx["SPY"], rolling=60)
oos_r_squared(oos_ep_dp)
np.float64(-0.11416969570045854)

Nope, it produced a negative out-of-sample \(R^2\) of about -0.11, meaning the regression forecast fits the data worse than the naive forecast of simply using the historical average return.
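As an extra diagnostic (my own addition), the running difference between the null’s and the model’s squared errors shows when the forecast gains or loses ground; a downward drift means the model is losing to the historical mean:

# Cumulative SSE(null) minus SSE(model): upward slope = model beating the null
errs = oos_ep_dp.dropna().astype(float)
sse_gap = (
    (errs["Actual"] - errs["Null"]) ** 2 - (errs["Actual"] - errs["Predicted"]) ** 2
).cumsum()
sse_gap.plot(title="Cumulative SSE(null) minus SSE(model)")
plt.show()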

  2. Redo 3.2 with OOS forecasts. How does the OOS strategy compare to the in‑sample version of 3.2?

# OOS forecasts for each signal set (E/P only, D/P only, and all three)
oos_ep = oos_forecast(signals[["SPX E/P"]], retsx["SPY"], rolling=60)
oos_dp = oos_forecast(signals[["SPX D/P"]], retsx["SPY"], rolling=60)
oos_all = oos_forecast(signals, retsx["SPY"], rolling=60)

# Build the strategy weighting in SPY
all_weight = 100 * oos_all["Predicted"]
dp_weight = 100 * oos_dp["Predicted"]
ep_weight = 100 * oos_ep["Predicted"]

# Compute the strategy returns
all_strat = all_weight * retsx["SPY"]
dp_strat = dp_weight * retsx["SPY"]
ep_strat = ep_weight * retsx["SPY"]

# Univariate risks
strat_metrics = calc_performance_metrics(
    pd.DataFrame({"All Three": all_strat, "D/P Only": dp_strat, "E/P Only": ep_strat})
).T
strat_metrics.loc[
    [
        "Annualized Return",
        "Annualized Volatility",
        "Annualized Sharpe Ratio",
        "Max Drawdown",
    ],
    :,
]
All Three D/P Only E/P Only
Annualized Return 0.02676 0.075904 0.04398
Annualized Volatility 0.354236 0.337267 0.206451
Annualized Sharpe Ratio 0.075543 0.225057 0.21303
Max Drawdown -0.956592 -0.909461 -0.689213
summary = {
    "Beta": [],
    "Alpha (Annualized)": [],
    "Information Ratio (Annualized)": [],
}

for strat in [all_strat, dp_strat, ep_strat]:
    X = sm.add_constant(retsx["SPY"])
    y = strat
    model = sm.OLS(y, X, missing="drop").fit()
    summary["Beta"].append(model.params["SPY"])
    summary["Alpha (Annualized)"].append(model.params["const"] * 12)
    residuals = model.resid
    ir = (model.params["const"] * 12) / (residuals.std() * np.sqrt(12))
    summary["Information Ratio (Annualized)"].append(ir)

summary_df = pd.DataFrame(
    summary,
    index=["All Three", "D/P Only", "E/P Only"],
)
summary_df
Beta Alpha (Annualized) Information Ratio (Annualized)
All Three -0.309545 0.052993 0.150888
D/P Only -0.028496 0.078319 0.232235
E/P Only 0.131885 0.032804 0.159621

  3. Redo 3.3 with OOS forecasts. Is the point‑in‑time version of the strategy riskier?

strat_metrics.loc[["Annualized Return", "VaR (0.05)"], :]
All Three D/P Only E/P Only
Annualized Return 0.02676 0.075904 0.04398
VaR (0.05) -0.105682 -0.072005 -0.059274
# Negative risk premium is when our forecasted return is negative.
negative_risk_prem_oos = pd.DataFrame(
    {
        "All": [
            (oos_all["Predicted"].dropna() < 0).sum()
            / len(oos_all["Predicted"].dropna()),
            (oos_all["Predicted"].dropna() < 0).sum(),
        ],
        "D/P": [
            (oos_dp["Predicted"].dropna() < 0).sum()
            / len(oos_dp["Predicted"].dropna()),
            (oos_dp["Predicted"].dropna() < 0).sum(),
        ],
        "E/P": [
            (oos_ep["Predicted"].dropna() < 0).sum()
            / len(oos_ep["Predicted"].dropna()),
            (oos_ep["Predicted"].dropna() < 0).sum(),
        ],
    },
    index=["Negative Risk Premium (%)", "Negative Risk Premium Count"],
)
negative_risk_prem_oos
All D/P E/P
Negative Risk Premium (fraction) 0.312727 0.258182 0.232727
Negative Risk Premium Count 86.000000 71.000000 64.000000

Yes, it does seem riskier: the VaR and max drawdown are larger in magnitude, the mean is lower, and the vol is higher. Intuitively this makes sense, because the in-sample strategy benefits from lookahead bias (using data from the future to forecast the past), so the best possible fitted model is applied in-sample, whereas out-of-sample we only use models fit on past data.

5 EXTRA: ML Forecasts#

  1. CART. Re‑do Section 3 using CART (e.g., RandomForestRegressor from sklearn.ensemble). If you want to visualize, try sklearn.tree.

  2. CART, OOS. Compute out‑of‑sample stats as in Section 4.

  3. Neural Network. Re‑do Section 3 using a neural network (e.g., MLPRegressor from sklearn.neural_network).

  4. NN & CART, OOS. Compute out‑of‑sample stats as in Section 4.