DFA and Factor Investing


HBS Case#

*Dimensional Fund Advisors, 2002 [HBS 9-203-026].#

Pages 1-5 of the case are required. Pages 6-11 get into interesting issues around trading (especially adverse selection) and tax considerations. These sections are useful for building market knowledge, but we will not cover them.*

1. READING - DFA’s Strategy#

1. Investment philosophy.#

In 100 words or fewer, describe DFA’s belief about how to find premium in the market.
To what degree does their strategy rely on individual equity analysis? Macroeconomic fundamentals? Efficient markets?
Are DFA’s funds active or passive?
What do DFA and others mean by a “value” stock? And a “growth” stock?

2. Challenges for DFA’s view.#

What challenge did DFA’s model see in the 1980s?
And in the 1990s?

3. The market.#

Exhibit 3 has data regarding a universe of 5,020 firms. How many are considered “large cap”? What percent of the market value do they account for?
Exhibit 6 shows that the U.S. value factor (HML) has underperformed the broader U.S. equity market in 1926-2001, including every subsample except 1963-1981. So why should an investor be interested in this value factor?

2. The Factors#

DFA believes certain stocks have higher expected excess returns. In addition to the overall market equity premium, DFA believes that there are premia attached to a size factor and a value factor. Note that these three factors are already listed as excess returns.

Data#

Use the data found in data/dfa_analysis_data.xlsx.

Monthly excess return data for the overall equity market, $\tilde{r}^{\text{mkt}}$.
The sheet also contains data on two additional factors, SMB and HML, as well as the risk-free rate.
The three factor columns (MKT, SMB, and HML) are already excess returns, so there is no need to subtract the risk-free rate from them. The risk-free rate is needed only to convert the portfolio total returns in Section 3 to excess returns.

Source:#

Ken French library, accessible through the pandas-datareader API.

1. The Factors#

Calculate their univariate performance statistics:

mean
volatility
Sharpe
VaR(.05)

Report these for the following three subsamples:

Beginning - 1980
1981 - 2001
2002 - End
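
The statistics and subsamples listed above could be computed along the following lines. This is a minimal sketch, not the assignment's reference solution: the function name `performance_stats`, the annualization factor of 12, the choice to report VaR as the unannualized empirical 5% quantile, and the synthetic data standing in for the factor columns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def performance_stats(returns: pd.DataFrame, periods_per_year: int = 12, q: float = 0.05) -> pd.DataFrame:
    """Annualized mean, volatility, and Sharpe ratio, plus the empirical q-quantile, per column."""
    mean = returns.mean() * periods_per_year
    vol = returns.std() * np.sqrt(periods_per_year)
    return pd.DataFrame({
        "mean": mean,
        "vol": vol,
        "sharpe": mean / vol,              # inputs are excess returns, so no risk-free adjustment
        f"VaR({q})": returns.quantile(q),  # quantile of monthly returns, left unannualized
    })

# Synthetic monthly data standing in for the MKT, SMB, and HML columns (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("1970-01-01", periods=600, freq="MS")
factors = pd.DataFrame(rng.normal(0.005, 0.04, size=(600, 3)), index=dates, columns=["MKT", "SMB", "HML"])

# Label-based slicing on a DatetimeIndex includes every month of the end year.
subsamples = {"Beginning-1980": (None, "1980"), "1981-2001": ("1981", "2001"), "2002-End": ("2002", None)}
tables = {name: performance_stats(factors.loc[start:end]) for name, (start, end) in subsamples.items()}

for name, table in tables.items():
    print(name)
    print(table.round(3))
```

With the real data, `factors` would be replaced by the factor sheet read from `data/dfa_analysis_data.xlsx`, indexed by date.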

2.#

Based on the factor statistics above, answer the following.

Does each factor have a premium (positive expected excess return) in each subsample?
Does the premium to the size factor get smaller after 1980?
Does the premium to the value factor get smaller during the 1990s?
How have the factors performed since the time of the case (2002-present)?

3.#

The factors are constructed in such a way as to reduce correlation between them.

Report the correlation matrix across the three factors.
Does the construction method succeed in keeping correlations small?
Does it achieve this in each subsample?
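
A correlation matrix for the full sample and for each subsample can be obtained with `DataFrame.corr`. The sketch below assumes the factors sit in a date-indexed DataFrame with columns named `MKT`, `SMB`, and `HML`; the data here are synthetic placeholders, so the printed numbers say nothing about the real factors.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the three factor columns (illustration only).
rng = np.random.default_rng(1)
dates = pd.date_range("1970-01-01", periods=600, freq="MS")
factors = pd.DataFrame(rng.normal(0.005, 0.04, size=(600, 3)), index=dates, columns=["MKT", "SMB", "HML"])

# Pairwise Pearson correlations over the full sample.
full_corr = factors.corr()

# The same calculation restricted to each subsample.
sub_corr = {
    "Beginning-1980": factors.loc[:"1980"].corr(),
    "1981-2001": factors.loc["1981":"2001"].corr(),
    "2002-End": factors.loc["2002":].corr(),
}

print(full_corr.round(2))
for name, corr in sub_corr.items():
    print(name)
    print(corr.round(2))
```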

4.#

Plot the cumulative returns of the three factors.
Create plots for the 1981-2001 subsample as well as the 2002-Present subsample.
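
One common convention for cumulative returns is to compound the monthly returns, $\prod_t (1+r_t) - 1$. The sketch below assumes that convention; the helper name `cumulative_returns` and the two-month example are hypothetical.

```python
import pandas as pd

def cumulative_returns(returns: pd.DataFrame) -> pd.DataFrame:
    """Compound period returns: growth of one unit invested, minus the initial unit."""
    return (1 + returns).cumprod() - 1

# Tiny hand-checkable example: +10% then -10% compounds to -1%, not 0%.
example = pd.DataFrame({"MKT": [0.10, -0.10], "HML": [0.00, 0.05]})
cum = cumulative_returns(example)
print(cum)
```

To plot a subsample, slice first and then compound so that each plot starts from zero, for example `cumulative_returns(factors.loc["1981":"2001"]).plot()`, assuming `factors` is the date-indexed factor DataFrame and matplotlib is installed.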

5.#

Does it appear that all three factors were valuable in 1981-2001?
And post-2001?

Would you advise DFA to continue emphasizing all three factors?

3. CAPM#

DFA believes that premia in stocks and stock portfolios are related to the three factors.

Let’s test 25 equity portfolios that span a wide range of size and value measures.

Footnote#

For more on the portfolio construction, see the description at Ken French’s data library. https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/tw_5_ports.html

Portfolios#

Monthly total return data on 25 equity portfolios sorted by their size-value characteristics. Denote these as $\vec{r}^{i}$, for $i=1, \ldots, n$, with $n=25$.

Note that while the factors were given as excess returns, the portfolios are total returns.
For this entire problem, focus on the 1981-Present subsample.

1. Summary Statistics.#

For each portfolio,

Use the Risk-Free rate column in the factors tab to convert these total returns to excess returns.
Calculate the (annualized) univariate statistics from 2.1.
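
The conversion from total to excess returns is a row-wise subtraction of the risk-free rate from every portfolio column. A minimal sketch, assuming both objects share the same date index; the column labels and the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical total returns for two portfolios and a hypothetical risk-free series.
dates = pd.date_range("2000-01-01", periods=3, freq="MS")
portfolios = pd.DataFrame(
    {"SMALL LoBM": [0.020, -0.010, 0.015], "BIG HiBM": [0.010, 0.005, -0.020]},
    index=dates,
)
rf = pd.Series([0.004, 0.004, 0.003], index=dates, name="RF")

# axis=0 aligns the risk-free series on the date index and subtracts it from each column.
excess = portfolios.sub(rf, axis=0)
print(excess)
```

Restricting to the 1981-Present subsample would then be a date slice such as `excess.loc["1981":]`.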

2. CAPM#

The Capital Asset Pricing Model (CAPM) asserts that an asset’s (or portfolio’s) expected excess return is completely a function of its beta to the equity market index (SPY, or in this case, MKT).

Specifically, it asserts that, for any excess return, $\tilde{r}^{i}$, its mean is proportional to the mean excess return of the market, $\tilde{r}^{\text{mkt}}$, where the constant of proportionality is the regression beta of $\tilde{r}^{i}$ on $\tilde{r}^{\text{mkt}}$.

\[ \mathbb{E}\left[\tilde{r}_{t}^{i}\right] = \beta^{i,\text{mkt}}\; \mathbb{E}\left[\tilde{r}_{t}^{\text{mkt}}\right] \]

Let’s examine whether that seems plausible.

For each of the $n=25$ test portfolios, run the CAPM time-series regression:

\[ \tilde{r}_{t}^{i} = \alpha^i + \beta^{i,\text{mkt}}\; \tilde{r}_{t}^{\text{mkt}} + \epsilon_{t}^{i} \]

So you are running 25 separate regressions, each using the $T$-sized sample of time-series data.

Report the betas and alphas for each test asset.
Report the mean-absolute-error of the CAPM: $$\text{MAE} = \frac{1}{n}\sum_{i=1}^n \left|\alpha^i\right|$$

If the CAPM were true, what would we expect of the MAE?

Report the estimated $\beta^{i,\text{mkt}}$, Treynor Ratio, $\alpha^i$, and Information Ratio for each of the $n$ regressions.
If the CAPM model were true, what would be true of the Treynor Ratios, alphas, and Information Ratios?
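
The time-series regressions and the statistics requested above can be sketched with one least-squares call that handles all $n$ portfolios at once. Several things here are assumptions rather than part of the assignment: the function name `capm_time_series`, the annualization factor of 12, the definition of the Treynor ratio as mean excess return divided by beta, the definition of the Information Ratio as alpha divided by residual volatility, and the synthetic data.

```python
import numpy as np

def capm_time_series(excess_returns: np.ndarray, market: np.ndarray, periods_per_year: int = 12) -> dict:
    """Regress each column of excess_returns (T x n) on an intercept and the market excess return (T,)."""
    T = len(market)
    X = np.column_stack([np.ones(T), market])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)  # coef has shape (2, n)
    alpha, beta = coef
    resid = excess_returns - X @ coef
    resid_vol = resid.std(axis=0, ddof=2)  # two estimated parameters per regression
    mean_excess = excess_returns.mean(axis=0)
    return {
        "alpha": alpha * periods_per_year,
        "beta": beta,
        "treynor": mean_excess * periods_per_year / beta,
        "info_ratio": alpha / resid_vol * np.sqrt(periods_per_year),
        "mae": np.abs(alpha).mean() * periods_per_year,
    }

# Synthetic data: 25 portfolios generated with known betas and zero true alpha (illustration only).
rng = np.random.default_rng(2)
T, n = 500, 25
market = rng.normal(0.006, 0.04, size=T)
true_beta = np.linspace(0.5, 1.5, n)
excess_returns = market[:, None] * true_beta + rng.normal(0, 0.01, size=(T, n))

results = capm_time_series(excess_returns, market)
print("MAE (annualized):", round(results["mae"], 4))
print("first five betas:", results["beta"][:5].round(2))
```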

3. Cross-sectional Estimation#

Let’s test the CAPM directly. We already have what we need:

The dependent variable, $y$: mean excess returns from each of the $n=25$ portfolios.
The regressor, $x$: the market beta from each of the $n=25$ time-series regressions.

Then we can estimate the following equation:

\[ \underbrace{\mathbb{E}\left[\tilde{r}^{i}\right]}_{n\times 1\text{ data}} = \textcolor{ForestGreen}{\underbrace{\eta}_{\text{regression intercept}}} + \underbrace{\beta^{i,\text{mkt}}}_{n\times 1\text{ data}}\; \textcolor{ForestGreen}{\underbrace{\lambda_{\text{mkt}}}_{\text{regression estimate}}} + \textcolor{ForestGreen}{\underbrace{\upsilon}_{n\times 1\text{ residuals}}} \]

Note that

we use sample means as estimates of $\mathbb{E}\left[\tilde{r}^{i}\right]$.
this is a weird regression! The regressors are the betas from the time-series regressions we already ran!
this is a single regression, where we are combining evidence across all $n=25$ series. Thus, it is a cross-sectional regression!
the notation is trying to emphasize that the intercept is different than the time-series $\alpha$ and that the regressor coefficient is different than the time-series betas.

Report

the R-squared of this regression.
the intercept, $\eta$.
the regression coefficient, $\lambda_{\text{mkt}}$.

What would these three statistics be if the CAPM were completely accurate?
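
The cross-sectional step is an ordinary regression of $n$ sample means on $n$ estimated betas. The sketch below is one way to set it up; the function name `cross_sectional_regression` and the synthetic inputs are assumptions, and the function accepts either a single beta per portfolio or several so that it can be reused for multi-factor models.

```python
import numpy as np

def cross_sectional_regression(mean_excess, betas):
    """Regress mean excess returns (n,) on an intercept and betas (n,) or (n, k).

    Returns the intercept eta, the vector of lambdas, and the R-squared.
    """
    y = np.asarray(mean_excess, dtype=float)
    B = np.asarray(betas, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), B])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1:], r2

# Synthetic inputs standing in for the 25 time-series betas and sample means (illustration only).
rng = np.random.default_rng(3)
betas = np.linspace(0.6, 1.4, 25)
mean_excess = 0.02 + 0.05 * betas + rng.normal(0, 0.01, size=25)

eta, lam, r2 = cross_sectional_regression(mean_excess, betas)
print("eta:", round(eta, 4), "lambda_mkt:", round(lam[0], 4), "R-squared:", round(r2, 3))
```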

4. Conclusion#

Broadly speaking, do these results support DFA’s belief in size and value portfolios containing premia unrelated to the market premium?

4. Extensions#

1.#

Re-do the analysis of 3.2 and 3.3, but instead of using the market return as the factor, use a new factor: the in-sample tangency portfolio of the $n=25$ portfolios.

You will not use the factor data for this problem!

Calculate $\tilde{r}^{\text{tan}}$ by solving the MV optimization of the $n$ excess returns.
Consider this to be your single factor.
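
For excess returns, the in-sample tangency weights are proportional to $\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$, where $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are the sample mean vector and covariance matrix. A minimal sketch under that standard result follows; the function name `tangency_weights`, the normalization to weights summing to one, and the five synthetic assets are assumptions for illustration.

```python
import numpy as np

def tangency_weights(excess_returns: np.ndarray) -> np.ndarray:
    """In-sample tangency weights, proportional to inv(Sigma) @ mu and scaled to sum to one."""
    mu = excess_returns.mean(axis=0)
    sigma = np.cov(excess_returns, rowvar=False)
    raw = np.linalg.solve(sigma, mu)  # solves Sigma @ raw = mu without forming the inverse
    return raw / raw.sum()

# Synthetic excess returns for five assets (illustration only; the assignment uses n = 25).
rng = np.random.default_rng(4)
T, n = 400, 5
excess_returns = rng.normal(0.006, 0.04, size=(T, n))

w = tangency_weights(excess_returns)
r_tan = excess_returns @ w  # time series of the tangency factor

sharpe_tan = r_tan.mean() / r_tan.std(ddof=1)
sharpe_assets = excess_returns.mean(axis=0) / excess_returns.std(axis=0, ddof=1)
print("weights:", w.round(3))
print("tangency Sharpe:", round(sharpe_tan, 3), "best single-asset Sharpe:", round(sharpe_assets.max(), 3))
```

The series `r_tan` would then take the place of the market excess return in the time-series and cross-sectional regressions.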

Instead of testing the CAPM, you will test the tangency-factor model:

\[ \mathbb{E}\left[\tilde{r}_{t}^{i}\right] = \beta^{i,\text{tan}}\; \mathbb{E}\left[\tilde{r}_{t}^{\text{tan}}\right] \]

What do you find?

2.#

Re-do the analysis of 3.2 and 3.3, but instead of using only the MKT factor, use MKT, SMB, and HML.

(Note again that all three are already given as excess returns, so there is no need to use the risk-free rate data.)

Thus, instead of testing the CAPM, you will be testing the Fama-French 3-Factor Model.

\[ \mathbb{E}\left[\tilde{r}_{t}^{i}\right] = \beta^{i,\text{mkt}}\; \mathbb{E}\left[\tilde{r}_{t}^{\text{mkt}}\right] + \beta^{i,\text{size}}\; \mathbb{E}\left[\tilde{r}_{t}^{\text{size}}\right] + \beta^{i,\text{val}}\; \mathbb{E}\left[\tilde{r}_{t}^{\text{val}}\right] \]
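
Moving from one factor to three changes only the shape of the regressor matrix: each time-series regression now has three slopes, and the cross-sectional regression has three lambdas. The sketch below illustrates this with synthetic data; the function name `factor_time_series` and all numbers are hypothetical, and the alphas are left in monthly units.

```python
import numpy as np

def factor_time_series(excess_returns: np.ndarray, factors: np.ndarray):
    """Regress each column of excess_returns (T x n) on an intercept and the factors (T x k)."""
    T = factors.shape[0]
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)  # shape (k + 1, n)
    return coef[0], coef[1:].T  # alphas (n,), betas (n, k)

# Synthetic stand-ins for MKT, SMB, HML and 25 test portfolios (illustration only).
rng = np.random.default_rng(5)
T, n, k = 500, 25, 3
factors = rng.normal(0.004, 0.03, size=(T, k))
true_betas = rng.uniform(-0.5, 1.5, size=(n, k))
excess_returns = factors @ true_betas.T + rng.normal(0, 0.01, size=(T, n))

alphas, betas = factor_time_series(excess_returns, factors)
mae = np.abs(alphas).mean()

# Cross-sectional regression of mean excess returns on the three estimated betas.
mean_excess = excess_returns.mean(axis=0)
X_cs = np.column_stack([np.ones(n), betas])
cs_coef, *_ = np.linalg.lstsq(X_cs, mean_excess, rcond=None)
eta, lambdas = cs_coef[0], cs_coef[1:]

print("monthly MAE:", round(mae, 5))
print("eta:", round(eta, 5), "lambdas:", lambdas.round(4))
```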

3.#

We measured how well the CAPM performs by checking the MAE of the time-series alphas.

Under classic statistical assumptions, we can test the null hypothesis that the CAPM works by calculating,

\[ H = T\left[1+\left(\text{SR}_{\text{mkt}}\right)^2\right]^{-1} \boldsymbol{\alpha}'\boldsymbol{\Sigma}_\epsilon^{-1}\boldsymbol{\alpha} \]

This test statistic has a chi-squared distribution…

\[H\sim \chi^2_n\]

Note the following:

$\boldsymbol{\alpha}$ is an $n\times 1$ vector of the individual regression alphas, $\alpha^i$.
$\boldsymbol{\Sigma}_\epsilon$ is the $n\times n$ covariance matrix of the time-series of regression residuals, $\epsilon^i$, corresponding to each regression.
$\text{SR}_{\text{mkt}}$ is the Sharpe-Ratio of $\tilde{r}^{\text{mkt}}$.

The test statistic, $H$, has a chi-squared distribution with $n=25$ degrees of freedom. So under the null hypothesis of the CAPM holding, $H$ should be small, and the distribution allows us to calculate the probability of seeing such a large $H$, conditional on the CAPM being true.
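
The statistic can be assembled directly from the time-series regression output. The sketch below assumes that the alphas and the market Sharpe ratio are both in per-period (monthly) units, matching $T$ counted in months; that unit choice, the function name `capm_h_statistic`, and the synthetic data are assumptions, not something the text above specifies.

```python
import numpy as np

def capm_h_statistic(excess_returns: np.ndarray, market: np.ndarray) -> float:
    """H = T * [1 + SR_mkt^2]^(-1) * alpha' inv(Sigma_eps) alpha, using per-period quantities."""
    T, n = excess_returns.shape
    X = np.column_stack([np.ones(T), market])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    alpha = coef[0]                                # per-period alphas, one per test asset
    resid = excess_returns - X @ coef              # T x n matrix of regression residuals
    sigma_eps = np.cov(resid, rowvar=False)        # n x n residual covariance matrix
    sr_mkt = market.mean() / market.std(ddof=1)    # per-period Sharpe ratio of the market
    quad = alpha @ np.linalg.solve(sigma_eps, alpha)
    return T / (1 + sr_mkt ** 2) * quad

# Synthetic data with five test assets (illustration only; the assignment uses n = 25).
rng = np.random.default_rng(6)
T, n = 600, 5
market = rng.normal(0.006, 0.04, size=T)
betas = np.linspace(0.5, 1.5, n)
excess_returns = market[:, None] * betas + rng.normal(0, 0.01, size=(T, n))

H_null = capm_h_statistic(excess_returns, market)         # data generated with zero alpha
H_alt = capm_h_statistic(excess_returns + 0.002, market)  # every asset given 0.2% monthly alpha
print("H with zero true alpha:", round(H_null, 2))
print("H with nonzero alpha:", round(H_alt, 2))
```

A p-value could then be read from the chi-squared survival function, for example `scipy.stats.chi2.sf(H, n)`, assuming SciPy is available.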

Which is a stricter test: checking whether any of the $n$ values of $\alpha^i$ have a statistically significant t-test or checking whether $H$ calculated above is significant?
Conceptually, how does the test-statistic $H$ relate to checking whether $\tilde{r}^{\text{mkt}}$ spans the tangency portfolio?