Exercise - Replicating Regressions#
Data#
Use the file,
data/port_decomp_example.xlsx.
The data file contains…
Return rates, \(r_t^i\), for various asset classes (via ETFs). Most notable among these securities is SPY, an ETF tracking the S&P 500. Denote its return as \(r^{\spy}_t\).
A separate tab gives return rates for a particular portfolio, \(r_t^p\).
1. Regression#
1.#
Estimate the regression of the portfolio return on SPY:
\[r^p_t = \alpha + \beta^{\spy} r^{\spy}_t + \epsilon_t\]
Specifically, report your estimates of alpha, beta, and the r-squared.
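As a minimal sketch of the mechanics, the univariate regression can be estimated with ordinary least squares in NumPy. The data below are synthetic stand-ins, not the contents of the spreadsheet, and the names `ols_univariate`, `r_spy`, and `r_p` are hypothetical; a package such as statsmodels would report the same quantities.

```python
import numpy as np

def ols_univariate(y, x):
    """Regress y on a constant and x; return (alpha, beta, r_squared)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # intercept column first
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares solution
    resid = y - X @ coef
    r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1], r_squared

# Synthetic stand-ins (assumption: NOT the spreadsheet data)
rng = np.random.default_rng(0)
r_spy = rng.normal(0.008, 0.04, size=120)
r_p = 0.001 + 0.6 * r_spy + rng.normal(0.0, 0.01, size=120)

alpha, beta, r_squared = ols_univariate(r_p, r_spy)
print(f"alpha={alpha:.4f}, beta={beta:.4f}, r-squared={r_squared:.4f}")
```

In the univariate case the slope equals the sample covariance of the two series divided by the sample variance of the regressor, which gives a quick sanity check on the estimate.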
2.#
Estimate the regression of the portfolio return on SPY and on HYG, the return on high-yield corporate bonds, denoted as \(r^{\hyg}_t\):
\[r^p_t = \alpha + \beta^{\spy} r^{\spy}_t + \beta^{\hyg} r^{\hyg}_t + \epsilon_t\]
Specifically, report your estimates of alpha, the betas, and the r-squared.
Note that the parameters (such as \(\beta^{\spy}\)) in this multivariate model are not the same as those in the univariate model of part 1.
3.#
Calculate the series of fitted regression values, sometimes referred to as \(\hat{y}\) in standard textbooks:
\[\hat{r}^p_t = \hat{\alpha} + \hat{\beta}^{\spy} r^{\spy}_t + \hat{\beta}^{\hyg} r^{\hyg}_t\]
Your statistical package will output these fitted values for you, or you can construct them using the estimated parameters.
What is the correlation of \(\hat{r}^p_t\) with \(r^p_t\)?
How does this compare to the r-squared of the regression in problem 2?
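One way to check the comparison numerically is sketched below. It fits a two-regressor model by least squares, constructs the fitted values from the estimated parameters, and computes both the r-squared and the correlation. The series are synthetic (an assumption, not the spreadsheet data), and all variable names are hypothetical.

```python
import numpy as np

# Synthetic stand-ins (assumption: NOT the spreadsheet data)
rng = np.random.default_rng(1)
n = 120
r_spy = rng.normal(0.008, 0.04, n)
r_hyg = 0.3 * r_spy + rng.normal(0.003, 0.015, n)
r_p = 0.001 + 0.5 * r_spy + 0.4 * r_hyg + rng.normal(0.0, 0.01, n)

# Multivariate OLS with an intercept
X = np.column_stack([np.ones(n), r_spy, r_hyg])
coef, *_ = np.linalg.lstsq(X, r_p, rcond=None)

# Fitted values built from the estimated parameters
fitted = X @ coef
resid = r_p - fitted

r_squared = 1.0 - (resid @ resid) / np.sum((r_p - r_p.mean()) ** 2)
corr = np.corrcoef(fitted, r_p)[0, 1]

print(f"r-squared      = {r_squared:.6f}")
print(f"corr           = {corr:.6f}")
print(f"corr squared   = {corr ** 2:.6f}")
```

For OLS with an intercept, the squared correlation between fitted and actual values matches the r-squared up to floating-point error.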
4.#
How do the SPY betas differ across the univariate and multivariate models? How does this relate to the correlation between \(r^{\spy}\) and \(r^{\hyg}\)?
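The link between the two SPY betas can be illustrated with the standard omitted-variable identity: the univariate beta equals the multivariate SPY beta plus the HYG beta times the slope from regressing HYG on SPY. The sketch below checks this on synthetic data (an assumption, not the spreadsheet data); `ols` is a hypothetical helper.

```python
import numpy as np

def ols(y, X):
    """OLS of y on a constant plus the columns of X; intercept is element 0."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

# Synthetic stand-ins with correlated regressors (NOT the spreadsheet data)
rng = np.random.default_rng(2)
n = 120
r_spy = rng.normal(0.008, 0.04, n)
r_hyg = 0.3 * r_spy + rng.normal(0.003, 0.015, n)
r_p = 0.001 + 0.5 * r_spy + 0.4 * r_hyg + rng.normal(0.0, 0.01, n)

b_uni = ols(r_p, r_spy)[1]                                  # univariate SPY beta
_, b_spy, b_hyg = ols(r_p, np.column_stack([r_spy, r_hyg])) # multivariate betas
delta = ols(r_hyg, r_spy)[1]                                # slope of HYG on SPY

implied = b_spy + b_hyg * delta
print(f"univariate beta = {b_uni:.6f}, implied = {implied:.6f}")
```

Here `delta` equals the correlation between the two regressors scaled by the ratio of their standard deviations, so the gap between the two SPY betas grows with that correlation and vanishes when it is zero.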
2. Decomposing and Replicating#
1.#
The portfolio return, \(r_t^p\), is a combination of the base assets that are provided here. Use linear regression to uncover which weights were used in constructing the portfolio.
\[r_t^p = \alpha + \boldsymbol{\beta}'\boldsymbol{r}_t + \epsilon_t\]
where \(\boldsymbol{r}\) denotes the vector of returns for the individual securities.
What does the regression find were the original weights?
How precise is the estimation? Consider the R-squared and t-stats.
Feel free to include an \(\alpha\) in this model, even though you know the portfolio is an exact function of the individual securities. The estimation should find \(\alpha\) of (nearly) zero.
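To see why regression can uncover the weights, consider the following sketch. It builds a portfolio as an exact linear combination of synthetic asset returns and then regresses the portfolio on those assets. The weights `w_true` and the return matrix `R` are made up for illustration and are not taken from the spreadsheet.

```python
import numpy as np

# Synthetic asset returns and hypothetical weights (NOT the spreadsheet data)
rng = np.random.default_rng(3)
n, k = 60, 4
R = rng.normal(0.005, 0.04, size=(n, k))
w_true = np.array([0.4, 0.3, 0.2, 0.1])
r_p = R @ w_true                      # portfolio is an exact function of the assets

# Regress the portfolio on a constant and all assets
X = np.column_stack([np.ones(n), R])
coef, *_ = np.linalg.lstsq(X, r_p, rcond=None)
alpha_hat, w_hat = coef[0], coef[1:]

resid = r_p - X @ coef
r_squared = 1.0 - (resid @ resid) / np.sum((r_p - r_p.mean()) ** 2)

print("estimated weights:", np.round(w_hat, 6))
print(f"alpha = {alpha_hat:.2e}, r-squared = {r_squared:.6f}")
```

Because the fit is exact, the residuals sit at machine precision, so the r-squared is essentially one and conventional t-stats become extremely large.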
2.#
Suppose that we want to mimic a return, EEM, using the other returns. Run the following regression, but only using data through the end of 2022.
\[r_t^{\targ} = \alpha + \boldsymbol{\beta}'\boldsymbol{r}_t + \epsilon_t\]
where \(\boldsymbol{r}\) denotes the vector of returns for the other securities, excluding the target, EEM.
(a)#
Report the r-squared and the estimate of the vector, \(\boldsymbol{\beta}\).
(b)#
Report the t-stats of the explanatory returns. Which have absolute value greater than 2?
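If your statistical package does not report t-stats directly, they can be constructed from the estimates. The sketch below assumes classic (homoskedastic) OLS standard errors, which may differ from the robust errors some packages use by default. The data are synthetic and `ols_tstats` is a hypothetical helper.

```python
import numpy as np

def ols_tstats(y, X):
    """OLS with intercept; returns (coef, t_stats) using classic standard errors."""
    Z = np.column_stack([np.ones(len(y)), X])
    n, k = Z.shape
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    sigma2 = (resid @ resid) / (n - k)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(Z.T @ Z)       # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return coef, coef / se

# Synthetic stand-ins (NOT the spreadsheet data); second regressor has zero true beta
rng = np.random.default_rng(4)
n = 60
R = rng.normal(0.005, 0.04, size=(n, 3))
r_target = 0.002 + R @ np.array([0.8, 0.0, 0.3]) + rng.normal(0.0, 0.02, n)

coef, t_stats = ols_tstats(r_target, R)
significant = np.abs(t_stats[1:]) > 2           # skip the intercept
print("t-stats:", np.round(t_stats, 2))
print("abs(t) > 2 for regressors:", significant)
```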
(c)#
Plot the returns of EEM along with the replication values.
3.#
The replication results in the previous problem may be overstated, because the parameters were estimated and the fit was evaluated on the same sample. This is known as in-sample fit.
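Using the estimates through 2022 (the \(\hat{\alpha}\) and \(\hat{\boldsymbol{\beta}}\) from the previous problem), calculate the out-of-sample (OOS) values of the replication using the 2023-2024 returns, denoted \(\boldsymbol{r}_t^{\text{oos}}\):
\[\hat{r}_t^{\targ} = \hat{\alpha} + \hat{\boldsymbol{\beta}}'\boldsymbol{r}_t^{\text{oos}}\]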
(a)#
What is the correlation between the OOS replication values, \(\hat{r}_t^{\targ}\), and the actual EEM returns over 2023-2024?
(b)#
How does this compare to the r-squared from the regression above based on in-sample data (through 2022)?
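The in-sample versus out-of-sample workflow can be sketched as follows: estimate the parameters on data through 2022, apply them unchanged to the 2023-2024 regressors, and correlate the result with the actual target. The monthly dates, return matrix, and coefficients are synthetic assumptions for illustration, not the spreadsheet data.

```python
import numpy as np

# Synthetic monthly data, 2018-01 through 2024-12 (NOT the spreadsheet data)
rng = np.random.default_rng(5)
dates = np.arange("2018-01", "2025-01", dtype="datetime64[M]")
n = len(dates)
R = rng.normal(0.005, 0.04, size=(n, 2))
r_target = 0.002 + R @ np.array([0.9, 0.4]) + rng.normal(0.0, 0.02, n)

in_sample = dates < np.datetime64("2023-01")    # through the end of 2022
oos = ~in_sample                                # 2023-2024

# Estimate on the in-sample period only
X_is = np.column_stack([np.ones(in_sample.sum()), R[in_sample]])
y_is = r_target[in_sample]
coef, *_ = np.linalg.lstsq(X_is, y_is, rcond=None)
resid = y_is - X_is @ coef
r2_is = 1.0 - (resid @ resid) / np.sum((y_is - y_is.mean()) ** 2)

# Apply the fixed estimates to the out-of-sample regressors
X_oos = np.column_stack([np.ones(oos.sum()), R[oos]])
fitted_oos = X_oos @ coef
corr_oos = np.corrcoef(fitted_oos, r_target[oos])[0, 1]

print(f"in-sample r-squared    = {r2_is:.4f}")
print(f"OOS correlation        = {corr_oos:.4f}")
print(f"OOS correlation (sq.)  = {corr_oos ** 2:.4f}")
```

Squaring the OOS correlation puts it on the same scale as the in-sample r-squared, which makes the comparison more direct.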