Data Dictionary
This document maps notebooks to their data dependencies, build files, and data characteristics.
Auto-generated from build_data/manifest.yml. To update, run python scripts/generate-data-dictionary.py.
Legend
Source Codes:
CRSP = Center for Research in Security Prices
BB = Bloomberg Terminal
DB = Databento
FRED = Federal Reserve Economic Data
Derived = Computed from other sources
Frequency Codes:
D = Daily
W = Weekly
M = Monthly
Q = Quarterly
Snap = Point-in-time snapshot
Build Files Reference
| Build File | Source | Years | Freq | Description |
|---|---|---|---|---|
| Build BB - GMO Returns | | | | TODO: add description |
| Build BB - ProShares Analysis | | | | TODO: add description |
| Build BB - SPX Stocks | | | | TODO: add description |
| Build BB - SPY Forecasting | | | | TODO: add description |
| Build PDR - Factor Pricing | | | | TODO: add description |
| Build PDR - FamaFrench | | | | TODO: add description |
| Build PDR - Momentum Portfolios | | | | TODO: add description |
| Build PDR - Value Portfolios | | | | TODO: add description |
| Build WRDS - Barnstable | | | | TODO: add description |
| Build WRDS - CRSP Market | | | | TODO: add description |
| Build WRDS - SPX Stocks | | | | TODO: add description |
| Build Yahoo - GMO | | | | TODO: add description |
| Build Yahoo - Global Indexes | | | | TODO: add description |
| Build Yahoo - Multi-Asset ETFs | | | | TODO: add description |
| Build Yahoo - PE Adjacent Funds | | | | TODO: add description |
| Build Yahoo - Risk Assets | | | | TODO: add description |
| Build Yahoo - SPY History | | | | TODO: add description |
| Build Yahoo - Sector ETFs | | | | TODO: add description |
| OLD Build BB - SPX Stocks | | | | TODO: add description |
| Process Midterm 1 | | | | TODO: add description |
| Process Midterm 2 | | | | TODO: add description |
Notes
{DATE} = Variable date in YYYY-MM-DD format
{DATE_NODASH} = Variable date in YYYYMMDD format (Databento)
{CONTRACT} = Futures contract code (e.g., FVM5, TYU5)
{TAG} = Identifier tag (e.g., 3m, M2025)
Snap = Point-in-time snapshot, not a time series
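The placeholders above act as filename templates. As a rough sketch of how such a template could be matched against real filenames, the snippet below turns each placeholder into a regular expression. The regex fragments, the function name template_to_regex, and the example filename are illustrative assumptions, not taken from build_data/manifest.yml or the generator script.

```python
import re

# Assumed regex shapes for the placeholders described in the notes.
PLACEHOLDER_PATTERNS = {
    "DATE": r"\d{4}-\d{2}-\d{2}",   # YYYY-MM-DD
    "DATE_NODASH": r"\d{8}",        # YYYYMMDD
    "CONTRACT": r"[A-Z]{2,3}\d",    # e.g. FVM5, TYU5 (shape is a guess)
    "TAG": r"[A-Za-z0-9]+",         # e.g. 3m, M2025 (shape is a guess)
}


def template_to_regex(template: str) -> re.Pattern:
    """Compile a filename template containing {PLACEHOLDER} tokens into a regex."""
    # With one capture group, re.split alternates literal text and placeholder names.
    parts = re.split(r"\{([A-Z_]+)\}", template)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))  # literal text, escaped
        else:
            # Placeholders not listed above (e.g. {SAMPLING}) match any path-free token.
            pieces.append(PLACEHOLDER_PATTERNS.get(part, r"[^/]+"))
    return re.compile("".join(pieces))


# Hypothetical template, used only to exercise the function.
pattern = template_to_regex("futures_{CONTRACT}_{DATE_NODASH}.xlsx")
print(bool(pattern.fullmatch("futures_FVM5_20250131.xlsx")))    # True
print(bool(pattern.fullmatch("futures_FVM5_2025-01-31.xlsx")))  # False
```

Using fullmatch rather than search keeps a template from matching a longer, unrelated filename by accident.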
Diagnostics
These checks help keep build_data/manifest.yml authoritative and the repo tidy.
Referenced files missing from the repo
| File | Referenced in | Expected folder | Manifest build file |
|---|---|---|---|
| crsp_corp_fin_2013.xlsx | 4.X.9. TA Discussion - CAPM, 6.X.8. TA Review - Multi-Factor Models | data | |
| factor_pricing_data.xlsx | C.6.0. Smart Beta and Factor Investing | data | |
| gmo_data.xlsx | 7.3. TA Review - Forecasting, C.7.0. GMO Forecasting | data/ or build_data/ | |
| ltcm exhibits data.xlsx | C.8.0. LTCM | data/ or build_data/ | |
| proshares analysis data.xlsx | C.2.0. ProShares Replication | data/ or build_data/ | |
| spx_weekly_returns.xlsx | E.1.2. Unconstrained Optimization | data | |
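A check like this one could be reproduced roughly as follows. This is a minimal sketch, not the generator's actual logic: it assumes notebook code has already been read into strings, and that data files appear as quoted .xlsx string literals. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Quoted string literal ending in .xlsx. Spaces are allowed because some
# referenced names contain them (e.g. "ltcm exhibits data.xlsx").
XLSX_LITERAL = re.compile(r"""["']([^"']*\.xlsx)["']""")


def extract_xlsx_refs(source: str) -> set[str]:
    """Return the base names of .xlsx files mentioned in a chunk of notebook code."""
    return {path.rsplit("/", 1)[-1] for path in XLSX_LITERAL.findall(source)}


def missing_references(code_by_notebook: dict[str, str], present: set[str]) -> dict[str, list[str]]:
    """Map each referenced-but-absent file to the sorted notebooks that mention it."""
    missing = defaultdict(list)
    for notebook, source in code_by_notebook.items():
        for name in extract_xlsx_refs(source):
            if name not in present:
                missing[name].append(notebook)
    return {name: sorted(notebooks) for name, notebooks in sorted(missing.items())}
```

Here present would be the set of filenames found in data/ (and build_data/, if both folders are searched). Templated references such as spx_returns_{SAMPLING}.xlsx would need separate handling, since they never match a literal filename.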
Data files present in data/ but not referenced by any notebook
| Data file | Manifest build file |
|---|---|
| commodity_factors.xlsx | |
| factor_pricing_data_monthly.xlsx | |
| global_index_data.xlsx | |
| gmo_returns_data.xlsx | |
| harvard_tips_exhibits.xlsx | |
| ltcm_exhibits_data.xlsx | |
| market_returns_dividend_price_ratio.xlsx | |
| midterm_1_fund_returns.xlsx | |
| midterm_1_stock_returns.xlsx | |
| proshares_analysis_data.xlsx | |
| reversal_data.xlsx | |
| spx_data_daily.xlsx | |
| spy_forecasting_data.xlsx | |
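Some entries here look like renamed versions of files reported as missing: ltcm_exhibits_data.xlsx and proshares_analysis_data.xlsx differ from the missing "ltcm exhibits data.xlsx" and "proshares analysis data.xlsx" only in spaces versus underscores. A hedged sketch of how such pairs might be surfaced automatically, using name normalization plus difflib from the standard library, is shown below. The helper names and the 0.6 cutoff are illustrative choices, not part of the existing tooling.

```python
import difflib
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase a filename and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def suggest_renames(missing: list[str], present: list[str], cutoff: float = 0.6) -> dict[str, Optional[str]]:
    """For each missing name, propose the most similar present file, or None."""
    by_normalized = {normalize(p): p for p in present}
    suggestions: dict[str, Optional[str]] = {}
    for name in missing:
        key = normalize(name)
        if key in by_normalized:
            # Exact match once spaces and case are normalized.
            suggestions[name] = by_normalized[key]
            continue
        # Otherwise fall back to fuzzy matching on the normalized names.
        close = difflib.get_close_matches(key, list(by_normalized), n=1, cutoff=cutoff)
        suggestions[name] = by_normalized[close[0]] if close else None
    return suggestions


print(suggest_renames(
    ["ltcm exhibits data.xlsx", "proshares analysis data.xlsx"],
    ["ltcm_exhibits_data.xlsx", "proshares_analysis_data.xlsx", "reversal_data.xlsx"],
))
```

Fuzzy matches are only candidates. A suggestion should be confirmed against the notebook before renaming a file or editing a reference.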
Data files in data/ not covered by any manifest output pattern
If these are used, add the appropriate output pattern(s) to build_data/manifest.yml.
barnstable_analysis_data.xlsx
commodity_factors.xlsx
crsp_market_data.xlsx
dfa_analysis_data.xlsx
factor_pricing_data_monthly.xlsx
factor_pricing_data_weekly.xlsx
global_index_data.xlsx
gmo_analysis_data.xlsx
gmo_returns.xlsx
gmo_returns_data.xlsx
gmo_returns_weekly.xlsx
harvard_tips_exhibits.xlsx
ltcm_exhibits_data.xlsx
market_returns_dividend_price_ratio.xlsx
midterm_1_fund_returns.xlsx
midterm_1_stock_returns.xlsx
momentum_data.xlsx
multi_asset_etf_data.xlsx
port_decomp_example.xlsx
private_equity_data.xlsx
proshares_analysis_data.xlsx
reversal_data.xlsx
risk_etf_data.xlsx
sector_etf_data.xlsx
single_stock_data.xlsx
spx_data_daily.xlsx
spx_data_weekly.xlsx
spx_returns_weekly.xlsx
spy_data.xlsx
spy_forecasting_data.xlsx
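To illustrate what "covered by an output pattern" could mean, the sketch below treats each pattern as a glob and reports the files that match none of them. The manifest schema is not shown in this document, so the list of patterns is assumed to have been read from build_data/manifest.yml already. Replacing {PLACEHOLDER} tokens with a wildcard is likewise an assumption about how patterns are written.

```python
import re
from fnmatch import fnmatchcase


def to_glob(pattern: str) -> str:
    """Replace {PLACEHOLDER} tokens with '*' so the pattern can be used as a glob."""
    return re.sub(r"\{[A-Z_]+\}", "*", pattern)


def uncovered_files(files: list[str], output_patterns: list[str]) -> list[str]:
    """Return the files that match none of the output patterns, sorted by name."""
    globs = [to_glob(p) for p in output_patterns]
    return sorted(f for f in files if not any(fnmatchcase(f, g) for g in globs))


# Hypothetical patterns, for illustration only.
patterns = ["factor_pricing_data_{TAG}.xlsx", "spy_data.xlsx"]
files = ["spy_data.xlsx", "factor_pricing_data_weekly.xlsx", "reversal_data.xlsx"]
print(uncovered_files(files, patterns))  # ['reversal_data.xlsx']
```

fnmatchcase is used instead of fnmatch so that matching stays case-sensitive on every platform.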
Notebook references not covered by build_data/manifest.yml
| Reference | Referenced in |
|---|---|
| barnstable_analysis_data.xlsx | C.3.0. Barnstable and Long-Run Risk |
| crsp_corp_fin_2013.xlsx | 4.X.9. TA Discussion - CAPM, 6.X.8. TA Review - Multi-Factor Models |
| crsp_market_data.xlsx | 7.2. Long-Horizon Prediction with Persistent Signals, 9.2. Tail Risk and Short-Term Capital Management |
| dfa_analysis_data.xlsx | 6.X.8. TA Review - Multi-Factor Models, C.4.0. DFA and Factor Investing |
| factor_pricing_data.xlsx | C.6.0. Smart Beta and Factor Investing |
| factor_pricing_data_weekly.xlsx | E.7.2. Forecasting with LFPM’s |
| factor_pricing_data_{SAMPLING}.xlsx | E.6.1. Single-Stock Factor Pricing |
| factor_pricing_data_{TAG_FREQUENCY}.xlsx | 6.X.1. Factor Models and Tangency Portfolios |
| gmo_analysis_data.xlsx | 8.1. TA Review |
| gmo_data.xlsx | 7.3. TA Review - Forecasting, C.7.0. GMO Forecasting |
| gmo_returns.xlsx | C.7.2. GMO Performance |
| gmo_returns_weekly.xlsx | C.7.2. GMO Performance |
| ltcm exhibits data.xlsx | C.8.0. LTCM |
| momentum_data.xlsx | 6.X.9. TA Review - Momentum, C.6.1. AQR Momentum Strategies |
| multi_asset_etf_data.xlsx | 5.1. Practical Optimization, C.1.0. Harvard’s Endowment |
| port_decomp_example.xlsx | E.2.1 Replicating Regressions |
| private_equity_data.xlsx | E.2.2. Decomposing PE |
| proshares analysis data.xlsx | C.2.0. ProShares Replication |
| risk_etf_data.xlsx | 1.1. Risk and Return Metrics, 1.2. Optimizing Risk and Return, 1.X.2. MV Optimization via Regression, 2.1. Linear Factor Decomposition, 3.1. Value-at-Risk, 3.X.1. Coherent Risk Measures, 3.X.9. TA Discussion - VaR and Barnstable, 5.2. Managing Tail Risk, E.1.1. Risk Metrics |
| sector_etf_data.xlsx | 2.2. LFD for Dimension Reduction |
| single_stock_data.xlsx | 5.2. Managing Tail Risk |
| spx_data_weekly.xlsx | E.8.1. Forecasting with Fundamentals |
| spx_returns_weekly.xlsx | 1.3. MV of S&P500, 2.2. LFD for Dimension Reduction, 3.X.9. TA Discussion - VaR and Barnstable, E.1.1. Risk Metrics, E.1.2. Unconstrained Optimization, E.1.3. Constrained Optimization, E.3.1. VaR of Equity Portfolio, E.4.0. Compensated Risk |
| spx_returns_{SAMPLING}.xlsx | E.6.1. Single-Stock Factor Pricing |
| spx_weekly_returns.xlsx | E.1.2. Unconstrained Optimization |
| spy_data.xlsx | 9.2. Tail Risk and Short-Term Capital Management, C.8.0. LTCM |
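This listing is keyed by file, but when fixing a single notebook it can help to see the same data keyed by notebook. A small sketch of that inversion follows. It assumes the "Referenced in" cell is a comma-and-space separated list and that no notebook title contains a comma, which holds for the entries shown here but is not guaranteed in general. The function name is hypothetical.

```python
from collections import defaultdict


def invert_references(rows: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Turn (reference, 'nb1, nb2, ...') rows into a notebook -> sorted files mapping."""
    by_notebook = defaultdict(set)
    for reference, referenced_in in rows:
        for notebook in referenced_in.split(", "):
            by_notebook[notebook.strip()].add(reference)
    return {notebook: sorted(files) for notebook, files in sorted(by_notebook.items())}


# A few rows copied from the listing above.
rows = [
    ("crsp_market_data.xlsx", "7.2. Long-Horizon Prediction with Persistent Signals, 9.2. Tail Risk and Short-Term Capital Management"),
    ("ltcm exhibits data.xlsx", "C.8.0. LTCM"),
    ("spy_data.xlsx", "9.2. Tail Risk and Short-Term Capital Management, C.8.0. LTCM"),
]
for notebook, files in invert_references(rows).items():
    print(notebook, "->", files)
```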