Calculating the Fama-French Factor Loadings
Note: This page contains the data source links and source code used in my “Fama-French Factor Loadings for Popular ETFs” post and my “Fundamental Indexing: Up and Running for 5 Years” post.
If you are looking for a detailed tutorial on how to run the Fama-French regressions using R, then check out my screencast here.
Data:
The data for the Fama-French factors and the Fama-French 25 Portfolios comes from the Kenneth French website. I removed the header information from these files, and I removed the extra data (everything except the monthly value-weighted returns) from the text file for the Fama-French 25 Portfolios. The data for the ETFs analyzed is downloaded automatically from Yahoo! Finance.
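The header removal described above can also be done in code. Here is a minimal sketch, assuming the monthly rows of the factor file are the only lines that begin with a six-digit YYYYMM date. The helper name `read_monthly_rows` is hypothetical and the sample numbers are made up for illustration; they are not taken from the Kenneth French files. The 25 Portfolios file contains more than one block of data, so it would need an additional step to keep only the monthly value-weighted block.

```r
# Hypothetical helper: keep only lines that start with a YYYYMM date.
read_monthly_rows <- function(lines) {
  monthly <- grep("^[[:space:]]*[0-9]{6}[[:space:]]", lines, value = TRUE)
  read.table(text = monthly,
             col.names = c("date", "rmrf", "smb", "hml", "rf"))
}

# Made-up sample in the assumed layout: header text, monthly rows, annual rows.
sample_lines <- c(
  "Some descriptive header text",
  "",
  "          Mkt-RF     SMB     HML      RF",
  "200011    -1.00    2.00    3.00    0.50",
  "200012     1.50   -0.50    1.00    0.50",
  "",
  " Annual Factors: January-December",
  "  2000    -5.00    4.00    6.00    5.00"
)

ff <- read_monthly_rows(sample_lines)
print(ff)  # two monthly rows; header and annual rows are dropped
```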
Code:
The Fama-French factor loadings for the ETFs were calculated using the R script shown here. This script can be used to calculate the Fama-French factor loadings for any mutual fund or stock by changing the ticker symbol used in the script. The date range can also be modified by adjusting the starting and ending year and month (startyear, startmonth, endyear, and endmonth near the top of the script) and the starting date for the Yahoo! Finance query (the start argument of get.hist.quote).
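The script turns a year and month into a row number of the factor file with the expression (year - 1926.5)*12 + month. Here is a small sketch of that arithmetic, assuming the first row of the file is July 1926 (which the 1926.5 offset implies). The function name `ff_row` is mine and is not part of the script.

```r
# Row of a given year/month in a monthly file whose first row is July 1926.
# (year - 1926.5) * 12 + month equals 12 * (year - 1926) + month - 6.
ff_row <- function(year, month) {
  (year - 1926.5) * 12 + month
}

first_row <- ff_row(2000, 11)          # November 2000
last_row  <- ff_row(2010, 10)          # October 2010
n_months  <- last_row - first_row + 1  # number of monthly observations

print(c(first_row = first_row, last_row = last_row, n_months = n_months))
```

With the default dates this selects 120 rows, which matches the 116 residual degrees of freedom reported in the regression results (120 observations minus 4 estimated coefficients).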
It is useful to run the regression for a broad ETF or fund (such as SPY) after changing the date range. If the dates are aligned correctly, the R-squared should be close to 0.99 for a broad index. Note that the script matches fund returns to factor returns by position rather than by date, so if the date specified in the Yahoo! query is prior to the earliest date available for the fund, the regression inputs will be misaligned. This shows up as an error in the output or a very low R-squared.
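To illustrate why the R-squared is a good alignment check, here is a minimal sketch using simulated returns (not real data). The coefficient 0.96 and the noise levels are arbitrary assumptions chosen to mimic a broad-market fund; the only point is that shifting one series by a single month destroys the fit.

```r
set.seed(42)  # simulated data only, not real returns

n <- 121
rmrf <- rnorm(n, mean = 0.005, sd = 0.045)            # simulated market excess return
fund <- 0.96 * rmrf + rnorm(n, mean = 0, sd = 0.005)  # simulated broad-market fund

y         <- fund[1:120]
x_aligned <- rmrf[1:120]  # same months as the fund returns
x_shifted <- rmrf[2:121]  # off by one month

r2_aligned <- summary(lm(y ~ x_aligned))$r.squared
r2_shifted <- summary(lm(y ~ x_shifted))$r.squared

print(c(aligned = r2_aligned, shifted = r2_shifted))
```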
A second script, written in Octave, is also listed below. It calculates the factor loadings and historical returns of the Fama-French 25 portfolios, which are useful for comparison with the ETF regression results.
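The Octave script can report historical returns as either arithmetic or geometric monthly means. Here is a short sketch of the difference in R, using made-up percentage returns; the geometric mean formula mirrors the one in the Octave script.

```r
# Made-up monthly returns in percent, for illustration only.
r_pct <- c(10, -10, 5, -5)

# Arithmetic mean: simple average of the monthly returns.
arith_mean <- mean(r_pct)

# Geometric mean: the constant monthly return that compounds to the same total.
geo_mean <- 100 * (exp(mean(log(1 + r_pct / 100))) - 1)

print(c(arithmetic = arith_mean, geometric = geo_mean))
```

The geometric mean is never larger than the arithmetic mean, and the gap grows with the volatility of the returns.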
As a supplement to the data provided in the post, I’ve also included the detailed regression results at the bottom of this post.
R-Code

    # Goal: Using data from Yahoo Finance, estimate the Fama-French factor
    # loadings for any security using monthly returns
    library(tseries)  # get.hist.quote()
    library(zoo)      # na.locf(), as.yearmon(), coredata()

    # Load FF factor returns
    startyear = 2000;
    startmonth = 11;
    endyear = 2010;
    endmonth = 10;

    # The factor file starts in July 1926, hence the 1926.5 offset
    start = (startyear - 1926.5)*12 + startmonth;
    stop = (endyear - 1926.5)*12 + endmonth;

    ff_returns = read.table("F-F_Factors_monthly.txt")
    rmrf = ff_returns[start:stop,2]/100
    smb = ff_returns[start:stop,3]/100
    hml = ff_returns[start:stop,4]/100
    rf = ff_returns[start:stop,5]/100

    # Load Fund Data
    prices <- get.hist.quote("VTI", quote="Adj", start="2000-10-30", retclass="zoo")
    prices <- na.locf(prices)  # Copy last traded price when NA

    # To make monthly returns, you must have this incantation:
    monthly.prices <- aggregate(prices, as.yearmon, tail, 1)

    # Convert monthly prices to monthly returns
    r <- diff(log(monthly.prices))
    r1 <- exp(r) - 1

    # Now shift out of zoo to become an ordinary matrix --
    rj <- coredata(r1)
    rj <- rj[1:length(rf)]  # keep as many months as the factor data (120 by default)
    rjrf <- rj - rf

    d <- lm(rjrf ~ rmrf + smb + hml)  # FF model estimation.
    print(summary(d))
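The R script above takes differences of log prices and then converts them back with exp(r) - 1. Here is a small sketch, with made-up prices, showing that this gives the ordinary simple monthly return (price ratio minus one), which is what gets regressed after the risk-free rate is subtracted.

```r
# Made-up month-end prices, for illustration only.
p <- c(100, 110, 99, 103.95)

log_ret    <- diff(log(p))               # continuously compounded returns
simple_ret <- exp(log_ret) - 1           # conversion used in the script
direct     <- p[-1] / p[-length(p)] - 1  # price ratio minus one

print(round(simple_ret, 4))
print(isTRUE(all.equal(simple_ret, direct)))
```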
Octave Code for Calculating Factor Loadings and Returns for Fama-French 25 Portfolios

    clear all;  % clear data from Octave
    close all;  % close all open plot windows

    % Load Fama-French Data
    ff_data = load('25_Portfolios_5x5_monthly_2.txt');

    % Load FF Factor Mimicking Portfolios
    ff_facts = load('F-F_Factors_monthly.txt');

    % Starting point changed to January 1932 to avoid missing data
    ff_data = ff_data(67:end,:);       % start after NAs end
    ff_facts = ff_facts(67:end-1,:);   % start after NAs end, factors had one extra sample, so used end-1

    % Remove date column
    r = ff_data(:,2:end);

    % Remove date column (Rm-Rf, SMB, HML and RF remain)
    ff3f = ff_facts(:,2:end);

    % Prompt for User Input to get plotting range
    startyear = input('Enter Starting Year between 1932 and 2010: ')
    startmonth = input('Enter Starting Month 1-12: ')
    endyear = input('Enter Ending Year between 1932 and 2010: ')
    endmonth = input('Enter Ending Month 1-12: ')
    plottitle = input('Enter Title for Plot: ','s')

    % Calculate starting and ending row
    start = 12*(startyear - 1932) + startmonth;
    endpoint = 12*(endyear-1932) + endmonth;

    % Extract Desired Data
    r = r(start:endpoint,:);
    ff3f = ff3f(start:endpoint,:);
    rmrf = ff3f(:,1)/100;
    smb = ff3f(:,2)/100;
    hml = ff3f(:,3)/100;
    rf = ff3f(:,4)/100;

    % Run 25 Fama-French Regressions
    rx = r./100 - repmat(rf,1,25);

    % Run FF regressions on all portfolios
    K = 3
    T = size(rx,1)
    X = [ones(T,1) rmrf hml smb];
    b = X\rx;
    e = rx-X*b;
    sigma = cov(e);
    u = rx-X*b;
    s2 = (T-1)/(T-K-1)*var(u)';     % this is a vector of the variance of the errors
    mx = inv(X'*X);
    dmx = diag(mx);                 % we're interested in standard errors,
                                    % the diagonals of the covariance matrix of bs
    siga = (s2*dmx(1)).^0.5;        % std err of alpha, beta
    sigb = (s2*dmx(2:end)').^0.5;   % s2 is a column vector of 25. dmx' is a
                                    % row vector corresponding to factors.
                                    % this produces a matrix the same size as
                                    % the b coefficients.
    sig_beta = sigb(:,1);
    sig_h = sigb(:,2);
    sig_s = sigb(:,3);
    R2 = 1-s2./(std(rx).^2)';

    % Pull out the regression factors
    ff_alpha = b(1,:);
    ff_beta = b(2,:);
    h = b(3,:);
    s = b(4,:);

    % Calculate Arithmetic Mean for each of 25 portfolios over range
    arithmeans = mean(r);

    % Calculate Geometric Mean for each of 25 portfolios over selected range
    georeturns = r./100 + 1;
    geomeans = 100*(exp(mean(log(georeturns)))-1);

    % Select if Geometric or Arithmetic mean is used by adjusting comments
    %meanreturns = arithmeans;  % uncomment to use arithmetic means
    meanreturns = geomeans;     % uncomment to use geometric means

    % Expand 5x5 data to 10x10 for use in surface plot function
    returns = [meanreturns ; meanreturns];
    returns = reshape(returns,10,5);
    returns = [returns;returns]
    returns = reshape(returns,10,10);

    % beta can be used for surface plot of beta
    beta_ff = [ff_beta;ff_beta]
    beta_ff = reshape(beta_ff,10,5);
    beta_ff = [beta_ff;beta_ff];
    beta_ff = reshape(beta_ff,10,10);

    % s; s_ff can be used for surface plot of size factor
    s_ff = [s;s]
    s_ff = reshape(s_ff,10,5);
    s_ff = [s_ff;s_ff];
    s_ff = reshape(s_ff,10,10);

    % h; h_ff can be used for surface plot of value factor
    h_ff = [h;h]
    h_ff = reshape(h_ff,10,5);
    h_ff = [h_ff;h_ff];
    h_ff = reshape(h_ff,10,10);

    % Define x and y values
    x = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];
    y = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];

    % Create x-y mesh for surface plot
    [xx,yy] = meshgrid(x,y);

    % Generate Plot
    surf(xx,yy,returns)
    xlabel('Size','fontsize',20)
    ylabel('Value','fontsize',20)
    %zlabel('Arithmetic Average Monthly Return (%)','rotation',90,'fontsize',20)
    %zlabel('Geometric Average Monthly Return (%)','rotation',90,'fontsize',20)
    title(plottitle,'fontsize',36)
    axis([0 5 0 5 min(0,min(meanreturns)-.1) max(2,max(meanreturns)+0.01)])

    % Size Labels for corner portfolios
    line([4.5 4.5],[0.5 0.5],[meanreturns(21) meanreturns(21)+0.1])
    text(4.5,0.5,meanreturns(21)+0.15,'LG','horizontalalignment','center','fontsize',18)
    line([4.5 4.5],[4.5 4.5],[meanreturns(25) meanreturns(25)+0.1])
    text(4.5,4.5,meanreturns(25)+0.15,'LV','horizontalalignment','center','fontsize',18)
    line([0.5 0.5],[4.5 4.5],[meanreturns(5) meanreturns(5)+0.1])
    text(0.5,4.5,meanreturns(5)+0.15,'SV','horizontalalignment','center','fontsize',18)
    line([0.5 0.5],[0.5 0.5],[meanreturns(1) meanreturns(1)+0.1])
    text(0.5,0.5,meanreturns(1)+0.15,'SG','horizontalalignment','center','fontsize',18)

    % ETFs
    line([4.55 4.55],[1.1 1.1],[meanreturns(22) meanreturns(22)+0.15])
    text(4.55,1.1,meanreturns(22)+0.18,'SPY','horizontalalignment','center','fontsize',18)
    line([4.75 4.75],[1.25 1.25],[meanreturns(22) meanreturns(22)+0.1])
    text(4.75,1.25,meanreturns(22)+0.15,'DIA','horizontalalignment','center','fontsize',18)
    line([3.5 3.5],[0.1 0.1],[meanreturns(16) meanreturns(16)+0.1])
    text(3.5,0.1,meanreturns(16)+0.15,'QQQQ','horizontalalignment','center','fontsize',18)
    line([4.5 4.5],[1.5 1.5],[meanreturns(22) meanreturns(22)+0.18])
    text(4.50,1.5,meanreturns(22)+0.2,'IVE*','horizontalalignment','center','fontsize',18)
    line([1.6 1.6],[1.7 1.7],[meanreturns(7) meanreturns(7)+0.25])
    text(1.6,1.7,meanreturns(7)+0.3,'IWM*','horizontalalignment','center','fontsize',18)
    line([1.7 1.7],[3.55 3.55],[meanreturns(9) meanreturns(9)+0.1])
    text(1.7,3.55,meanreturns(9)+0.15,'IWN','horizontalalignment','center','fontsize',18)
    line([1.65 1.65],[1.85 1.85],[meanreturns(7) meanreturns(7)+0.1])
    text(1.65,1.85,meanreturns(7)+0.15,'IJR','horizontalalignment','center','fontsize',18)
    line([1.6 1.6],[2.85 2.85],[meanreturns(8) meanreturns(8)+0.1])
    text(1.6,2.85,meanreturns(8)+0.15,'IJS','horizontalalignment','center','fontsize',18)
    line([2.9 2.9],[4.6 4.6],[meanreturns(15) meanreturns(15)+0.1])
    text(2.9,4.6,meanreturns(15)+0.15,'IYR','horizontalalignment','center','fontsize',18)
    line([3.4 3.4],[1.25 1.25],[meanreturns(19) meanreturns(19)+0.1])
    text(3.4,1.25,meanreturns(19)+0.15,'MDY','horizontalalignment','center','fontsize',18)

    % Color range set from 0 to 1.6 rather than allowing autoscale.
    % This is done for easier comparison between plots, but colors will
    % max out for values above 1.6 or below 0.
    % For arithmetic averages, I think a range of 0 to 2 works better
    caxis([0 1.6]);
    view(50, 25);
    % top view
    %view(270,90);
    replot  % needed by older Octave versions; newer versions may not provide replot
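For readers who work in R rather than Octave, the regression step of the script above (coefficients from X\rx, then standard errors from the diagonal of inv(X'*X)) can be sketched as follows. This is an illustration on simulated data for a single portfolio, not a port of the full script; the true coefficients and noise levels are arbitrary assumptions, and `lm` is used only as a cross-check.

```r
set.seed(1)  # simulated data only, not real returns

T_obs <- 120
rmrf <- rnorm(T_obs, 0.005, 0.045)
hml  <- rnorm(T_obs, 0.003, 0.030)
smb  <- rnorm(T_obs, 0.002, 0.030)
rx   <- 0.001 + 1.0 * rmrf + 0.3 * hml + 0.5 * smb + rnorm(T_obs, 0, 0.01)

X <- cbind(1, rmrf, hml, smb)        # same column order as the Octave script
XtX_inv <- solve(t(X) %*% X)
b <- XtX_inv %*% t(X) %*% rx         # OLS coefficients: alpha, beta, h, s
u <- rx - X %*% b                    # residuals
K <- 3                               # number of factors
s2 <- sum(u^2) / (T_obs - K - 1)     # residual variance
se <- sqrt(s2 * diag(XtX_inv))       # standard errors of the coefficients
tstat <- as.vector(b) / se

# Cross-check against R's built-in linear model.
fit <- summary(lm(rx ~ rmrf + hml + smb))$coefficients
print(cbind(estimate = as.vector(b), std_error = se, t_value = tstat))
```

Because the regression includes an intercept, the residuals have zero mean, so sum(u^2)/(T-K-1) equals the (T-1)/(T-K-1)*var(u) expression used in the Octave script.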
Regression Result Details:
ETF Regressions (10-yr Monthly; November 2000 thru October 2010):
SPY:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0008725 0.0005321 -1.640 0.104
rmrf 0.9620172 0.0110702 86.901 < 2e-16 ***
smb -0.1292276 0.0200597 -6.442 2.78e-09 ***
hml 0.0120373 0.0157662 0.763 0.447
—
Residual standard error: 0.005643 on 116 degrees of freedom
Multiple R-squared: 0.9862, Adjusted R-squared: 0.9858
F-statistic: 2763 on 3 and 116 DF, p-value: < 2.2e-16
DIA:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009774 0.0013106 0.746 0.45730
rmrf 0.8925274 0.0272691 32.730 < 2e-16 ***
smb -0.2162676 0.0494128 -4.377 2.65e-05 ***
hml 0.1189867 0.0388367 3.064 0.00272 **
—
Residual standard error: 0.0139 on 116 degrees of freedom
Multiple R-squared: 0.9074, Adjusted R-squared: 0.905
F-statistic: 378.7 on 3 and 116 DF, p-value: < 2.2e-16
QQQQ:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0005339 0.0021967 0.243 0.80839
rmrf 1.3260511 0.0457057 29.013 < 2e-16 ***
smb 0.2930439 0.0828208 3.538 0.00058 ***
hml -0.9161538 0.0650942 -14.074 < 2e-16 ***
—
Residual standard error: 0.0233 on 116 degrees of freedom
Multiple R-squared: 0.9196, Adjusted R-squared: 0.9175
F-statistic: 442.3 on 3 and 116 DF, p-value: < 2.2e-16
IVE:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0020639 0.0009668 -2.135 0.0349 *
rmrf 0.9947619 0.0201148 49.454 < 2e-16 ***
smb -0.0528948 0.0364488 -1.451 0.1494
hml 0.2691649 0.0286475 9.396 6.19e-16 ***
—
Residual standard error: 0.01025 on 116 degrees of freedom
Multiple R-squared: 0.9601, Adjusted R-squared: 0.9591
F-statistic: 931.3 on 3 and 116 DF, p-value: < 2.2e-16
IWM:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0019862 0.0008238 -2.411 0.0175 *
rmrf 0.9759772 0.0171401 56.941 < 2e-16 ***
smb 0.8246550 0.0310586 26.552 < 2e-16 ***
hml 0.1899842 0.0244109 7.783 3.25e-12 ***
—
Residual standard error: 0.008737 on 116 degrees of freedom
Multiple R-squared: 0.9804, Adjusted R-squared: 0.9799
F-statistic: 1935 on 3 and 116 DF, p-value: < 2.2e-16
IWN:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001577 0.001138 -1.386 0.169
rmrf 0.876021 0.023686 36.984 <2e-16 ***
smb 0.756015 0.042920 17.614 <2e-16 ***
hml 0.606810 0.033734 17.988 <2e-16 ***
—
Residual standard error: 0.01207 on 116 degrees of freedom
Multiple R-squared: 0.9588, Adjusted R-squared: 0.9577
F-statistic: 899 on 3 and 116 DF, p-value: < 2.2e-16
IJR:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001384 0.001156 -1.197 0.234
rmrf 0.913679 0.024063 37.971 < 2e-16 ***
smb 0.795130 0.043603 18.236 < 2e-16 ***
hml 0.301977 0.034270 8.812 1.43e-14 ***
—
Residual standard error: 0.01227 on 116 degrees of freedom
Multiple R-squared: 0.958, Adjusted R-squared: 0.957
F-statistic: 882.8 on 3 and 116 DF, p-value: < 2.2e-16
IJS:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001711 0.001294 -1.322 0.189
rmrf 0.914892 0.026932 33.971 <2e-16 ***
smb 0.845430 0.048802 17.324 <2e-16 ***
hml 0.483571 0.038356 12.607 <2e-16 ***
—
Residual standard error: 0.01373 on 116 degrees of freedom
Multiple R-squared: 0.9511, Adjusted R-squared: 0.9499
F-statistic: 752.7 on 3 and 116 DF, p-value: < 2.2e-16
IYR:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0007802 0.0039314 0.198 0.8430
rmrf 0.9250019 0.0817982 11.308 < 2e-16 ***
smb 0.4062537 0.1482219 2.741 0.0071 **
hml 0.8909395 0.1164971 7.648 6.54e-12 ***
—
Residual standard error: 0.04169 on 116 degrees of freedom
Multiple R-squared: 0.6629, Adjusted R-squared: 0.6542
F-statistic: 76.05 on 3 and 116 DF, p-value: < 2.2e-16
MDY:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009243 0.0011596 0.797 0.427
rmrf 0.9659343 0.0241262 40.037 < 2e-16 ***
smb 0.3705806 0.0437178 8.477 8.49e-14 ***
hml 0.1572083 0.0343607 4.575 1.20e-05 ***
—
Residual standard error: 0.0123 on 116 degrees of freedom
Multiple R-squared: 0.9501, Adjusted R-squared: 0.9488
F-statistic: 736.8 on 3 and 116 DF, p-value: < 2.2e-16
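The columns in the ETF regression output above are related by simple arithmetic: the t value is the estimate divided by its standard error, and the p-value is the two-sided tail probability of a t distribution with the residual degrees of freedom. Here is a quick sketch using the printed SPY numbers; small differences from the printed values are expected because the inputs are rounded.

```r
# SPY market-factor row from the output above.
estimate  <- 0.9620172
std_error <- 0.0110702
t_value   <- estimate / std_error

# Two-sided p-value for the SPY intercept (t = -1.640, 116 degrees of freedom).
p_intercept <- 2 * pt(-abs(-1.640), df = 116)

print(round(t_value, 2))
print(round(p_intercept, 3))
```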
Fama-French 25 Portfolios – Regression Results (Nov. 2000 – October 2010)
ALPHA | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | -0.68% | -0.05% | 0.01% | 0.02% | 0.16% |
2 | -0.22% | 0.03% | 0.18% | -0.09% | -0.30% |
3 | -0.08% | 0.11% | 0.30% | 0.17% | 0.37% |
4 | 0.17% | 0.20% | 0.01% | 0.15% | -0.09% |
Large | 0.04% | 0.12% | -0.11% | -0.27% | -0.18% |
ALPHA (t-stat) | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | -2.41 | -0.28 | 0.07 | 0.16 | 0.96 |
2 | -1.42 | 0.17 | 1.36 | -0.67 | -2.06 |
3 | -0.58 | 0.71 | 1.83 | 0.88 | 1.69 |
4 | 1.40 | 1.28 | 0.03 | 0.75 | -0.43 |
Large | 0.41 | 0.86 | -0.69 | -2.06 | -0.68 |
BETA | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | 1.18 | 0.98 | 0.83 | 0.77 | 0.98 |
2 | 1.09 | 0.90 | 0.84 | 0.88 | 1.04 |
3 | 1.09 | 0.96 | 0.89 | 0.93 | 0.97 |
4 | 1.06 | 0.96 | 1.02 | 0.98 | 1.11 |
Large | 0.95 | 0.86 | 0.90 | 0.90 | 1.07 |
BETA (t-stat) | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | 20.003 | 28.263 | 27.927 | 24.682 | 28.526 |
2 | 33.474 | 26.726 | 30.813 | 31.303 | 33.793 |
3 | 40.148 | 29.188 | 25.975 | 22.508 | 21.314 |
4 | 41.506 | 29.263 | 24.257 | 23.706 | 26.411 |
Large | 52.533 | 30.54 | 28.014 | 33.053 | 19.277 |
h | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | -0.38 | 0.03 | 0.34 | 0.52 | 0.72 |
2 | -0.37 | 0.16 | 0.43 | 0.58 | 0.90 |
3 | -0.45 | 0.16 | 0.41 | 0.58 | 0.68 |
4 | -0.38 | 0.27 | 0.45 | 0.54 | 0.83 |
Large | -0.28 | 0.20 | 0.30 | 0.56 | 0.60 |
h (t-stat) | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | -4.52 | 0.61 | 8.08 | 11.65 | 14.77 |
2 | -7.91 | 3.25 | 10.96 | 14.29 | 20.67 |
3 | -11.55 | 3.47 | 8.28 | 9.83 | 10.51 |
4 | -10.31 | 5.78 | 7.53 | 9.17 | 13.87 |
Large | -10.96 | 5.04 | 6.59 | 14.36 | 7.59 |
s | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | 1.18 | 1.07 | 0.94 | 0.98 | 1.06 |
2 | 1.01 | 0.98 | 0.92 | 0.89 | 1.12 |
3 | 0.74 | 0.57 | 0.47 | 0.43 | 0.66 |
4 | 0.45 | 0.32 | 0.28 | 0.28 | 0.19 |
Large | -0.19 | -0.12 | -0.02 | -0.10 | -0.23 |
s (t-stat) | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | 11.11 | 17.09 | 17.53 | 17.47 | 17.08 |
2 | 17.15 | 16.12 | 18.43 | 17.47 | 20.16 |
3 | 14.96 | 9.57 | 7.61 | 5.76 | 7.98 |
4 | 9.63 | 5.37 | 3.66 | 3.80 | 2.49 |
Large | -5.95 | -2.40 | -0.32 | -2.08 | -2.29 |
R2 | Growth | 2 | 3 | 4 | Value |
---|---|---|---|---|---|
Small | 0.87 | 0.95 | 0.96 | 0.96 | 0.96 |
2 | 0.93 | 0.93 | 0.92 | 0.91 | 0.90 |
3 | 0.93 | 0.94 | 0.90 | 0.87 | 0.88 |
4 | 0.93 | 0.95 | 0.87 | 0.87 | 0.92 |
Large | 0.94 | 0.96 | 0.87 | 0.89 | 0.79 |
Can you calculate the factor loadings of RSP? I am curious about its outperformance despite holding the same stocks as SPY.
Hello. I am currently studying the Fama-French model as part of my dissertation.
I am getting great benefit from the information on this website.
However, I have run into some difficulty with the above Octave code.
In line 38, the dimensions of r are different from the dimensions of repmat(rf, 1, 25).
The data for matrix r comes from the 4 columns of the factor file, while the data for rf comes from the data of 5 portfolios and their intersection to generate 25 portfolios.
How can I resolve this issue so that the dimensions of r are the same as those of repmat(rf,1,25) and the code can work?
Thank you very much,
Patrick O’Rourke