Calculating the Fama-French Factor Loadings
Note: This page contains the data source links and source code used in my “Fama-French Factor Loadings for Popular ETFs” post and my “Fundamental Indexing: Up and Running for 5 Years” post.
If you are looking for a detailed tutorial on how to run the Fama-French regressions using R, then check out my screencast here.
Data:
The data for the Fama-French factors and the Fama-French 25 Portfolios comes from the Kenneth French website. I removed the header information from these files, and I removed the extra data (everything except the monthly value-weighted returns) from the text file for the Fama-French 25 Portfolios. The data for the ETFs analyzed is downloaded automatically from Yahoo! Finance.
Code:
The Fama-French factor loadings for the ETFs were calculated using the R script shown here. The script can be used to calculate the Fama-French factor loadings for any mutual fund or stock by changing the ticker symbol. The date range can be modified by adjusting startyear, startmonth, endyear, and endmonth near the top of the script, and by changing the start date in the Yahoo! Finance query (the get.hist.quote call) so that it falls just before the first month of the range. The number of returns kept in rj[1:120] must also match the number of months in the range.
It is useful to run the regression for a broad ETF or fund (such as SPY) after changing the date range. If the dates are aligned correctly, the R-squared should be close to 0.99 for a broad index. If the start date in the Yahoo! query is earlier than the first date available for the fund, the fund returns will be misaligned with the factor returns, and this will show up as an error in the output or a very low R-squared.
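The row arithmetic behind start and stop in the R script can be checked on its own. The sketch below is in Python rather than R, and the helper name ff_row is an illustration, not part of the script. It assumes the factor file has had its header removed and that its first row is July 1926, which is what the 1926.5 offset in the script implies.

```python
def ff_row(year, month):
    """1-based row of a given month, assuming row 1 of the file is July 1926."""
    return (year - 1926) * 12 + month - 6

# Same range as the script: November 2000 through October 2010
start = ff_row(2000, 11)
stop = ff_row(2010, 10)
n_months = stop - start + 1

print(start, stop, n_months)  # 893 1012 120
```

The 120 rows selected here are the same count the script keeps with rj[1:120], so a different date range needs a different length there as well.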
A second script, written in Octave, is also listed below. This script is used to calculate the factor loadings and historical returns of the Fama-French 25 portfolios. This data is useful for comparing to the ETF regression results.
As a supplement to the data provided in the post, I’ve also included the detailed regression results at the bottom of this post.
R-Code
# Goal: Using data from Yahoo! Finance, estimate the Fama-French factor loadings
# for any security using monthly returns
library(tseries)
# Load FF factor returns
startyear = 2000;
startmonth = 11;
endyear = 2010;
endmonth = 10;
start = (startyear-1926.5)*12+startmonth;
stop = (endyear - 1926.5)*12+endmonth;
ff_returns = read.table("F-F_Factors_monthly.txt")
rmrf = ff_returns[start:stop,2]/100
smb = ff_returns[start:stop,3]/100
hml = ff_returns[start:stop,4]/100
rf = ff_returns[start:stop,5]/100
# Load Fund Data
prices <- get.hist.quote("VTI", quote="Adj", start="2000-10-30", retclass="zoo")
prices <- na.locf(prices) # Copy last traded price when NA
# To make monthly returns, keep the last price in each month:
monthly.prices <- aggregate(prices, as.yearmon, tail, 1)
# Convert monthly prices to monthly returns
r <- diff(log(monthly.prices))
r1 <- exp(r)-1
# Now shift out of zoo to become an ordinary matrix --
rj <- coredata(r1)
rj <- rj[1:120]
rjrf <- rj - rf
d <- lm(rjrf ~ rmrf + smb + hml) # FF model estimation.
print(summary(d))
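For readers who want to see the mechanics without R, the following is a minimal Python sketch of the same three steps: turn month-end prices into simple returns, subtract the risk-free rate, and regress the excess return on the three factors. It is an illustration under stated assumptions, not a port of the script: the function names simple_returns and ols are made up for this example, the factor and price series are invented numbers rather than Kenneth French or Yahoo! data, and the fit uses plain normal equations where R's lm uses a QR decomposition. Note that diff(log(p)) followed by exp(...) - 1 in the R script gives the ordinary simple return p[t]/p[t-1] - 1, which is what simple_returns computes directly.

```python
def simple_returns(prices):
    """Month-over-month simple returns, p[t]/p[t-1] - 1."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]

def ols(y, columns):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    n = len(y)
    X = [[1.0] + [col[t] for col in columns] for t in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
         + [sum(X[t][i] * y[t] for t in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Invented monthly factor returns in decimal form (NOT real data)
rmrf = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02]
hml = [-0.01, 0.02, 0.00, 0.01, -0.02, 0.01]
rf = [0.001] * 6

# Build a hypothetical fund with known loadings, then derive its price path
true = [0.0005, 0.95, -0.10, 0.20]  # alpha, market, size, value
excess = [true[0] + true[1] * m + true[2] * s_ + true[3] * h_
          for m, s_, h_ in zip(rmrf, smb, hml)]
prices = [100.0]
for e, f in zip(excess, rf):
    prices.append(prices[-1] * (1.0 + e + f))

rj = simple_returns(prices)               # fund returns
rjrf = [r - f for r, f in zip(rj, rf)]    # excess returns
alpha, beta, s, h = ols(rjrf, [rmrf, smb, hml])
print(round(alpha, 6), round(beta, 4), round(s, 4), round(h, 4))
```

Because the hypothetical fund has no noise, the fit recovers the loadings it was built with, which makes this a convenient check that the return and alignment steps are wired up correctly before real data is used.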
Octave Code for Calculating Factor Loadings and Returns for Fama-French 25 Portfolios
clear all; % clear data from Octave
close all; % close all open plot windows
% Load Fama-French Data
ff_data = load('25_Portfolios_5x5_monthly_2.txt');
% Load FF Factor Mimicking Portfolios
ff_facts = load('F-F_Factors_monthly.txt');
% Starting point changed to January 1932 to avoid missing data
ff_data = ff_data(67:end,:); % start after NAs end
ff_facts = ff_facts(67:end-1,:); % start after NAs end, factors had one extra sample, so used end-1
% Remove date column
r = ff_data(:,2:end);
% Remove date column (the risk-free rate stays in column 4)
ff3f = ff_facts(:,2:end);
% Prompt for User Input to get plotting range
startyear = input('Enter Starting Year between 1932 and 2010: ')
startmonth = input('Enter Starting Month 1-12: ')
endyear = input('Enter Ending Year between 1932 and 2010: ')
endmonth = input('Enter Ending Month 1-12: ')
plottitle = input('Enter Title for Plot: ','s')
% Calculate starting and ending row
start = 12*(startyear - 1932) + startmonth;
endpoint = 12*(endyear-1932) + endmonth;
% Extract Desired Data
r = r(start:endpoint,:);
ff3f = ff3f(start:endpoint,:);
rmrf = ff3f(:,1)/100;
smb = ff3f(:,2)/100;
hml = ff3f(:,3)/100;
rf = ff3f(:,4)/100;
% Run 25 Fama-French Regressions
rx = r./100 - repmat(rf,1,25);
% Run FF regressions on all portfolios
K = 3
T = size(rx,1)
X = [ones(T,1) rmrf hml smb];
b = X\rx;
e = rx-X*b;
sigma = cov(e);
u = rx-X*b;
s2 = (T-1)/(T-K-1)*var(u)'; % this is a vector of the variance of the errors
mx = inv(X'*X);
dmx = diag(mx); % we’re interested in standard errors,
% the diagonals of the covariance matrix of bs
siga = (s2*dmx(1)).^0.5; % std err of alpha
sigb = (s2*dmx(2:end)').^0.5; % s2 is a column vector of 25. dmx(2:end)' is a
% row vector corresponding to factors.
% this produces a matrix the same size as
% the b coefficients.
sig_beta = sigb(:,1);
sig_h = sigb(:,2);
sig_s = sigb(:,3);
R2 = 1-s2./(std(rx).^2)';
% Pull out the regression factors
ff_alpha = b(1,:);
ff_beta = b(2,:);
h = b(3,:);
s = b(4,:);
% Calculate Arithmetic Mean for each of 25 portfolios over range
arithmeans = mean(r);
% Calculate Geometric Mean for each of 25 portfolios over selected range
georeturns = r./100 + 1;
geomeans = 100*(exp(mean(log(georeturns)))-1);
% Select if Geometric or Arithmetic mean is used by adjusting comments
%meanreturns = arithmeans; % uncomment to use arithmetic means
meanreturns = geomeans; % uncomment to use geometric means
% Expand 5x5 data to 10x10 for use in surface plot function
returns = [meanreturns ; meanreturns];
returns = reshape(returns,10,5);
returns = [returns;returns]
returns = reshape(returns,10,10);
% beta can be used for surface plot of beta
beta_ff = [ff_beta;ff_beta]
beta_ff = reshape(beta_ff,10,5);
beta_ff = [beta_ff;beta_ff];
beta_ff = reshape(beta_ff,10,10);
% s; s_ff can be used for surface plot of size factor
s_ff = [s;s]
s_ff = reshape(s_ff,10,5);
s_ff = [s_ff;s_ff];
s_ff = reshape(s_ff,10,10);
% h; h_ff can be used for surface plot of value factor
h_ff = [h;h]
h_ff = reshape(h_ff,10,5);
h_ff = [h_ff;h_ff];
h_ff = reshape(h_ff,10,10);
% Define x and y values
x = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];
y = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];
% Create x-y mesh for surface plot
[xx,yy] = meshgrid(x,y);
% Generate Plot
surf(xx,yy,returns)
xlabel('Size','fontsize',20)
ylabel('Value','fontsize',20)
%zlabel('Arithmetic Average Monthly Return (%)','rotation',90,'fontsize',20)
%zlabel('Geometric Average Monthly Return (%)','rotation',90,'fontsize',20)
title(plottitle,'fontsize',36)
axis([0 5 0 5 min(0,min(meanreturns)-.1) max(2,max(meanreturns)+0.01)])
% Size labels for corner portfolios
line([4.5 4.5],[0.5 0.5],[meanreturns(21) meanreturns(21)+0.1])
text(4.5,0.5,meanreturns(21)+0.15,'LG','horizontalalignment','center','fontsize',18)
line([4.5 4.5],[4.5 4.5],[meanreturns(25) meanreturns(25)+0.1])
text(4.5,4.5,meanreturns(25)+0.15,'LV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[4.5 4.5],[meanreturns(5) meanreturns(5)+0.1])
text(0.5,4.5,meanreturns(5)+0.15,'SV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[0.5 0.5],[meanreturns(1) meanreturns(1)+0.1])
text(0.5,0.5,meanreturns(1)+0.15,'SG','horizontalalignment','center','fontsize',18)
% ETFs
line([4.55 4.55],[1.1 1.1],[meanreturns(22) meanreturns(22)+0.15])
text(4.55,1.1,meanreturns(22)+0.18,'SPY','horizontalalignment','center','fontsize',18)
line([4.75 4.75],[1.25 1.25],[meanreturns(22) meanreturns(22)+0.1])
text(4.75,1.25,meanreturns(22)+0.15,'DIA','horizontalalignment','center','fontsize',18)
line([3.5 3.5],[0.1 0.1],[meanreturns(16) meanreturns(16)+0.1])
text(3.5,0.1,meanreturns(16)+0.15,'QQQQ','horizontalalignment','center','fontsize',18)
line([4.5 4.5],[1.5 1.5],[meanreturns(22) meanreturns(22)+0.18])
text(4.50,1.5,meanreturns(22)+0.2,'IVE*','horizontalalignment','center','fontsize',18)
line([1.6 1.6],[1.7 1.7],[meanreturns(7) meanreturns(7)+0.25])
text(1.6,1.7,meanreturns(7)+0.3,'IWM*','horizontalalignment','center','fontsize',18)
line([1.7 1.7],[3.55 3.55],[meanreturns(9) meanreturns(9)+0.1])
text(1.7,3.55,meanreturns(9)+0.15,'IWN','horizontalalignment','center','fontsize',18)
line([1.65 1.65],[1.85 1.85],[meanreturns(7) meanreturns(7)+0.1])
text(1.65,1.85,meanreturns(7)+0.15,'IJR','horizontalalignment','center','fontsize',18)
line([1.6 1.6],[2.85 2.85],[meanreturns(8) meanreturns(8)+0.1])
text(1.6,2.85,meanreturns(8)+0.15,'IJS','horizontalalignment','center','fontsize',18)
line([2.9 2.9],[4.6 4.6],[meanreturns(15) meanreturns(15)+0.1])
text(2.9,4.6,meanreturns(15)+0.15,'IYR','horizontalalignment','center','fontsize',18)
line([3.4 3.4],[1.25 1.25],[meanreturns(19) meanreturns(19)+0.1])
text(3.4,1.25,meanreturns(19)+0.15,'MDY','horizontalalignment','center','fontsize',18)
% Color range set from 0 to 1.6 rather than allowing autoscale.
% This is done for easier comparison between plots, but colors will
% max out for values above 1.6 or below 0.
% For arithmetic averages, I think a range of 0 to 2 works better
caxis([0 1.6]);
view(50, 25);
% top view
%view(270,90);
replot
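The Octave script lets the plot use either the arithmetic or the geometric mean of the monthly returns, and the two can differ noticeably when returns are volatile. The small Python sketch below applies the same geometric-mean formula as the script, 100*(exp(mean(log(1 + r/100))) - 1), to two made-up monthly returns. The function names are chosen for this example only.

```python
import math

def arithmetic_mean(returns_pct):
    """Plain average of percent returns."""
    return sum(returns_pct) / len(returns_pct)

def geometric_mean(returns_pct):
    """Compound average of percent returns: 100*(exp(mean(log(1 + r/100))) - 1)."""
    logs = [math.log(1.0 + r / 100.0) for r in returns_pct]
    return 100.0 * (math.exp(sum(logs) / len(logs)) - 1.0)

monthly = [10.0, -10.0]  # made-up percent returns
print(arithmetic_mean(monthly))           # 0.0
print(round(geometric_mean(monthly), 4))  # -0.5013
```

A gain of 10% followed by a loss of 10% averages to zero arithmetically but leaves the investor with 99% of the starting value, which the geometric mean reflects.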
Regression Result Details:
ETF Regressions (10-yr Monthly; November 2000 through October 2010):
SPY:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0008725 0.0005321 -1.640 0.104
rmrf 0.9620172 0.0110702 86.901 < 2e-16 ***
smb -0.1292276 0.0200597 -6.442 2.78e-09 ***
hml 0.0120373 0.0157662 0.763 0.447
—
Residual standard error: 0.005643 on 116 degrees of freedom
Multiple R-squared: 0.9862, Adjusted R-squared: 0.9858
F-statistic: 2763 on 3 and 116 DF, p-value: < 2.2e-16
DIA:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009774 0.0013106 0.746 0.45730
rmrf 0.8925274 0.0272691 32.730 < 2e-16 ***
smb -0.2162676 0.0494128 -4.377 2.65e-05 ***
hml 0.1189867 0.0388367 3.064 0.00272 **
—
Residual standard error: 0.0139 on 116 degrees of freedom
Multiple R-squared: 0.9074, Adjusted R-squared: 0.905
F-statistic: 378.7 on 3 and 116 DF, p-value: < 2.2e-16
QQQQ:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0005339 0.0021967 0.243 0.80839
rmrf 1.3260511 0.0457057 29.013 < 2e-16 ***
smb 0.2930439 0.0828208 3.538 0.00058 ***
hml -0.9161538 0.0650942 -14.074 < 2e-16 ***
—
Residual standard error: 0.0233 on 116 degrees of freedom
Multiple R-squared: 0.9196, Adjusted R-squared: 0.9175
F-statistic: 442.3 on 3 and 116 DF, p-value: < 2.2e-16
IVE:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0020639 0.0009668 -2.135 0.0349 *
rmrf 0.9947619 0.0201148 49.454 < 2e-16 ***
smb -0.0528948 0.0364488 -1.451 0.1494
hml 0.2691649 0.0286475 9.396 6.19e-16 ***
—
Residual standard error: 0.01025 on 116 degrees of freedom
Multiple R-squared: 0.9601, Adjusted R-squared: 0.9591
F-statistic: 931.3 on 3 and 116 DF, p-value: < 2.2e-16
IWM:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0019862 0.0008238 -2.411 0.0175 *
rmrf 0.9759772 0.0171401 56.941 < 2e-16 ***
smb 0.8246550 0.0310586 26.552 < 2e-16 ***
hml 0.1899842 0.0244109 7.783 3.25e-12 ***
—
Residual standard error: 0.008737 on 116 degrees of freedom
Multiple R-squared: 0.9804, Adjusted R-squared: 0.9799
F-statistic: 1935 on 3 and 116 DF, p-value: < 2.2e-16
IWN:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001577 0.001138 -1.386 0.169
rmrf 0.876021 0.023686 36.984 <2e-16 ***
smb 0.756015 0.042920 17.614 <2e-16 ***
hml 0.606810 0.033734 17.988 <2e-16 ***
—
Residual standard error: 0.01207 on 116 degrees of freedom
Multiple R-squared: 0.9588, Adjusted R-squared: 0.9577
F-statistic: 899 on 3 and 116 DF, p-value: < 2.2e-16
IJR:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001384 0.001156 -1.197 0.234
rmrf 0.913679 0.024063 37.971 < 2e-16 ***
smb 0.795130 0.043603 18.236 < 2e-16 ***
hml 0.301977 0.034270 8.812 1.43e-14 ***
—
Residual standard error: 0.01227 on 116 degrees of freedom
Multiple R-squared: 0.958, Adjusted R-squared: 0.957
F-statistic: 882.8 on 3 and 116 DF, p-value: < 2.2e-16
IJS:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001711 0.001294 -1.322 0.189
rmrf 0.914892 0.026932 33.971 <2e-16 ***
smb 0.845430 0.048802 17.324 <2e-16 ***
hml 0.483571 0.038356 12.607 <2e-16 ***
—
Residual standard error: 0.01373 on 116 degrees of freedom
Multiple R-squared: 0.9511, Adjusted R-squared: 0.9499
F-statistic: 752.7 on 3 and 116 DF, p-value: < 2.2e-16
IYR:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0007802 0.0039314 0.198 0.8430
rmrf 0.9250019 0.0817982 11.308 < 2e-16 ***
smb 0.4062537 0.1482219 2.741 0.0071 **
hml 0.8909395 0.1164971 7.648 6.54e-12 ***
—
Residual standard error: 0.04169 on 116 degrees of freedom
Multiple R-squared: 0.6629, Adjusted R-squared: 0.6542
F-statistic: 76.05 on 3 and 116 DF, p-value: < 2.2e-16
MDY:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009243 0.0011596 0.797 0.427
rmrf 0.9659343 0.0241262 40.037 < 2e-16 ***
smb 0.3705806 0.0437178 8.477 8.49e-14 ***
hml 0.1572083 0.0343607 4.575 1.20e-05 ***
—
Residual standard error: 0.0123 on 116 degrees of freedom
Multiple R-squared: 0.9501, Adjusted R-squared: 0.9488
F-statistic: 736.8 on 3 and 116 DF, p-value: < 2.2e-16
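As a reading aid for the ETF tables above: each t value is the estimate divided by its standard error, and the degrees of freedom are the number of monthly observations minus the number of estimated coefficients. The Python sketch below redoes that arithmetic for the SPY rows. The inputs are the rounded figures printed in the table, so the results should be expected to match the reported t values only to about two decimal places.

```python
# Values copied from the SPY table: (estimate, std. error, reported t value)
rows = {
    "rmrf": (0.9620172, 0.0110702, 86.901),
    "smb": (-0.1292276, 0.0200597, -6.442),
    "hml": (0.0120373, 0.0157662, 0.763),
}

n_obs = 120     # monthly returns, November 2000 through October 2010
n_params = 4    # intercept, rmrf, smb, hml
df = n_obs - n_params

# Recompute each t value from the printed estimate and standard error
t_values = {name: est / se for name, (est, se, _) in rows.items()}

print("degrees of freedom:", df)
for name, t in t_values.items():
    print(name, round(t, 2), "reported:", rows[name][2])
```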
Fama-French 25 Portfolios – Regression Results (Nov. 2000 – October 2010)
| ALPHA | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | -0.68% | -0.05% | 0.01% | 0.02% | 0.16% |
| 2 | -0.22% | 0.03% | 0.18% | -0.09% | -0.30% |
| 3 | -0.08% | 0.11% | 0.30% | 0.17% | 0.37% |
| 4 | 0.17% | 0.20% | 0.01% | 0.15% | -0.09% |
| Large | 0.04% | 0.12% | -0.11% | -0.27% | -0.18% |
| ALPHA (t-stat) | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | -2.41 | -0.28 | 0.07 | 0.16 | 0.96 |
| 2 | -1.42 | 0.17 | 1.36 | -0.67 | -2.06 |
| 3 | -0.58 | 0.71 | 1.83 | 0.88 | 1.69 |
| 4 | 1.40 | 1.28 | 0.03 | 0.75 | -0.43 |
| Large | 0.41 | 0.86 | -0.69 | -2.06 | -0.68 |
| BETA | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | 1.18 | 0.98 | 0.83 | 0.77 | 0.98 |
| 2 | 1.09 | 0.90 | 0.84 | 0.88 | 1.04 |
| 3 | 1.09 | 0.96 | 0.89 | 0.93 | 0.97 |
| 4 | 1.06 | 0.96 | 1.02 | 0.98 | 1.11 |
| Large | 0.95 | 0.86 | 0.90 | 0.90 | 1.07 |
| BETA (t-stat) | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | 20.003 | 28.263 | 27.927 | 24.682 | 28.526 |
| 2 | 33.474 | 26.726 | 30.813 | 31.303 | 33.793 |
| 3 | 40.148 | 29.188 | 25.975 | 22.508 | 21.314 |
| 4 | 41.506 | 29.263 | 24.257 | 23.706 | 26.411 |
| Large | 52.533 | 30.54 | 28.014 | 33.053 | 19.277 |
| h | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | -0.38 | 0.03 | 0.34 | 0.52 | 0.72 |
| 2 | -0.37 | 0.16 | 0.43 | 0.58 | 0.90 |
| 3 | -0.45 | 0.16 | 0.41 | 0.58 | 0.68 |
| 4 | -0.38 | 0.27 | 0.45 | 0.54 | 0.83 |
| Large | -0.28 | 0.20 | 0.30 | 0.56 | 0.60 |
| h (t-stat) | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | -4.52 | 0.61 | 8.08 | 11.65 | 14.77 |
| 2 | -7.91 | 3.25 | 10.96 | 14.29 | 20.67 |
| 3 | -11.55 | 3.47 | 8.28 | 9.83 | 10.51 |
| 4 | -10.31 | 5.78 | 7.53 | 9.17 | 13.87 |
| Large | -10.96 | 5.04 | 6.59 | 14.36 | 7.59 |
| s | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | 1.18 | 1.07 | 0.94 | 0.98 | 1.06 |
| 2 | 1.01 | 0.98 | 0.92 | 0.89 | 1.12 |
| 3 | 0.74 | 0.57 | 0.47 | 0.43 | 0.66 |
| 4 | 0.45 | 0.32 | 0.28 | 0.28 | 0.19 |
| Large | -0.19 | -0.12 | -0.02 | -0.10 | -0.23 |
| s (t-stat) | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | 11.11 | 17.09 | 17.53 | 17.47 | 17.08 |
| 2 | 17.15 | 16.12 | 18.43 | 17.47 | 20.16 |
| 3 | 14.96 | 9.57 | 7.61 | 5.76 | 7.98 |
| 4 | 9.63 | 5.37 | 3.66 | 3.80 | 2.49 |
| Large | -5.95 | -2.40 | -0.32 | -2.08 | -2.29 |
| R2 | Growth | 2 | 3 | 4 | Value |
|---|---|---|---|---|---|
| Small | 0.87 | 0.95 | 0.96 | 0.96 | 0.96 |
| 2 | 0.93 | 0.93 | 0.92 | 0.91 | 0.90 |
| 3 | 0.93 | 0.94 | 0.90 | 0.87 | 0.88 |
| 4 | 0.93 | 0.95 | 0.87 | 0.87 | 0.92 |
| Large | 0.94 | 0.96 | 0.87 | 0.89 | 0.79 |

Can you calculate the factor loadings of RSP? I am curious about its outperformance despite holding the same stocks as SPY.
Hello. I am currently studying the Fama-French model as part of my dissertation.
I am getting great benefit from the information on this website.
However, I have run into some difficulty with the above Octave code.
In line 38, the dimensions of r are different from the dimensions of repmat(rf, 1, 25).
The data for matrix r comes from the 4 columns of the factor file, while the data for rf comes from the 5 portfolios and their intersection to generate 25 portfolios.
How can I resolve this issue so that the dimensions of r are the same as those of repmat(rf,1,25) and the code can work?
Thank you very much,
Patrick O’Rourke