Calculating the Fama-French Factor Loadings

Note: This page contains the data source links and source code used in my “Fama-French Factor Loadings for Popular ETFs” post and my “Fundamental Indexing: Up and Running for 5 Years” post.

If you are looking for a detailed tutorial on how to run the Fama-French regressions using R, then check out my screencast here.

Data:

The data for the Fama-French factors and the Fama-French 25 Portfolios comes from the Kenneth French website. I removed the header information from these files, and I removed the extra data (everything except the monthly value-weighted returns) from the text file for the Fama-French 25 Portfolios. The data for the ETFs analyzed is downloaded automatically from Yahoo! Finance.

Code:

The Fama-French factor loadings for the ETFs were calculated using the R script shown here. This script can be used to calculate the Fama-French factor loadings for any mutual fund or stock by changing the ticker symbol used in the script. Also, the date range can be modified by adjusting startyear, startmonth, endyear, and endmonth near the top of the script, and by modifying the start date passed to get.hist.quote in the Yahoo! Finance query.
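The row arithmetic the script uses to pick its date range, (year - 1926.5)*12 + month, can be checked by hand. Here is a minimal sketch in Python (not part of the original scripts), assuming the first data row of the header-stripped factor file is July 1926, which is what the 1926.5 offset implies. The function name ff_row is made up for illustration.

```python
def ff_row(year, month):
    """1-based row of (year, month) in the header-stripped factor file.

    Assumes row 1 is July 1926, which is equivalent to the
    (year - 1926.5) * 12 + month expression in the R script.
    """
    return (year - 1926) * 12 + month - 6

start = ff_row(2000, 11)      # first month of the regression window
stop = ff_row(2010, 10)       # last month of the regression window
n_months = stop - start + 1   # number of monthly observations
print(start, stop, n_months)
```

For the November 2000 to October 2010 window this gives 120 rows, which is the number of monthly returns the regression needs.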

It is useful to run the regression for a broad ETF or fund (such as SPY) after changing the date range. If the dates are aligned correctly, the R-squared should be close to 0.99 for a broad index. Also, if the date specified in the Yahoo! query is prior to the earliest date available then the regression results will be misaligned, and this will show up as an error in the output or a very low R-squared.
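The alignment condition can also be checked before running the regression. The sketch below is in Python and is only an illustration: the helper names are hypothetical, and it assumes that returns are computed by differencing month-end prices, so the first return belongs to the month after the first available price.

```python
def months_between(first, second):
    """Whole months from (year, month) `first` to (year, month) `second`."""
    return (second[0] - first[0]) * 12 + (second[1] - first[1])

def returns_aligned(first_price_month, first_factor_month):
    """True when the first computed return lines up with the first factor row.

    Differencing month-end prices drops the first month, so the first
    return belongs to the month after the first available price.
    """
    return months_between(first_price_month, first_factor_month) == 1

# Prices available from October 2000, factors from November 2000: aligned
print(returns_aligned((2000, 10), (2000, 11)))
# Hypothetical fund whose price history only starts in May 2001: misaligned
print(returns_aligned((2001, 5), (2000, 11)))
```

If the check fails, either move the factor window forward or pick a fund with a longer price history.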

A second script, written in Octave, is also listed below. This script is used to calculate the factor loadings and historical returns of the Fama-French 25 portfolios. This data is useful for comparing to the ETF regression results.
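Both scripts estimate the loadings by ordinary least squares (lm in R, the backslash operator in Octave). As a rough illustration of what that fit does, here is a small sketch in plain Python that solves the normal equations directly, which is a simpler route than the solvers those tools use. All numbers are made up, and the function names solve and ols are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols(y, factors):
    """Regress y on an intercept plus the given factor columns."""
    rows = [[1.0] + list(f) for f in zip(*factors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up monthly factor returns in decimals, for illustration only
rmrf = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02]
hml = [-0.01, 0.02, 0.00, 0.01, -0.02, 0.01]

# Excess returns built from known loadings so the fit can be checked
known = [0.001, 0.9, 0.3, -0.2]  # alpha, beta, s, h
rjrf = [known[0] + known[1] * m + known[2] * s + known[3] * h
        for m, s, h in zip(rmrf, smb, hml)]

alpha, beta, s_load, h_load = ols(rjrf, [rmrf, smb, hml])
print(alpha, beta, s_load, h_load)
```

Because the excess returns here are an exact linear combination of the factors, the fit recovers the known loadings up to rounding error. Real fund returns will leave residuals.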

As a supplement to the data provided in the post, I’ve also included the detailed regression results at the bottom of this post.

R-Code

# Goal: Using data from Yahoo finance, estimate the Fama-French factor loadings
# for any security using monthly returns

library(tseries)

# Load FF factor returns
startyear = 2000;
startmonth = 11;
endyear = 2010;
endmonth = 10;

start = (startyear-1926.5)*12+startmonth;
stop = (endyear - 1926.5)*12+endmonth;

ff_returns = read.table("F-F_Factors_monthly.txt")
rmrf = ff_returns[start:stop,2]/100
smb = ff_returns[start:stop,3]/100
hml = ff_returns[start:stop,4]/100
rf = ff_returns[start:stop,5]/100

# Load Fund Data
prices <- get.hist.quote("VTI", quote="Adj", start="2000-10-30", retclass="zoo")
prices <- na.locf(prices)               # Copy last traded price when NA

# To get month-end prices, keep the last observation in each month:
monthly.prices <- aggregate(prices, as.yearmon, tail, 1)

# Convert monthly prices to monthly returns
r <- diff(log(monthly.prices))
r1 <- exp(r)-1

# Now shift out of zoo to become an ordinary matrix --
rj <- coredata(r1)
rj <- rj[1:length(rf)]                  # Keep as many months as the factor data (120 here)
rjrf <- rj - rf

d <- lm(rjrf ~ rmrf + smb + hml)               # FF model estimation.
print(summary(d))
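The two-step return calculation in the script above (log differences, then exp(r) - 1) gives the same numbers as the simple return p_t / p_{t-1} - 1. A small check in Python, using made-up month-end prices:

```python
import math

# Made-up month-end prices, for illustration only
prices = [100.0, 102.0, 99.96, 104.958]

# Route taken by the R script: log differences, then back to simple returns
log_r = [math.log(b / a) for a, b in zip(prices, prices[1:])]
simple_from_log = [math.exp(r) - 1 for r in log_r]

# Direct simple returns
simple_direct = [b / a - 1 for a, b in zip(prices, prices[1:])]

for via_log, direct in zip(simple_from_log, simple_direct):
    print(round(via_log, 6), round(direct, 6))
```

Note that n prices give n - 1 returns, which is why the price download has to start one month before the first factor month.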

Octave Code for Calculating Factor Loadings and Returns for Fama-French 25 Portfolios

clear all; % clear data from Octave
close all; % close all open plot windows

% Load Fama-French Data
ff_data = load('25_Portfolios_5x5_monthly_2.txt');
% Load FF Factor Mimicking Portfolios
ff_facts = load('F-F_Factors_monthly.txt');

% Starting point changed to January 1932 to avoid missing data
ff_data = ff_data(67:end,:);   % start after NAs end
ff_facts = ff_facts(67:end-1,:); % start after NAs end, factors had one extra sample, so used end-1

% Remove date column
r = ff_data(:,2:end);
% Remove date column (keeps Mkt-RF, SMB, HML and RF)
ff3f = ff_facts(:,2:end);

% Prompt for User Input to get plotting range
startyear = input('Enter Starting Year between 1932 and 2010: ')
startmonth = input('Enter Starting Month 1-12: ')
endyear = input('Enter Ending Year between 1932 and 2010: ')
endmonth = input('Enter Ending Month 1-12: ')
plottitle = input('Enter Title for Plot: ','s')

% Calculate starting and ending row
start = 12*(startyear - 1932) + startmonth;
endpoint = 12*(endyear-1932) + endmonth;

% Extract Desired Data
r = r(start:endpoint,:);
ff3f = ff3f(start:endpoint,:);
rmrf = ff3f(:,1)/100;
smb = ff3f(:,2)/100;
hml = ff3f(:,3)/100;
rf = ff3f(:,4)/100;

% Excess returns: portfolio returns (percent) to decimals, minus the risk-free rate
rx = r./100 - repmat(rf,1,25);

% Run FF regressions on all portfolios
K = 3
T = size(rx,1)
X = [ones(T,1) rmrf hml smb];
b = X\rx;
e = rx-X*b;
sigma = cov(e);
u = rx-X*b;
s2 = (T-1)/(T-K-1)*var(u)';   % this is a vector of the variance of the errors

mx = inv(X'*X);
dmx = diag(mx); % we're interested in standard errors, which come from
                % the diagonals of the covariance matrix of bs
siga = (s2*dmx(1)).^0.5;      % std err of alpha
sigb = (s2*dmx(2:end)').^0.5; % std err of beta, h and s.
                              % s2 is a column vector of 25. dmx' is a
                              % row vector corresponding to factors.
                              % this produces a 25x3 matrix, one row
                              % per portfolio and one column per factor.
sig_beta = sigb(:,1);
sig_h = sigb(:,2);
sig_s = sigb(:,3);

R2 = 1-s2./(std(rx).^2)';   % degrees-of-freedom adjusted R-squared

% Pull out the regression factors
ff_alpha = b(1,:);
ff_beta = b(2,:);
h = b(3,:);
s = b(4,:);

% Calculate Arithmetic Mean for each of 25 portfolios over range
arithmeans = mean(r);

% Calculate Geometric Mean for each of 25 portfolios over selected range
georeturns = r./100 + 1;
geomeans = 100*(exp(mean(log(georeturns)))-1);

% Select whether the geometric or arithmetic mean is used by adjusting comments
%meanreturns = arithmeans;  % uncomment to use arithmetic means
meanreturns = geomeans;   % geometric means (default); comment out when using arithmetic means

% Expand 5x5 data to 10x10 for use in surface plot function
returns = [meanreturns ; meanreturns];
returns = reshape(returns,10,5);
returns = [returns;returns]
returns = reshape(returns,10,10);

% beta can be used for surface plot of beta
beta_ff = [ff_beta;ff_beta]
beta_ff = reshape(beta_ff,10,5);
beta_ff = [beta_ff;beta_ff];
beta_ff = reshape(beta_ff,10,10);

% s; s_ff can be used for surface plot of size factor
s_ff = [s;s]
s_ff = reshape(s_ff,10,5);
s_ff = [s_ff;s_ff];
s_ff = reshape(s_ff,10,10);

% h; h_ff can be used for surface plot of value factor
h_ff = [h;h]
h_ff = reshape(h_ff,10,5);
h_ff = [h_ff;h_ff];
h_ff = reshape(h_ff,10,10);

% Define x and y values
x = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];
y = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];

% Create x-y mesh for surface plot
[xx,yy] = meshgrid(x,y);

% Generate Plot
surf(xx,yy,returns)
xlabel('Size','fontsize',20)
ylabel('Value','fontsize',20)
%zlabel('Arithmetic Average Monthly Return (%)','rotation',90,'fontsize',20)
%zlabel('Geometric Average Monthly Return (%)','rotation',90,'fontsize',20)
title(plottitle,'fontsize',36)
axis([0 5 0 5 min(0,min(meanreturns)-.1) max(2,max(meanreturns)+0.01)])

% Size Labels for corner portfolios
line([4.5 4.5],[0.5 0.5],[meanreturns(21) meanreturns(21)+0.1])
text(4.5,0.5,meanreturns(21)+0.15,'LG','horizontalalignment','center','fontsize',18)
line([4.5 4.5],[4.5 4.5],[meanreturns(25) meanreturns(25)+0.1])
text(4.5,4.5,meanreturns(25)+0.15,'LV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[4.5 4.5],[meanreturns(5) meanreturns(5)+0.1])
text(0.5,4.5,meanreturns(5)+0.15,'SV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[0.5 0.5],[meanreturns(1) meanreturns(1)+0.1])
text(0.5,0.5,meanreturns(1)+0.15,'SG','horizontalalignment','center','fontsize',18)

% ETFs
line([4.55 4.55],[1.1 1.1],[meanreturns(22) meanreturns(22)+0.15])
text(4.55,1.1,meanreturns(22)+0.18,'SPY','horizontalalignment','center','fontsize',18)

line([4.75 4.75],[1.25 1.25],[meanreturns(22) meanreturns(22)+0.1])
text(4.75,1.25,meanreturns(22)+0.15,'DIA','horizontalalignment','center','fontsize',18)

line([3.5 3.5],[0.1 0.1],[meanreturns(16) meanreturns(16)+0.1])
text(3.5,0.1,meanreturns(16)+0.15,'QQQQ','horizontalalignment','center','fontsize',18)

line([4.5 4.5],[1.5 1.5],[meanreturns(22) meanreturns(22)+0.18])
text(4.50,1.5,meanreturns(22)+0.2,'IVE*','horizontalalignment','center','fontsize',18)

line([1.6 1.6],[1.7 1.7],[meanreturns(7) meanreturns(7)+0.25])
text(1.6,1.7,meanreturns(7)+0.3,'IWM*','horizontalalignment','center','fontsize',18)

line([1.7 1.7],[3.55 3.55],[meanreturns(9) meanreturns(9)+0.1])
text(1.7,3.55,meanreturns(9)+0.15,'IWN','horizontalalignment','center','fontsize',18)

line([1.65 1.65],[1.85 1.85],[meanreturns(7) meanreturns(7)+0.1])
text(1.65,1.85,meanreturns(7)+0.15,'IJR','horizontalalignment','center','fontsize',18)

line([1.6 1.6],[2.85 2.85],[meanreturns(8) meanreturns(8)+0.1])
text(1.6,2.85,meanreturns(8)+0.15,'IJS','horizontalalignment','center','fontsize',18)

line([2.9 2.9],[4.6 4.6],[meanreturns(15) meanreturns(15)+0.1])
text(2.9,4.6,meanreturns(15)+0.15,'IYR','horizontalalignment','center','fontsize',18)

line([3.4 3.4],[1.25 1.25],[meanreturns(19) meanreturns(19)+0.1])
text(3.4,1.25,meanreturns(19)+0.15,'MDY','horizontalalignment','center','fontsize',18)

% Color range set from 0 to 1.6 rather than allowing autoscale.
% This is done for easier comparison between plots, but colors will
% max out for values above 1.6 or below 0.
% For arithmetic averages, I think a range of 0 to 2 works better
caxis([0 1.6]);
view(50, 25);
% top view
%view(270,90);
drawnow  % redraw the figure (older Octave versions used replot here)
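The geometric mean in the script above is computed as exp(mean(log(1 + r/100))) - 1, scaled back to percent. The sketch below repeats that calculation in Python on a few made-up monthly returns, to show how it differs from the arithmetic mean:

```python
import math

# Made-up monthly returns in percent, for illustration only
monthly_pct = [2.0, -1.0, 3.0, 0.5]

# Arithmetic mean of the monthly returns
arith = sum(monthly_pct) / len(monthly_pct)

# Geometric mean: average the log growth factors, then convert back
growth = [1 + r / 100 for r in monthly_pct]
geo = 100 * (math.exp(sum(math.log(g) for g in growth) / len(growth)) - 1)

print(arith, geo)
```

The geometric mean is the constant monthly return that compounds to the same ending value, so it is never larger than the arithmetic mean. That is why the color range suggested for arithmetic averages is wider.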

Regression Result Details:

ETF Regressions (10-yr Monthly; November 2000 through October 2010):

SPY:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0008725  0.0005321  -1.640    0.104
rmrf         0.9620172  0.0110702  86.901  < 2e-16 ***
smb         -0.1292276  0.0200597  -6.442 2.78e-09 ***
hml          0.0120373  0.0157662   0.763    0.447
---
Residual standard error: 0.005643 on 116 degrees of freedom
Multiple R-squared: 0.9862, Adjusted R-squared: 0.9858
F-statistic: 2763 on 3 and 116 DF, p-value: < 2.2e-16

DIA:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0009774  0.0013106   0.746  0.45730
rmrf         0.8925274  0.0272691  32.730  < 2e-16 ***
smb         -0.2162676  0.0494128  -4.377 2.65e-05 ***
hml          0.1189867  0.0388367   3.064  0.00272 **
---
Residual standard error: 0.0139 on 116 degrees of freedom
Multiple R-squared: 0.9074, Adjusted R-squared: 0.905
F-statistic: 378.7 on 3 and 116 DF, p-value: < 2.2e-16

QQQQ:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0005339  0.0021967   0.243  0.80839
rmrf         1.3260511  0.0457057  29.013  < 2e-16 ***
smb          0.2930439  0.0828208   3.538  0.00058 ***
hml         -0.9161538  0.0650942 -14.074  < 2e-16 ***
---
Residual standard error: 0.0233 on 116 degrees of freedom
Multiple R-squared: 0.9196, Adjusted R-squared: 0.9175
F-statistic: 442.3 on 3 and 116 DF, p-value: < 2.2e-16

IVE:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0020639  0.0009668  -2.135   0.0349 *
rmrf         0.9947619  0.0201148  49.454  < 2e-16 ***
smb         -0.0528948  0.0364488  -1.451   0.1494
hml          0.2691649  0.0286475   9.396 6.19e-16 ***
---
Residual standard error: 0.01025 on 116 degrees of freedom
Multiple R-squared: 0.9601, Adjusted R-squared: 0.9591
F-statistic: 931.3 on 3 and 116 DF, p-value: < 2.2e-16

IWM:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0019862  0.0008238  -2.411   0.0175 *
rmrf         0.9759772  0.0171401  56.941  < 2e-16 ***
smb          0.8246550  0.0310586  26.552  < 2e-16 ***
hml          0.1899842  0.0244109   7.783 3.25e-12 ***
---
Residual standard error: 0.008737 on 116 degrees of freedom
Multiple R-squared: 0.9804, Adjusted R-squared: 0.9799
F-statistic: 1935 on 3 and 116 DF, p-value: < 2.2e-16

IWN:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001577   0.001138  -1.386    0.169
rmrf         0.876021   0.023686  36.984   <2e-16 ***
smb          0.756015   0.042920  17.614   <2e-16 ***
hml          0.606810   0.033734  17.988   <2e-16 ***
---
Residual standard error: 0.01207 on 116 degrees of freedom
Multiple R-squared: 0.9588, Adjusted R-squared: 0.9577
F-statistic: 899 on 3 and 116 DF, p-value: < 2.2e-16

IJR:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001384   0.001156  -1.197    0.234
rmrf         0.913679   0.024063  37.971  < 2e-16 ***
smb          0.795130   0.043603  18.236  < 2e-16 ***
hml          0.301977   0.034270   8.812 1.43e-14 ***
---
Residual standard error: 0.01227 on 116 degrees of freedom
Multiple R-squared: 0.958, Adjusted R-squared: 0.957
F-statistic: 882.8 on 3 and 116 DF, p-value: < 2.2e-16

IJS:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001711   0.001294  -1.322    0.189
rmrf         0.914892   0.026932  33.971   <2e-16 ***
smb          0.845430   0.048802  17.324   <2e-16 ***
hml          0.483571   0.038356  12.607   <2e-16 ***
---
Residual standard error: 0.01373 on 116 degrees of freedom
Multiple R-squared: 0.9511, Adjusted R-squared: 0.9499
F-statistic: 752.7 on 3 and 116 DF, p-value: < 2.2e-16

IYR:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0007802  0.0039314   0.198   0.8430
rmrf         0.9250019  0.0817982  11.308  < 2e-16 ***
smb          0.4062537  0.1482219   2.741   0.0071 **
hml          0.8909395  0.1164971   7.648 6.54e-12 ***
---
Residual standard error: 0.04169 on 116 degrees of freedom
Multiple R-squared: 0.6629, Adjusted R-squared: 0.6542
F-statistic: 76.05 on 3 and 116 DF, p-value: < 2.2e-16

MDY:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0009243  0.0011596   0.797    0.427
rmrf         0.9659343  0.0241262  40.037  < 2e-16 ***
smb          0.3705806  0.0437178   8.477 8.49e-14 ***
hml          0.1572083  0.0343607   4.575 1.20e-05 ***
---
Residual standard error: 0.0123 on 116 degrees of freedom
Multiple R-squared: 0.9501, Adjusted R-squared: 0.9488
F-statistic: 736.8 on 3 and 116 DF, p-value: < 2.2e-16
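When reading the ETF results above, it helps to remember that each t value is simply the estimate divided by its standard error, and that the 116 degrees of freedom are the 120 monthly observations minus the four estimated coefficients. A short Python check using the SPY figures reported above (the variable names are my own):

```python
# SPY coefficients as reported above: (estimate, standard error, reported t value)
spy = {
    "(Intercept)": (-0.0008725, 0.0005321, -1.640),
    "rmrf": (0.9620172, 0.0110702, 86.901),
    "smb": (-0.1292276, 0.0200597, -6.442),
    "hml": (0.0120373, 0.0157662, 0.763),
}

# t value = estimate / standard error
t_values = {name: est / se for name, (est, se, _) in spy.items()}
for name, t in t_values.items():
    print(name, round(t, 3), spy[name][2])

# Residual degrees of freedom: observations minus estimated coefficients
n_obs = 120
dof = n_obs - len(spy)
print(dof)
```

Small differences in the last digit can appear because the printed estimates and standard errors are themselves rounded.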

Fama-French 25 Portfolios – Regression Results (Nov. 2000 – October 2010)



	

ALPHA	Growth	2	3	4	Value
Small	-0.68%	-0.05%	0.01%	0.02%	0.16%
2	-0.22%	0.03%	0.18%	-0.09%	-0.30%
3	-0.08%	0.11%	0.30%	0.17%	0.37%
4	0.17%	0.20%	0.01%	0.15%	-0.09%
Large	0.04%	0.12%	-0.11%	-0.27%	-0.18%

ALPHA (t-stat)	Growth	2	3	4	Value
Small	-2.41	-0.28	0.07	0.16	0.96
2	-1.42	0.17	1.36	-0.67	-2.06
3	-0.58	0.71	1.83	0.88	1.69
4	1.40	1.28	0.03	0.75	-0.43
Large	0.41	0.86	-0.69	-2.06	-0.68

BETA	Growth	2	3	4	Value
Small	1.18	0.98	0.83	0.77	0.98
2	1.09	0.90	0.84	0.88	1.04
3	1.09	0.96	0.89	0.93	0.97
4	1.06	0.96	1.02	0.98	1.11
Large	0.95	0.86	0.90	0.90	1.07

BETA (t-stat)	Growth	2	3	4	Value
Small	20.003	28.263	27.927	24.682	28.526
2	33.474	26.726	30.813	31.303	33.793
3	40.148	29.188	25.975	22.508	21.314
4	41.506	29.263	24.257	23.706	26.411
Large	52.533	30.54	28.014	33.053	19.277

h	Growth	2	3	4	Value
Small	-0.38	0.03	0.34	0.52	0.72
2	-0.37	0.16	0.43	0.58	0.90
3	-0.45	0.16	0.41	0.58	0.68
4	-0.38	0.27	0.45	0.54	0.83
Large	-0.28	0.20	0.30	0.56	0.60

h (t-stat)	Growth	2	3	4	Value
Small	-4.52	0.61	8.08	11.65	14.77
2	-7.91	3.25	10.96	14.29	20.67
3	-11.55	3.47	8.28	9.83	10.51
4	-10.31	5.78	7.53	9.17	13.87
Large	-10.96	5.04	6.59	14.36	7.59

s	Growth	2	3	4	Value
Small	1.18	1.07	0.94	0.98	1.06
2	1.01	0.98	0.92	0.89	1.12
3	0.74	0.57	0.47	0.43	0.66
4	0.45	0.32	0.28	0.28	0.19
Large	-0.19	-0.12	-0.02	-0.10	-0.23

s (t-stat)	Growth	2	3	4	Value
Small	11.11	17.09	17.53	17.47	17.08
2	17.15	16.12	18.43	17.47	20.16
3	14.96	9.57	7.61	5.76	7.98
4	9.63	5.37	3.66	3.80	2.49
Large	-5.95	-2.40	-0.32	-2.08	-2.29

R2	Growth	2	3	4	Value
Small	0.87	0.95	0.96	0.96	0.96
2	0.93	0.93	0.92	0.91	0.90
3	0.93	0.94	0.90	0.87	0.88
4	0.93	0.95	0.87	0.87	0.92
Large	0.94	0.96	0.87	0.89	0.79

Posted by calcinv at 1:04 pm

2 Responses to “Calculating the Fama-French Factor Loadings”

The Investment Scientist says:

November 20, 2013 at 9:08 am

Can you calculate the factor loadings of RSP? I am curious about its outperformance despite holding the same stocks as SPY.

Patrick O'Rourke says:

October 17, 2015 at 9:25 am

Hello. I am currently studying the Fama-French model as part of my dissertation,
and I am getting great benefit from the information on this website.
However, I have run into some difficulty with the above Octave code.

In line 38, the dimensions of r are different from the dimensions of repmat(rf, 1, 25).
The data for matrix r comes from the 4 columns of the factor file, while the data for rf comes from the data of 5 portfolios and their intersection to generate 25 portfolios.

How can I resolve this issue so the dimensions of r are the same as repmat(rf,1,25) and the code can work?

Thank you very much,

Patrick O’Rourke
