Calculating the Fama-French Factor Loadings

Note: This page contains the data source links and source code used in my “Fama-French Factor Loadings for Popular ETFs” post and my “Fundamental Indexing: Up and Running for 5 Years” post. 

If you are looking for a detailed tutorial on how to run the Fama-French regressions using R, then check out my screencast here.

Data: 

The data for the Fama-French factors and the Fama-French 25 Portfolios comes from the Kenneth French website.  I removed the header information from these files, and I removed the extra data (everything except the monthly value-weighted returns) from the text file for the Fama-French 25 Portfolios. The data for the ETFs analyzed is downloaded automatically from Yahoo! Finance.

Code:

The Fama-French factor loadings for the ETFs were calculated using the R script shown below.  This script can be used to calculate the Fama-French factor loadings for any mutual fund or stock by changing the ticker symbol used in the script.  The date range can be modified by adjusting startyear, startmonth, endyear, and endmonth near the top of the script, and by changing the start date in the get.hist.quote() call that queries Yahoo! Finance. The hard-coded 120 in rj[1:120] is the number of months in the sample, so it must be changed to match whenever the length of the date range changes.
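The row arithmetic in the R script depends on the factor file having one row per month with the first row being July 1926 (that is where the 1926.5 constant comes from). As a rough Python sketch of the same index calculation, where the function name ff_row_index is hypothetical and the July 1926 start is an assumption inferred from the script:

```python
def ff_row_index(year, month, first_year=1926, first_month=7):
    """Return the 1-based row of (year, month) in a monthly file
    whose first row is (first_year, first_month)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    index = (year - first_year) * 12 + (month - first_month) + 1
    if index < 1:
        raise ValueError("date is before the first row of the file")
    return index


# Same range as the R script: November 2000 through October 2010
start = ff_row_index(2000, 11)
stop = ff_row_index(2010, 10)
n_months = stop - start + 1
print(start, stop, n_months)  # prints "893 1012 120"
```

The resulting month count is the number that the fund return vector has to match before the regression is run.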

It is useful to run the regression for a broad ETF or fund (such as SPY) after changing the date range. If the dates are aligned correctly, the R-squared should be close to 0.99 for a broad index. Also, if the date specified in the Yahoo! query is prior to the earliest date available then the regression results will be misaligned, and this will show up as an error in the output or a very low R-squared.
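One way to guard against this kind of misalignment is to pair the fund returns and the factor rows by calendar month instead of by position. The sketch below is not what the R script does (the script lines the two series up by row position); the helper names and the numbers are made up for illustration only:

```python
def month_range(start, stop):
    """Yield (year, month) tuples from start to stop, inclusive."""
    year, month = start
    while (year, month) <= stop:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def align_by_month(fund_returns, factor_rows):
    """Pair fund returns with factor rows on their shared (year, month) keys."""
    common = sorted(set(fund_returns) & set(factor_rows))
    return [(key, fund_returns[key], factor_rows[key]) for key in common]


def missing_months(fund_returns, start, stop):
    """List the requested months for which the fund has no return."""
    return [key for key in month_range(start, stop) if key not in fund_returns]


# Made-up illustrative numbers, not real fund or factor data.
fund_returns = {(2001, 6): 0.010, (2001, 7): -0.020, (2001, 8): 0.030}
factor_rows = {
    (2001, 5): (0.012, 0.003, -0.001, 0.003),  # (rmrf, smb, hml, rf)
    (2001, 6): (-0.019, 0.060, -0.011, 0.003),
    (2001, 7): (-0.021, -0.043, 0.056, 0.003),
    (2001, 8): (-0.064, 0.025, 0.033, 0.003),
}

aligned = align_by_month(fund_returns, factor_rows)
gaps = missing_months(fund_returns, (2001, 5), (2001, 8))
print(len(aligned), gaps)
```

Here the fund history starts a month later than the requested range, so the gap list is non-empty and the problem is visible before any regression is run.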

A second script, written in Octave, is also listed below. This script is used to calculate the factor loadings and historical returns of the Fama-French 25 portfolios. This data is useful for comparing to the ETF regression results.
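For readers who want to see what the regression step amounts to, here is a small Python sketch of a three-factor least-squares fit. It solves the normal equations with Gaussian elimination, which is a simplification of my own choosing and not what Octave's backslash operator does internally. The function names and the toy data are hypothetical; the returns are constructed from known coefficients so the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix so row operations apply to both sides
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, regressors):
    """Regress y on an intercept plus the given regressor columns."""
    t = len(y)
    x = [[1.0] + [col[i] for col in regressors] for i in range(t)]
    k = len(x[0])
    xtx = [[sum(x[i][p] * x[i][q] for i in range(t)) for q in range(k)]
           for p in range(k)]
    xty = [sum(x[i][p] * y[i] for i in range(t)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# Toy factor series (made up) and excess returns built from known loadings
rmrf = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02]
hml = [0.00, 0.01, 0.01, -0.01, 0.02, -0.01]
true_coefs = [0.001, 1.0, 0.5, -0.3]  # alpha, beta, s, h
excess = [true_coefs[0] + true_coefs[1] * m + true_coefs[2] * s + true_coefs[3] * h
          for m, s, h in zip(rmrf, smb, hml)]

alpha, beta, s_loading, h_loading = ols(excess, [rmrf, smb, hml])
print(alpha, beta, s_loading, h_loading)
```

Because the toy returns contain no noise, the fitted values should reproduce the loadings used to build them, up to floating-point error.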

As a supplement to the data provided in the post, I’ve also included the detailed regression results at the bottom of this post.

R-Code

# Goal: Using data from Yahoo! Finance, estimate the Fama-French factor loadings
# for any security using monthly returns

library(tseries)

# Load FF factor returns
startyear = 2000;
startmonth = 11;
endyear = 2010;
endmonth = 10;

start = (startyear-1926.5)*12+startmonth;
stop = (endyear - 1926.5)*12+endmonth;

ff_returns = read.table("F-F_Factors_monthly.txt")
rmrf = ff_returns[start:stop,2]/100
smb = ff_returns[start:stop,3]/100
hml = ff_returns[start:stop,4]/100
rf = ff_returns[start:stop,5]/100

# Load Fund Data
prices <- get.hist.quote("VTI", quote="Adj", start="2000-10-30", retclass="zoo")
prices <- na.locf(prices)               # Copy last traded price when NA

# Keep the last price in each calendar month to get month-end prices
monthly.prices <- aggregate(prices, as.yearmon, tail, 1)

# Convert monthly prices to monthly returns
r <- diff(log(monthly.prices))
r1 <- exp(r)-1

# Now shift out of zoo to become an ordinary matrix --
rj <- coredata(r1)
rj <- rj[1:120]
rjrf <- rj - rf

d <- lm(rjrf ~ rmrf + smb + hml)               # FF model estimation.
print(summary(d))
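
The return conversion in the R script above (diff(log(...)) followed by exp(r)-1) is just the simple month-over-month return, and it always yields one fewer return than there are prices. That is why the Yahoo! query starts in the month before the first regression month. A small Python sketch of the equivalence, using made-up prices and hypothetical function names:

```python
import math


def simple_returns(prices):
    """Simple returns p_t / p_(t-1) - 1 for consecutive prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def simple_returns_via_logs(prices):
    """Same returns computed the way the R script does: log differences, then exp - 1."""
    log_diffs = [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]
    return [math.exp(d) - 1.0 for d in log_diffs]


# Made-up month-end prices, for illustration only
month_end_prices = [100.0, 105.0, 99.75, 101.745]

direct = simple_returns(month_end_prices)
via_logs = simple_returns_via_logs(month_end_prices)
print(len(month_end_prices), len(direct))
print(direct)
print(via_logs)
```

With 121 month-end prices (October 2000 through October 2010) this gives the 120 monthly returns used in the regression.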

Octave Code for Calculating Factor Loadings and Returns for Fama-French 25 Portfolios

clear all; % clear data from Octave
close all; % close all open plot windows

% Load Fama-French Data
ff_data = load('25_Portfolios_5x5_monthly_2.txt');
% Load FF Factor Mimicking Portfolios
ff_facts = load('F-F_Factors_monthly.txt');

% Starting point changed to January 1932 to avoid missing data
ff_data = ff_data(67:end,:);   % start after NAs end
ff_facts = ff_facts(67:end-1,:); % start after NAs end, factors had one extra sample, so used end-1

% Remove date column
r = ff_data(:,2:end);
% Remove date column; this keeps Mkt-RF, SMB, HML, and RF
ff3f = ff_facts(:,2:end);

% Prompt for User Input to get plotting range
startyear = input('Enter Starting Year between 1932 and 2010: ')
startmonth = input('Enter Starting Month 1-12: ')
endyear = input('Enter Ending Year between 1932 and 2010: ')
endmonth = input('Enter Ending Month 1-12: ')
plottitle = input('Enter Title for Plot: ','s')

% Calculate starting and ending row
start = 12*(startyear - 1932) + startmonth;
endpoint = 12*(endyear-1932) + endmonth;

% Extract Desired Data
r = r(start:endpoint,:);
ff3f = ff3f(start:endpoint,:);
rmrf = ff3f(:,1)/100;
smb = ff3f(:,2)/100;
hml = ff3f(:,3)/100;
rf = ff3f(:,4)/100;

% Compute excess returns (portfolio return minus risk-free) for the 25 portfolios
rx = r./100 - repmat(rf,1,25);

% Run FF regressions on all portfolios
K = 3
T = size(rx,1)
X = [ones(T,1) rmrf hml smb];
b = X\rx;
e = rx-X*b;
sigma = cov(e);
u = rx-X*b;
s2 = (T-1)/(T-K-1)*var(u)';   % this is a vector of the variance of the errors

mx = inv(X'*X);
dmx = diag(mx); % we're interested in standard errors, i.e. the
                % diagonals of the covariance matrix of the b coefficients
siga = (s2*dmx(1)).^0.5;      % std err of alpha
sigb = (s2*dmx(2:end)').^0.5; % s2 is a column vector of 25. dmx' is a
                              % row vector corresponding to factors.
                              % this produces a 25x3 matrix: one row per
                              % portfolio, one column per factor.
sig_beta = sigb(:,1);
sig_h = sigb(:,2);
sig_s = sigb(:,3);

R2 = 1-s2./(std(rx).^2)';

% Pull out the regression factors
ff_alpha = b(1,:);
ff_beta = b(2,:);
h = b(3,:);
s = b(4,:);

% Calculate Arithmetic Mean for each of 25 portfolios over range
arithmeans = mean(r);

% Calculate Geometric Mean for each of 25 portfolios over selected range
georeturns = r./100 + 1;
geomeans = 100*(exp(mean(log(georeturns)))-1);

% Select if Geometric or Arithmetic mean is used by adjusting comments
%meanreturns = arithmeans;  % uncomment to use arithmetic means
meanreturns = geomeans;   % uncomment to use geometric means

% Expand 5x5 data to 10x10 for use in surface plot function
returns = [meanreturns ; meanreturns];
returns = reshape(returns,10,5);
returns = [returns;returns]
returns = reshape(returns,10,10);

% beta can be used for surface plot of beta
beta_ff = [ff_beta;ff_beta]
beta_ff = reshape(beta_ff,10,5);
beta_ff = [beta_ff;beta_ff];
beta_ff = reshape(beta_ff,10,10);

% s; s_ff can be used for surface plot of size factor
s_ff = [s;s]
s_ff = reshape(s_ff,10,5);
s_ff = [s_ff;s_ff];
s_ff = reshape(s_ff,10,10);

% h; h_ff can be used for surface plot of value factor
h_ff = [h;h]
h_ff = reshape(h_ff,10,5);
h_ff = [h_ff;h_ff];
h_ff = reshape(h_ff,10,10);

% Define x and y values
x = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];
y = [0 0.999 1 1.999 2 2.999 3 3.999 4 5];

% Create x-y mesh for surface plot
[xx,yy] = meshgrid(x,y);

% Generate Plot
surf(xx,yy,returns)
xlabel('Size','fontsize',20)
ylabel('Value','fontsize',20)
%zlabel('Arithmetic Average Monthly Return (%)','rotation',90,'fontsize',20)
%zlabel('Geometric Average Monthly Return (%)','rotation',90,'fontsize',20)
title(plottitle,'fontsize',36)
axis([0 5 0 5 min(0,min(meanreturns)-.1) max(2,max(meanreturns)+0.01)])

% Labels for corner portfolios
line([4.5 4.5],[0.5 0.5],[meanreturns(21) meanreturns(21)+0.1])
text(4.5,0.5,meanreturns(21)+0.15,'LG','horizontalalignment','center','fontsize',18)
line([4.5 4.5],[4.5 4.5],[meanreturns(25) meanreturns(25)+0.1])
text(4.5,4.5,meanreturns(25)+0.15,'LV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[4.5 4.5],[meanreturns(5) meanreturns(5)+0.1])
text(0.5,4.5,meanreturns(5)+0.15,'SV','horizontalalignment','center','fontsize',18)
line([0.5 0.5],[0.5 0.5],[meanreturns(1) meanreturns(1)+0.1])
text(0.5,0.5,meanreturns(1)+0.15,'SG','horizontalalignment','center','fontsize',18)

% ETFs
line([4.55 4.55],[1.1 1.1],[meanreturns(22) meanreturns(22)+0.15])
text(4.55,1.1,meanreturns(22)+0.18,'SPY','horizontalalignment','center','fontsize',18)

line([4.75 4.75],[1.25 1.25],[meanreturns(22) meanreturns(22)+0.1])
text(4.75,1.25,meanreturns(22)+0.15,'DIA','horizontalalignment','center','fontsize',18)

line([3.5 3.5],[0.1 0.1],[meanreturns(16) meanreturns(16)+0.1])
text(3.5,0.1,meanreturns(16)+0.15,'QQQQ','horizontalalignment','center','fontsize',18)

line([4.5 4.5],[1.5 1.5],[meanreturns(22) meanreturns(22)+0.18])
text(4.50,1.5,meanreturns(22)+0.2,'IVE*','horizontalalignment','center','fontsize',18)

line([1.6 1.6],[1.7 1.7],[meanreturns(7) meanreturns(7)+0.25])
text(1.6,1.7,meanreturns(7)+0.3,'IWM*','horizontalalignment','center','fontsize',18)

line([1.7 1.7],[3.55 3.55],[meanreturns(9) meanreturns(9)+0.1])
text(1.7,3.55,meanreturns(9)+0.15,'IWN','horizontalalignment','center','fontsize',18)

line([1.65 1.65],[1.85 1.85],[meanreturns(7) meanreturns(7)+0.1])
text(1.65,1.85,meanreturns(7)+0.15,'IJR','horizontalalignment','center','fontsize',18)

line([1.6 1.6],[2.85 2.85],[meanreturns(8) meanreturns(8)+0.1])
text(1.6,2.85,meanreturns(8)+0.15,'IJS','horizontalalignment','center','fontsize',18)

line([2.9 2.9],[4.6 4.6],[meanreturns(15) meanreturns(15)+0.1])
text(2.9,4.6,meanreturns(15)+0.15,'IYR','horizontalalignment','center','fontsize',18)

line([3.4 3.4],[1.25 1.25],[meanreturns(19) meanreturns(19)+0.1])
text(3.4,1.25,meanreturns(19)+0.15,'MDY','horizontalalignment','center','fontsize',18)

% Color range set from 0 to 1.6 rather than allowing autoscale.
% This is done for easier comparison between plots, but colors will
% max out for values above 1.6 or below 0.
% For arithmetic averages, I think a range of 0 to 2 works better
caxis([0 1.6]);
view(50, 25);
% top view
%view(270,90);
replot
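
The Octave script above lets you choose between arithmetic and geometric mean returns for the surface plot. The difference matters because volatile portfolios have a geometric mean noticeably below their arithmetic mean. Here is a minimal Python sketch of the two calculations, mirroring the geomeans formula in the script; the function names and the two-month example are made up for illustration:

```python
import math


def arithmetic_mean_pct(returns_pct):
    """Plain average of monthly returns given in percent."""
    return sum(returns_pct) / len(returns_pct)


def geometric_mean_pct(returns_pct):
    """Compound average monthly return in percent: 100*(exp(mean(log(1+r)))-1)."""
    growth_logs = [math.log(1.0 + r / 100.0) for r in returns_pct]
    return 100.0 * (math.exp(sum(growth_logs) / len(growth_logs)) - 1.0)


# Made-up example: up 10% one month, down 10% the next
monthly_pct = [10.0, -10.0]

arith = arithmetic_mean_pct(monthly_pct)
geo = geometric_mean_pct(monthly_pct)
print(arith, geo)
```

The arithmetic mean of this example is zero, while the geometric mean is slightly negative, because a 10% gain followed by a 10% loss leaves you with 99% of the starting value.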

Regression Result Details:

ETF Regressions (10-yr Monthly; November 2000 thru October 2010):

SPY:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0008725  0.0005321  -1.640    0.104
rmrf         0.9620172  0.0110702  86.901  < 2e-16 ***
smb         -0.1292276  0.0200597  -6.442 2.78e-09 ***
hml          0.0120373  0.0157662   0.763    0.447

Residual standard error: 0.005643 on 116 degrees of freedom
Multiple R-squared: 0.9862,     Adjusted R-squared: 0.9858
F-statistic:  2763 on 3 and 116 DF,  p-value: < 2.2e-16
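
As a quick sanity check on how to read these summaries: each t value is the estimate divided by its standard error, and the adjusted R-squared follows from the multiple R-squared, the number of observations (120 months) and the number of regressors (3). The Python sketch below recomputes both from the printed SPY figures; the function names are hypothetical, and agreement is only up to the rounding of the printed inputs.

```python
def t_value(estimate, std_error):
    """t statistic for a coefficient: estimate divided by its standard error."""
    return estimate / std_error


def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R-squared for a regression with an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# (estimate, std. error) pairs copied from the SPY summary above
spy = {
    "(Intercept)": (-0.0008725, 0.0005321),
    "rmrf": (0.9620172, 0.0110702),
    "smb": (-0.1292276, 0.0200597),
    "hml": (0.0120373, 0.0157662),
}

for name, (estimate, std_error) in spy.items():
    print(name, round(t_value(estimate, std_error), 3))

# 116 residual degrees of freedom + 4 estimated coefficients = 120 observations
print(round(adjusted_r_squared(0.9862, 120, 3), 4))
```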

DIA:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0009774  0.0013106   0.746  0.45730
rmrf         0.8925274  0.0272691  32.730  < 2e-16 ***
smb         -0.2162676  0.0494128  -4.377 2.65e-05 ***
hml          0.1189867  0.0388367   3.064  0.00272 **

Residual standard error: 0.0139 on 116 degrees of freedom
Multiple R-squared: 0.9074,     Adjusted R-squared: 0.905
F-statistic: 378.7 on 3 and 116 DF,  p-value: < 2.2e-16

QQQQ:

              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0005339  0.0021967   0.243  0.80839
rmrf         1.3260511  0.0457057  29.013  < 2e-16 ***
smb          0.2930439  0.0828208   3.538  0.00058 ***
hml         -0.9161538  0.0650942 -14.074  < 2e-16 ***

Residual standard error: 0.0233 on 116 degrees of freedom
Multiple R-squared: 0.9196,     Adjusted R-squared: 0.9175
F-statistic: 442.3 on 3 and 116 DF,  p-value: < 2.2e-16

IVE:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0020639  0.0009668  -2.135   0.0349 *
rmrf         0.9947619  0.0201148  49.454  < 2e-16 ***
smb         -0.0528948  0.0364488  -1.451   0.1494
hml          0.2691649  0.0286475   9.396 6.19e-16 ***

Residual standard error: 0.01025 on 116 degrees of freedom
Multiple R-squared: 0.9601,     Adjusted R-squared: 0.9591
F-statistic: 931.3 on 3 and 116 DF,  p-value: < 2.2e-16

IWM:

              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0019862  0.0008238  -2.411   0.0175 *
rmrf         0.9759772  0.0171401  56.941  < 2e-16 ***
smb          0.8246550  0.0310586  26.552  < 2e-16 ***
hml          0.1899842  0.0244109   7.783 3.25e-12 ***

Residual standard error: 0.008737 on 116 degrees of freedom
Multiple R-squared: 0.9804,     Adjusted R-squared: 0.9799
F-statistic:  1935 on 3 and 116 DF,  p-value: < 2.2e-16

IWN:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001577   0.001138  -1.386    0.169
rmrf         0.876021   0.023686  36.984   <2e-16 ***
smb          0.756015   0.042920  17.614   <2e-16 ***
hml          0.606810   0.033734  17.988   <2e-16 ***

Residual standard error: 0.01207 on 116 degrees of freedom
Multiple R-squared: 0.9588,     Adjusted R-squared: 0.9577
F-statistic:   899 on 3 and 116 DF,  p-value: < 2.2e-16

IJR:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001384   0.001156  -1.197    0.234
rmrf         0.913679   0.024063  37.971  < 2e-16 ***
smb          0.795130   0.043603  18.236  < 2e-16 ***
hml          0.301977   0.034270   8.812 1.43e-14 ***

Residual standard error: 0.01227 on 116 degrees of freedom
Multiple R-squared: 0.958,      Adjusted R-squared: 0.957
F-statistic: 882.8 on 3 and 116 DF,  p-value: < 2.2e-16

IJS:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001711   0.001294  -1.322    0.189
rmrf         0.914892   0.026932  33.971   <2e-16 ***
smb          0.845430   0.048802  17.324   <2e-16 ***
hml          0.483571   0.038356  12.607   <2e-16 ***

Residual standard error: 0.01373 on 116 degrees of freedom
Multiple R-squared: 0.9511,     Adjusted R-squared: 0.9499
F-statistic: 752.7 on 3 and 116 DF,  p-value: < 2.2e-16

IYR:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0007802  0.0039314   0.198   0.8430
rmrf        0.9250019  0.0817982  11.308  < 2e-16 ***
smb         0.4062537  0.1482219   2.741   0.0071 **
hml         0.8909395  0.1164971   7.648 6.54e-12 ***

Residual standard error: 0.04169 on 116 degrees of freedom
Multiple R-squared: 0.6629,     Adjusted R-squared: 0.6542
F-statistic: 76.05 on 3 and 116 DF,  p-value: < 2.2e-16

MDY:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009243  0.0011596   0.797    0.427
rmrf        0.9659343  0.0241262  40.037  < 2e-16 ***
smb         0.3705806  0.0437178   8.477 8.49e-14 ***
hml         0.1572083  0.0343607   4.575 1.20e-05 ***

Residual standard error: 0.0123 on 116 degrees of freedom
Multiple R-squared: 0.9501,     Adjusted R-squared: 0.9488
F-statistic: 736.8 on 3 and 116 DF,  p-value: < 2.2e-16

Fama-French 25 Portfolios – Regression Results (Nov. 2000 – October 2010)

ALPHA            Growth        2        3        4    Value
Small            -0.68%   -0.05%    0.01%    0.02%    0.16%
2                -0.22%    0.03%    0.18%   -0.09%   -0.30%
3                -0.08%    0.11%    0.30%    0.17%    0.37%
4                 0.17%    0.20%    0.01%    0.15%   -0.09%
Large             0.04%    0.12%   -0.11%   -0.27%   -0.18%

ALPHA (t-stat)   Growth        2        3        4    Value
Small             -2.41    -0.28     0.07     0.16     0.96
2                 -1.42     0.17     1.36    -0.67    -2.06
3                 -0.58     0.71     1.83     0.88     1.69
4                  1.40     1.28     0.03     0.75    -0.43
Large              0.41     0.86    -0.69    -2.06    -0.68

BETA             Growth        2        3        4    Value
Small              1.18     0.98     0.83     0.77     0.98
2                  1.09     0.90     0.84     0.88     1.04
3                  1.09     0.96     0.89     0.93     0.97
4                  1.06     0.96     1.02     0.98     1.11
Large              0.95     0.86     0.90     0.90     1.07

BETA (t-stat)    Growth        2        3        4    Value
Small            20.003   28.263   27.927   24.682   28.526
2                33.474   26.726   30.813   31.303   33.793
3                40.148   29.188   25.975   22.508   21.314
4                41.506   29.263   24.257   23.706   26.411
Large            52.533    30.54   28.014   33.053   19.277

h                Growth        2        3        4    Value
Small             -0.38     0.03     0.34     0.52     0.72
2                 -0.37     0.16     0.43     0.58     0.90
3                 -0.45     0.16     0.41     0.58     0.68
4                 -0.38     0.27     0.45     0.54     0.83
Large             -0.28     0.20     0.30     0.56     0.60

h (t-stat)       Growth        2        3        4    Value
Small             -4.52     0.61     8.08    11.65    14.77
2                 -7.91     3.25    10.96    14.29    20.67
3                -11.55     3.47     8.28     9.83    10.51
4                -10.31     5.78     7.53     9.17    13.87
Large            -10.96     5.04     6.59    14.36     7.59

s                Growth        2        3        4    Value
Small              1.18     1.07     0.94     0.98     1.06
2                  1.01     0.98     0.92     0.89     1.12
3                  0.74     0.57     0.47     0.43     0.66
4                  0.45     0.32     0.28     0.28     0.19
Large             -0.19    -0.12    -0.02    -0.10    -0.23

s (t-stat)       Growth        2        3        4    Value
Small             11.11    17.09    17.53    17.47    17.08
2                 17.15    16.12    18.43    17.47    20.16
3                 14.96     9.57     7.61     5.76     7.98
4                  9.63     5.37     3.66     3.80     2.49
Large             -5.95    -2.40    -0.32    -2.08    -2.29

R2               Growth        2        3        4    Value
Small              0.87     0.95     0.96     0.96     0.96
2                  0.93     0.93     0.92     0.91     0.90
3                  0.93     0.94     0.90     0.87     0.88
4                  0.93     0.95     0.87     0.87     0.92
Large              0.94     0.96     0.87     0.89     0.79

2 Responses to “Calculating the Fama-French Factor Loadings”

  1. Can you calculate the factor loadings of RSP? I am curious about its outperformance despite holding the same stocks as SPY.

  2. Hello. I am currently studying the Fama-French model as part of my dissertation,
    and I am getting great benefit from the information on this website.
    However, I have run into some difficulty with the above Octave code.

    In line 38, the dimensions of r are different from the dimensions of repmat(rf, 1, 25).
    The data for matrix r comes from the 4 columns of the factor file, while the data of the rf comes from the data of 5 portfolios and their intersection to generate 25 portfolios.

    How can I resolve this issue so the dimensions of r are the same as repmat(rf,1,25) and the code can work?

    Thank you very much,

    Patrick O’Rourke
