The most popular “factors” for analyzing equity returns are the three Fama-French factors (RMRF, HML and SMB). The RMRF factor is the market return minus the risk free rate, and the HML and SMB factors are created by sorting portfolios into several “value” and “size” buckets and forming long-short portfolios.

The three factors can be used to explain, though not predict, the returns for a variety of diversified portfolios. Many posts on this blog use the Fama-French 3 Factor (FF3F) model, including a tutorial on running the 3-factor regression using R.

An alternative way to construct factors is to use linear algebra to create “optimal” factors using a technique such as principal component analysis (PCA). This post will show how to construct the statistically optimal factors for the Fama-French 25 portfolios (sorted by size and value).

In my next post, I will compare these PCA factors to the Fama-French factors.

**Description of Data**

The data used for this analysis comes from the Kenneth French website. I’m using the Fama-French 25 (FF25) portfolio returns which are available in the file titled “25 Portfolios Formed on Size and Book-to-Market”. I’m using the returns from 1962 through 2012 since the pre-Compustat era portfolios have relatively few stocks.

The Fama-French factors are also available on the Kenneth French website in the file titled “Fama/French Factors”. In this post, I will use not use the Fama-French factors themselves, but I do use the factor data file to get the monthly risk-free rate.

For reference, the arithmetic average monthly returns of the FF25 portfolios are plotted for the date range used in this analysis. The Octave script to create this plot was provided in an earlier post.