Sharpe ratio. Information ratio. Jensen's alpha. Compound annual growth rate (CAGR). Average monthly or annual returns (or excess returns). These are the five major ways that people measure portfolio performance. I've already discussed these at length in a previous article. But I want now to offer something new—and better. This performance measure is simpler, more elegant, more robust, and, most importantly, more predictive than any other I've investigated. And I've investigated a lot of them.
I recently ran the following correlation study. Using Portfolio 123, I developed forty different ranking strategies, each consisting of five equally weighted factors. One, for example, ranked stocks on the past three years' accruals (the lower, the better), trailing twelve months' operating margin, price to sales, price to tangible book value, and unlevered free cash flow to enterprise value; another ranked stocks on annual EPS growth, operating income growth, free cash flow to assets, market cap (the lower, the better), and gross profit to enterprise value. For each strategy, I measured the returns of holding the fifty top-ranked stocks in the Russell 3000, rebalancing once a month with 0.5% slippage, for the following eight-year periods: December 1999 to December 2007 and August 2009 to August 2017.
I then correlated the ranks of these strategies over the two periods according to seven different measures. Here are the results. (I repeated this experiment with the S&P 1500 and the results were very similar; I also have obtained similar results with other universes and time periods.)
The measure that beats them all, hands down, is median excess return.
The median excess return is not the median of the return minus the median of the benchmark. It is the median of the strategy's excess returns over the benchmark, measured monthly (or weekly), which is usually an entirely different number. (The average excess return, on the other hand, is the same either way you measure it.)
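The distinction can be checked with a few lines of Python. This is only a sketch: the monthly returns below are made up for illustration and do not come from any fund or strategy in this article.

```python
from statistics import mean, median

# Hypothetical monthly returns (as decimals), invented for illustration only.
strategy  = [0.040, -0.020, 0.010, 0.060, -0.030]
benchmark = [0.010,  0.020, 0.030, -0.010, 0.000]

# Excess return is computed month by month, and only then summarized.
excess = [s - b for s, b in zip(strategy, benchmark)]

median_excess = median(excess)                                 # median of the differences
difference_of_medians = median(strategy) - median(benchmark)   # generally a different number

mean_excess = mean(excess)                                     # average of the differences
difference_of_means = mean(strategy) - mean(benchmark)         # same value as mean_excess

print(median_excess, difference_of_medians)
print(mean_excess, difference_of_means)
```

With these invented numbers the two medians disagree while the two averages agree, which is the point of the paragraph above.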
Why is median excess return more predictive of future returns than other performance measures? It's because of two essential features. First, it's very closely tied to the benchmark, and, as I pointed out in my previous article, there is nothing more correlative between the returns of two periods than their relationship to the benchmark (best measured by beta). Second, it effectively ignores outliers (those months with huge positive or negative returns), which have a very strong impact on other performance measures, but which are relatively useless for predicting future returns.
To illustrate the advantages of median excess return over Jensen's alpha, let's take FAIRX, the Fairholme Fund, which has a very actively managed and focused approach. Here is its five-year return including dividends compared to that of SPY, the S&P 500 ETF, assuming you invested the same amount in each fund. (I'm using the monthly adjusted close as given by Yahoo Finance, though this may not match other sources.)
You'll notice that FAIRX had a loss of 46% in November 2015, followed by a 48% rise in December (due to a massive dividend payment) and another 29% rise the following October.
FAIRX's alpha, measured over the last five years, is 4.60%. See the chart below. Alpha is the intercept of the ordinary least squares (OLS) linear regression of the returns of the fund plotted against the returns of the benchmark. The intercept is at 0.375%, which annualized is 4.60%.
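The annualization step is plain monthly compounding. Here is a minimal sketch, assuming twelve compounding periods per year; the helper name `annualize` is my own and not part of any library.

```python
def annualize(rate_per_period: float, periods_per_year: int = 12) -> float:
    """Compound a per-period return (as a decimal) into an annual return."""
    return (1.0 + rate_per_period) ** periods_per_year - 1.0

# The monthly OLS intercept quoted above, 0.375%, written as a decimal.
monthly_alpha = 0.00375
annual_alpha = annualize(monthly_alpha)

print(f"{annual_alpha:.2%}")
```

The result lands very close to the 4.60% quoted above; any small gap would presumably come from rounding the monthly intercept to three decimals.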
So according to Jensen's alpha, FAIRX performed better than SPY. But according to every other performance measure I can think of, FAIRX performed worse than SPY. FAIRX's median excess return, for example, is –0.74%, or –8.56% annualized. Looking at the first chart, which do you think is more representative of FAIRX's five-year performance?
Why is FAIRX's alpha so high? It's because of those three outliers—the 46% loss and the 48% and 29% gains—the three dots on the second chart that are way outside the center. If you eliminate those from the data, FAIRX's alpha falls to –4.51% annualized. If you eliminate only the 48% rise, the alpha falls to –14.71%. Outliers have a huge impact on alpha. On the other hand, they don't affect the median excess return much at all. Eliminating the 48% rise lowers the median excess return from –8.56% to –9.09%, as does eliminating all three outliers.
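To see the mechanism in isolation, here is a small sketch using invented data (not FAIRX's actual returns): a hypothetical fund that trails its benchmark by one percentage point every month, plus a single 48% month. The function name `ols_alpha_beta` is my own.

```python
from statistics import mean, median

def ols_alpha_beta(x, y):
    """Intercept (alpha) and slope (beta) of the OLS line y = alpha + beta * x."""
    mx, my = mean(x), mean(y)
    beta = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))
    return my - beta * mx, beta

# Hypothetical monthly returns; the last fund month is a deliberate outlier.
bench = [0.01, 0.02, -0.01, 0.03, 0.00, 0.02, -0.02, 0.01]
fund  = [0.00, 0.01, -0.02, 0.02, -0.01, 0.01, -0.03, 0.48]

def summarize(x, y):
    """Return (OLS alpha, median excess return) for the given monthly series."""
    alpha, _ = ols_alpha_beta(x, y)
    med_excess = median(yi - xi for xi, yi in zip(x, y))
    return alpha, med_excess

with_outlier = summarize(bench, fund)
without_outlier = summarize(bench[:-1], fund[:-1])

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
```

In this toy data the single outlier flips the OLS alpha from negative to strongly positive, while the median excess return stays where it was.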
A number of alternative regression methods are more robust than ordinary least squares, which gives much more weight to points on the outer edges of the grid than to points closer to the center. One is least absolute deviation (LAD) regression. Instead of drawing a line that minimizes the sum of the squares of the vertical distances to the observed points, you minimize the sum of the absolute vertical distances. To stick to the example above, the LAD alpha would be –0.56% annualized, and the slope (beta) would be 0.78, which is steeper than the OLS beta of 0.61.
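For small samples, LAD can be sketched by brute force. This is not how a production solver (or the Excel plug-in mentioned later) works; it relies on the known property that some best-fitting LAD line passes through at least two of the observed points, so trying every pair is enough. The data are invented and the function name `lad_line` is my own.

```python
from itertools import combinations

def lad_line(x, y):
    """Least-absolute-deviation fit of y = alpha + beta * x by brute force.

    Tries the line through every pair of points and keeps the one with the
    smallest sum of absolute vertical distances. Cost grows with the cube of
    the number of points, so this is only practical for small samples.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # vertical line: slope undefined, skip this pair
        beta = (y2 - y1) / (x2 - x1)
        alpha = y1 - beta * x1
        cost = sum(abs(yi - (alpha + beta * xi)) for xi, yi in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, alpha, beta)
    _, alpha, beta = best
    return alpha, beta

# Hypothetical monthly returns; the last fund month is a deliberate outlier.
bench = [0.01, 0.02, -0.01, 0.03, 0.00, 0.02, -0.02, 0.01]
fund  = [0.00, 0.01, -0.02, 0.02, -0.01, 0.01, -0.03, 0.48]

lad_alpha, lad_beta = lad_line(bench, fund)
print(lad_alpha, lad_beta)
```

On this toy data the LAD line follows the seven ordinary months and ignores the 48% month, giving an alpha of about -1% per month and a beta of about one.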
Another alternative method is Theil-Sen regression. Basically, you connect every single observed point to all the other points. Then you take the median slope of all those little lines to get your beta. Lastly, to get alpha, you take the median of all the values of y_i – m·x_i, where m is your beta. Now when m is one, the Theil-Sen alpha is the same as the median excess return. When m is less than one, the median excess return will exceed the Theil-Sen alpha, which means that aiming for the highest median excess return will favor low-beta strategies over high-beta strategies. In general, a strategy based on a high median excess return does well in both bull and bear markets. Since we're probably near the end of a bull market, that's not a bad place to stand.
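The recipe above translates almost word for word into code. A minimal sketch on invented data follows; the function name `theil_sen` is my own, and pairs with identical x values are skipped because their slope is undefined.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Theil-Sen fit: beta is the median pairwise slope,
    alpha is the median of y_i - beta * x_i."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
              if x1 != x2]
    beta = median(slopes)
    alpha = median(yi - beta * xi for xi, yi in zip(x, y))
    return alpha, beta

# Hypothetical monthly returns; the last fund month is a deliberate outlier.
bench = [0.01, 0.02, -0.01, 0.03, 0.00, 0.02, -0.02, 0.01]
fund  = [0.00, 0.01, -0.02, 0.02, -0.01, 0.01, -0.03, 0.48]

ts_alpha, ts_beta = theil_sen(bench, fund)
median_excess = median(f - b for f, b in zip(fund, bench))

print(ts_alpha, ts_beta, median_excess)
```

Because the beta in this toy example comes out at essentially one, the Theil-Sen alpha and the median excess return coincide, as described above.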
Calculating Theil-Sen alpha is actually what inspired me to calculate median excess returns. Most of the strategies I was testing had betas close to one, so the differences between Theil-Sen alpha and median excess returns were tiny. And the latter was a whole lot easier to compute. You see, unfortunately, both Theil-Sen and LAD regression are computationally difficult. You can calculate OLS alpha by hand or with a calculator. (You take the product of the sum of the squares of the x values and the sum of the y values, subtract the product of the sum of the x values and the sum of the products of the x and y values, then divide all that by the difference between the number of points times the sum of the squares of the x values and the square of the sum of the x values. Now say that three times fast.) Calculating LAD regression involves estimating the line hundreds of times to find the best fit, and in rare cases you'll find two best fits. (Happily, you can download an Excel plug-in that does this for you here.) Calculating Theil-Sen alpha becomes extremely cumbersome if you have lots of points; if you have a hundred pieces of data, you have to calculate the slopes of almost five thousand lines and take their median, and if you have five hundred pieces of data, you'll need to calculate roughly 125,000 slopes. I normally work with thirteen-bar returns over eighteen years, measured daily, so that's over four thousand points, or about eight million slopes.
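The tongue-twister for the OLS intercept, and the number of pairwise slopes Theil-Sen needs, can both be written out directly. This is a sketch; the function names are my own and the three-point data set is a made-up check case lying exactly on the line y = 1 + 2x.

```python
from math import comb

def ols_intercept_by_sums(x, y):
    """OLS intercept from raw sums:
    (sum(x^2) * sum(y) - sum(x) * sum(x*y)) / (n * sum(x^2) - sum(x)^2)."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return (sum_xx * sum_y - sum_x * sum_xy) / (n * sum_xx - sum_x ** 2)

def pairwise_slope_count(n_points):
    """Number of distinct pairs of points, n * (n - 1) / 2."""
    return comb(n_points, 2)

# Made-up check case: points on y = 1 + 2x, so the intercept should be 1.
intercept = ols_intercept_by_sums([1, 2, 3], [3, 5, 7])

for n in (100, 500, 4000):
    print(n, pairwise_slope_count(n))
```

The pair count grows with the square of the number of points, which is why Theil-Sen gets cumbersome so quickly on daily data.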
Calculating median excess returns is a hell of a lot simpler than any of that. And the results are excellent. But why are they better than CAGR, the Sharpe ratio, or the information ratio?
Once again, outliers play a huge role in all three measures. Take, for example, the following chart, which shows the performance of my own investments since November 2015 (when I started evaluating stocks based on a ranking system at Portfolio 123) compared to SPY, the S&P 500 ETF.
Now look at my returns the summer before last. In June, July, and August 2016 I made 7.82%, 4.88%, and 14.69%—enormous monthly gains. Why? In part it's because I hold only fifteen to twenty stocks at a time, and one of them, Lantheus Holdings (LNTH), went from $1.94 when I bought it at the end of March to $9.36 when I sold it in early September. Now that is an outlier.
Should LNTH's crazy outperformance (a 382% rise) really be counted in full in any estimation of my future returns? Do I really deserve an alpha of 36% or a CAGR of 43%? A Sharpe ratio of 2.33? An information ratio of 1.74? Or is my median excess return (1.35% per month, or 17.4% annualized) more representative of my actual performance?
Their sensitivity to outliers isn't the only problem with these conventional measures of performance. CAGR and the Sharpe ratio are not tied to any benchmark; they measure absolute returns. Since the most persistent quality of returns is their relationship to the benchmark, that seems like an absurd way of thinking about performance.
As for the information ratio, which is also based on excess monthly returns, its weaker predictive performance is in large part due to the fact that using standard deviation as a denominator in a fraction is always going to result in a grave distortion of your data. This idea of mine goes against everything you learn in basic statistics and finance classes—not only does it invalidate the Sharpe and information ratios, it invalidates t-tests and Z-scores and Sortino ratios and probably a dozen other things beloved by quants worldwide. But I have five good reasons for it.
- Unfortunately, stock market returns are not normally distributed. Here is an excellent article on how assuming a normal distribution completely messes up expectations for standard deviation.
- Measuring standard deviation relies on the distance from the mean, and the mean is never a good measure for returns. A 50% gain followed by a 33% loss brings you a total gain of roughly 0%, but a mean of 8.5%. Do this a lot and your standard deviation will be measuring the distance from a number that is far higher than it should be.
- Standard deviation, like OLS alpha, is built on squared deviations, which gives far more weight to outliers than they deserve.
- Placing standard deviation in the denominator gives it far more weight than it should have based on its persistence, which is relatively low. In other words, the correlation of the standard deviation of a strategy or fund or portfolio between two different periods is not very high—certainly not high enough to use it as the denominator in a fraction, which has a huge impact on your measure.
- You can't meaningfully apply the measure when you're dealing with certain kinds of funds with extremely low standard deviations (e.g. short-term bond funds), as the ratio blows up sky-high.
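Two of the points in the list above are easy to check numerically. The sketch below uses the 50% gain and 33% loss from the list, plus two invented excess-return series; the ratio shown is a plain per-period mean divided by sample standard deviation, not annualized, and the function name is my own.

```python
from math import prod
from statistics import mean, stdev

# A 50% gain followed by a 33% loss, as in the list above.
returns = [0.50, -0.33]
arithmetic_mean = mean(returns)                   # what the standard deviation is measured around
total_return = prod(1 + r for r in returns) - 1   # what you actually ended up with

def mean_over_stdev(excess):
    """Per-period mean excess return divided by its sample standard deviation."""
    return mean(excess) / stdev(excess)

# Hypothetical monthly excess returns, invented for illustration.
volatile_fund = [0.03, -0.01, 0.02, -0.02, 0.03]             # averages 1% a month
sleepy_fund   = [0.0011, 0.0009, 0.0010, 0.0011, 0.0009]     # averages 0.1% a month

print(arithmetic_mean, total_return)
print(mean_over_stdev(volatile_fund), mean_over_stdev(sleepy_fund))
```

The fund with one tenth of the average excess return ends up with a ratio more than twenty times higher, purely because its denominator is tiny.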
But that was a digression. To get back to median excess returns, I wanted to make sure this measure had real-life applications, so I did another experiment. I took fifty actively managed mutual funds with the same manager for the last twenty-odd years, funds like the Franklin MicroCap Value Fund (FRMCX), the RBC Small Cap Core Fund (RCSIX), and the Bridgeway Aggressive Investors 1 Fund (BRAGX). I looked at their monthly returns going back to January 1996 and performed a four-pronged eight-year rolling correlation test, comparing their 1996-2004 returns with their 2004-2012 returns, their 1998-2006 returns with their 2006-2014 returns, and so on. As a benchmark, I used the Vanguard Total Stock Market Index Fund (VTSMX). I then deleted funds whose adjusted returns had a 0.99 correlation to those of the benchmark, because including ranking results for funds like those would be irrelevant and confusing. I was left with forty funds.
Here are my results:
These results were pretty consistent whether I used thirty, forty, or fifty funds.
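For readers who want to run this kind of persistence test themselves, here is a minimal sketch. It assumes a Spearman-style rank correlation (Pearson correlation applied to ranks), which is my reading of "correlating the ranks" rather than a documented specification of the test above. The fund values are invented and the function names are my own.

```python
from statistics import mean

def ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def rank_correlation(first_period, second_period):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(first_period), ranks(second_period))

# Hypothetical median excess returns for five funds in two consecutive periods.
period_1 = [0.012, 0.004, -0.003, 0.008, 0.001]
period_2 = [0.009, 0.002, 0.000, 0.010, -0.001]

persistence = rank_correlation(period_1, period_2)
print(persistence)
```

Running the same function on each performance measure in turn, and on each pair of rolling periods, gives one persistence number per measure to compare.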
Here median excess returns were not as correlative as simple median returns (I attribute that to the fact that comparing a large-cap and a microcap fund to the same benchmark is bound to cause mischief). But both of them were more correlative than any other measure. Now if you want high performance, you'd probably be better off choosing the fund with the worst average yearly return than the one with the best median return. Don't look to me for an explanation of why funds that outperform in one decade usually underperform in the next and vice versa. But the two things that are pretty consistent about their performance are beta (which had a correlation of 0.47) and median returns. LAD alpha performs pretty well too, but every other measure had a zero or negative correlation to the same measure in the next period. These measures have very low persistence.
Now is anyone besides me using median excess returns as a performance measure these days?
Well, yes and no. You see the term crop up a lot when folks are measuring the performance of a large group of funds. Vanguard, for instance, uses it in their papers to compare the median excess return of "go anywhere funds" to a benchmark, or their actively managed funds to those of other companies. But nobody is talking about it as a way to compare just one fund or strategy to another; unless I'm overlooking someone, nobody is suggesting that it replace alpha or the Sharpe ratio as a standard measure of portfolio performance.
So I guess that's my job. Go for it, ladies and gentlemen. I think you'll find your results will be significantly better than average.
My ten largest holdings right now: AVDL, CRNT, ALSK, ELMD, AMSWA, CVV, VHI, SMMT, KMDA, BBRG.
CAGR since 1/1/2016: 50%.