Sharpe ratio. Information ratio. Jensen's alpha. Compound annual growth rate (CAGR). Average monthly or annual returns (or excess returns). These are the five major ways that people measure portfolio performance. I've already discussed these at length in a previous article. But I want now to offer something new—and better. This performance measure is simpler, more elegant, more robust, and, most importantly, more predictive than any other I've investigated. And I've investigated a lot of them.
I recently ran the following correlation study. Using Portfolio 123, I developed forty different ranking strategies, each consisting of five equally weighted factors. One, for example, ranked stocks on the past three years' accruals (the lower, the better), trailing twelve months' operating margin, price to sales, price to tangible book value, and unlevered free cash flow to enterprise value; another ranked stocks on annual EPS growth, operating income growth, free cash flow to assets, market cap (the lower, the better), and gross profit to enterprise value. For each strategy, I measured the returns of holding the fifty top-ranked stocks in the Russell 3000, rebalancing once a month with 0.5% slippage, for the following eight-year periods: December 1999 to December 2007 and August 2009 to August 2017.
I then correlated the ranks of these strategies over the two periods according to seven different measures. Here are the results. (I repeated this experiment with the S&P 1500 and the results were very similar; I also have obtained similar results with other universes and time periods.)
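For readers who want to try this kind of study themselves, here is a minimal sketch of a rank correlation between two periods. It assumes a Spearman-style calculation (rank each strategy's score within each period, then take the Pearson correlation of those ranks); the article does not say which formula was used, and the function names and sample numbers are illustrative only.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation; fails if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(period1_scores, period2_scores):
    """Spearman-style correlation of one measure across two periods."""
    return pearson(average_ranks(period1_scores), average_ranks(period2_scores))


# Hypothetical scores for four strategies on one measure, in two periods.
first_period = [0.02, 0.05, 0.01, 0.03]
second_period = [0.01, 0.04, 0.02, 0.03]
print(rank_correlation(first_period, second_period))
```

A measure whose ranks barely move between the two periods scores close to 1; a measure with no persistence scores close to 0.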
The measure that beats them all, hands down, is median excess return.
Why is median excess return more predictive of future returns than other performance measures? It's because of two essential features. First, it's very closely tied to the benchmark, and, as I pointed out in my previous article, nothing about a strategy's returns carries over from one period to the next as reliably as their relationship to the benchmark (best measured by beta). Second, it effectively ignores outliers (those months with huge positive or negative returns), which have a very strong impact on other performance measures but are relatively useless for predicting future returns.
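The calculation itself is short. The sketch below assumes that "excess return" means the portfolio's monthly return minus the benchmark's return for the same month, and that annualizing means compounding the monthly figure over twelve months (which is consistent with the annualized figures quoted in this article). The function names and sample data are made up for illustration.

```python
from statistics import median


def median_excess_return(portfolio_returns, benchmark_returns):
    """Median of the month-by-month differences between portfolio and benchmark."""
    if len(portfolio_returns) != len(benchmark_returns):
        raise ValueError("both series must cover the same months")
    return median(p - b for p, b in zip(portfolio_returns, benchmark_returns))


def annualize(monthly_return, periods_per_year=12):
    """Compound a per-period return into an annual one."""
    return (1 + monthly_return) ** periods_per_year - 1


# Hypothetical monthly returns; the third month is a deliberate outlier.
portfolio = [0.03, -0.01, 0.50, 0.02, 0.00]
benchmark = [0.02, 0.01, 0.01, 0.01, 0.01]

monthly = median_excess_return(portfolio, benchmark)
print(f"median excess return: {monthly:.2%} per month, {annualize(monthly):.2%} annualized")
```

Note that the 50% month has no more influence on the result than any other month that beat the benchmark.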
To illustrate the advantages of median excess return over Jensen's alpha, let's take FAIRX, the Fairholme Fund, which has a very actively managed and focused approach. Here is its five-year return including dividends compared to that of SPY, the S&P 500 ETF, assuming you invested the same amount in each fund. (I'm using the monthly adjusted close as given by Yahoo Finance, though this may not match other sources.)
You'll notice that FAIRX had a loss of 46% in November 2015, followed by a 48% rise in December (due to a massive dividend payment) and another 29% rise the following October.
FAIRX's alpha, measured over the last five years, is 4.60%. See the chart below. Alpha is the intercept of the ordinary least squares (OLS) linear regression of the returns of the fund plotted against the returns of the benchmark. The intercept is at 0.375%, which annualized is 4.60%.
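As a rough sketch of that calculation, the code below fits an OLS line to monthly fund returns against monthly benchmark returns and annualizes the intercept by compounding. It follows the description above and regresses raw returns; textbook Jensen's alpha subtracts a risk-free rate from both series first, which is omitted here as a simplifying assumption. The names and sample data are illustrative.

```python
def ols_alpha_beta(fund_returns, benchmark_returns):
    """Return (alpha, beta): intercept and slope of the OLS fit of fund on benchmark."""
    n = len(fund_returns)
    mean_x = sum(benchmark_returns) / n
    mean_y = sum(fund_returns) / n
    sxx = sum((x - mean_x) ** 2 for x in benchmark_returns)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(benchmark_returns, fund_returns))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def annualize(monthly_return, periods_per_year=12):
    """Compound a per-period figure into an annual one."""
    return (1 + monthly_return) ** periods_per_year - 1


# Hypothetical data lying exactly on the line y = 0.002 + 0.5x.
benchmark = [-0.02, 0.0, 0.01, 0.03]
fund = [0.002 + 0.5 * x for x in benchmark]

alpha, beta = ols_alpha_beta(fund, benchmark)
print(f"monthly alpha {alpha:.3%}, beta {beta:.2f}, annualized alpha {annualize(alpha):.2%}")
```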
So according to Jensen's alpha, FAIRX performed better than SPY. But according to every other performance measure I can think of, FAIRX performed worse than SPY. FAIRX's median excess return, for example, is –0.74%, or –8.56% annualized. Looking at the first chart, which do you think is more representative of FAIRX's five-year performance?
Why is FAIRX's alpha so high? It's because of those three outliers—the 46% loss and the 48% and 29% gains—the three dots on the second chart that are way outside the center. If you eliminate those from the data, FAIRX's alpha falls to –4.51% annualized. If you eliminate only the 48% rise, the alpha falls to –14.71%. Outliers have a huge impact on alpha. On the other hand, they don't affect the median excess return much at all. Eliminating the 48% rise lowers the median excess return from –8.56% to –9.09%, as does eliminating all three outliers.
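The same effect is easy to reproduce on toy data. The sketch below uses made-up numbers (not FAIRX's actual returns): a fund that trails its benchmark by half a percent every month except one, in which it gains 48%. It then compares OLS alpha and median excess return with and without that month.

```python
from statistics import median


def ols_alpha(fund, bench):
    """Intercept of the OLS fit of fund returns on benchmark returns."""
    n = len(fund)
    mean_x, mean_y = sum(bench) / n, sum(fund) / n
    sxx = sum((x - mean_x) ** 2 for x in bench)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bench, fund))
    return mean_y - (sxy / sxx) * mean_x


def median_excess(fund, bench):
    """Median of fund return minus benchmark return."""
    return median(f - b for f, b in zip(fund, bench))


def without(series, skipped_months):
    """Copy of the series with the given month indexes removed."""
    return [v for i, v in enumerate(series) if i not in skipped_months]


# Made-up monthly returns: the fund lags the benchmark by 0.5% every month...
bench = [0.01, 0.02, -0.01, 0.03, 0.00, 0.01, -0.02, 0.02, 0.01, 0.00, 0.015]
fund = [b - 0.005 for b in bench]
fund[4] = 0.48  # ...except for one outlier month.
outliers = {4}

alpha_all = ols_alpha(fund, bench)
alpha_trimmed = ols_alpha(without(fund, outliers), without(bench, outliers))
median_all = median_excess(fund, bench)
median_trimmed = median_excess(without(fund, outliers), without(bench, outliers))

print(f"alpha:         {alpha_all:+.3%} with outlier, {alpha_trimmed:+.3%} without")
print(f"median excess: {median_all:+.3%} with outlier, {median_trimmed:+.3%} without")
```

In this toy case a single month flips the sign of alpha, while the median excess return does not move.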
A number of alternative methods of linear regression are more robust than ordinary least squares, which gives much more weight to points on the outer edges of the grid than to points closer to the center. One is least absolute deviation (LAD) regression. Instead of drawing a line that minimizes the sum of the squares of the vertical distances to the observed points, you minimize the sum of the absolute vertical distances. To stick to the example above, the LAD alpha would be –0.56% annualized, and the slope (beta) would be 0.78, which is steeper than the OLS beta of 0.61.
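Here is one way LAD regression could be sketched in a few lines. It is a brute-force approach, not necessarily how the figures above were computed: it relies on the property that, for a straight-line fit, at least one optimal LAD line passes through two of the data points, so it tries every pair and keeps the line with the smallest total absolute error. That is fine for sixty monthly returns but slow for large data sets, where a linear-programming or quantile-regression routine would be the usual choice. Names and data are illustrative.

```python
from itertools import combinations


def lad_alpha_beta(fund_returns, benchmark_returns):
    """Return (alpha, beta) of a line minimizing the sum of absolute residuals."""
    points = list(zip(benchmark_returns, fund_returns))
    best = None  # (total absolute error, alpha, beta)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression line
        beta = (y2 - y1) / (x2 - x1)
        alpha = y1 - beta * x1
        cost = sum(abs(y - (alpha + beta * x)) for x, y in points)
        if best is None or cost < best[0]:
            best = (cost, alpha, beta)
    if best is None:
        raise ValueError("need at least two distinct benchmark returns")
    return best[1], best[2]


# Hypothetical data on the line y = -0.001 + 0.8x, plus one 48% outlier month.
benchmark = [-0.03, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03]
fund = [-0.001 + 0.8 * x for x in benchmark]
fund[3] = 0.48

alpha, beta = lad_alpha_beta(fund, benchmark)
print(f"LAD alpha {alpha:.3%} per month, beta {beta:.2f}")
```

On this toy data the fit recovers the underlying line and leaves the outlier as one large residual.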
Another alternate method is Theil-Sen regression. Basically, you connect every single observed point to all the other points. Then you take the median slope of all those little lines to get your beta. Lastly, to get alpha, you take the median of all the values of y_i − m·x_i, where m is your beta. Now when m is one, the Theil-Sen alpha is the same as the median excess return. When m is less than one, the median excess return will exceed the Theil-Sen alpha, which means that aiming for the highest median excess return will favor low-beta strategies over high-beta strategies. In general, a strategy built around a high median excess return does well in both bull and bear markets. Since we're probably near the end of a bull market, that's not a bad place to stand.
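The recipe translates almost line for line into code. This is a minimal sketch with illustrative names and made-up data; the helper `intercept_for_slope` is introduced here only to show that fixing the slope at one gives back the median excess return.

```python
from itertools import combinations
from statistics import median


def intercept_for_slope(fund_returns, benchmark_returns, slope):
    """Median of y_i - slope * x_i over all months."""
    return median(y - slope * x for x, y in zip(benchmark_returns, fund_returns))


def theil_sen_alpha_beta(fund_returns, benchmark_returns):
    """Return (alpha, beta) using the median of all pairwise slopes."""
    points = list(zip(benchmark_returns, fund_returns))
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x1 != x2]  # skip pairs that would give a vertical line
    if not slopes:
        raise ValueError("need at least two distinct benchmark returns")
    beta = median(slopes)
    alpha = intercept_for_slope(fund_returns, benchmark_returns, beta)
    return alpha, beta


# Hypothetical data on the line y = -0.001 + 0.8x, plus one 48% outlier month.
benchmark = [-0.03, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03]
fund = [-0.001 + 0.8 * x for x in benchmark]
fund[3] = 0.48

alpha, beta = theil_sen_alpha_beta(fund, benchmark)
print(f"Theil-Sen alpha {alpha:.3%} per month, beta {beta:.2f}")

# With the slope fixed at one, the intercept is simply the median excess return.
print(intercept_for_slope(fund, benchmark, 1.0))
```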
Calculating median excess returns is a hell of a lot simpler than any of that. And the results are excellent. But why are they better than CAGR, the Sharpe ratio, or the information ratio?
Once again, outliers play a huge role in all three measures. Take, for example, the following chart, which shows the performance of my own investments since November 2015 (when I started evaluating stocks based on a ranking system at Portfolio 123) compared to SPY, the S&P 500 ETF.
Now look at my returns the summer before last. In June, July, and August 2016 I made 7.82%, 4.88%, and 14.69%—enormous monthly gains. Why? In part it's because I hold only fifteen to twenty stocks at a time, and one of them, Lantheus Holdings (LNTH), went from $1.94 when I bought it at the end of March to $9.36 when I sold it in early September. Now that is an outlier.
Should LNTH's crazy outperformance (a 382% rise) really be counted in full in any estimation of my future returns? Do I really deserve an alpha of 36% or a CAGR of 43%? A Sharpe ratio of 2.33? An information ratio of 1.74? Or is my median excess return (1.35% per month, or 17.4% annualized) more representative of my actual performance?
Their sensitivity to outliers isn't the only problem with these conventional measures of performance. CAGR and the Sharpe ratio aren't tied to any benchmark. They measure absolute returns. Since the most persistent quality of returns is their relationship to the benchmark, that seems like an absurd way of thinking about performance.
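For reference, here is a sketch of how these three conventional measures are commonly computed from monthly returns. The conventions are assumptions, since the article does not spell out its own: ratios are annualized by multiplying by the square root of twelve, the sample standard deviation is used, and the risk-free rate defaults to zero. Note that only the information ratio takes a benchmark as input.

```python
from statistics import mean, stdev


def cagr(monthly_returns, periods_per_year=12):
    """Compound annual growth rate from a series of periodic returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1 + r
    return growth ** (periods_per_year / len(monthly_returns)) - 1


def sharpe_ratio(monthly_returns, monthly_risk_free=0.0, periods_per_year=12):
    """Mean return over the risk-free rate divided by its standard deviation, annualized."""
    excess = [r - monthly_risk_free for r in monthly_returns]
    return mean(excess) / stdev(excess) * periods_per_year ** 0.5


def information_ratio(monthly_returns, benchmark_returns, periods_per_year=12):
    """Mean return over the benchmark divided by its standard deviation, annualized."""
    active = [r - b for r, b in zip(monthly_returns, benchmark_returns)]
    return mean(active) / stdev(active) * periods_per_year ** 0.5


# Hypothetical monthly returns for a portfolio and its benchmark.
portfolio = [0.02, 0.04, -0.01, 0.03, 0.01, 0.05]
benchmark = [0.01, 0.01, -0.02, 0.02, 0.01, 0.02]

print(f"CAGR {cagr(portfolio):.1%}")
print(f"Sharpe ratio {sharpe_ratio(portfolio):.2f}")
print(f"information ratio {information_ratio(portfolio, benchmark):.2f}")
```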
As for the information ratio, which is also based on excess monthly returns, its weaker predictive performance is in large part due to the fact that using standard deviation as a denominator in a fraction is always going to result in a grave distortion of your data. This idea of mine goes against everything you learn in basic statistics and finance classes—not only does it invalidate the Sharpe and information ratios, it invalidates t-tests and Z-scores and Sortino ratios and probably a dozen other things beloved by quants worldwide. But I have five good reasons for it.
- Unfortunately, stock market returns are not normally distributed. Here is an excellent article on how assuming a normal distribution completely messes up expectations for standard deviation.
- Measuring standard deviation relies on the distance from the mean, and the mean is never a good measure for returns. A 50% gain followed by a 33% loss brings you a total gain of 0%, but a mean of 8.5%. Do this a lot and your standard deviation will be measuring the distance from a number that is far higher than it should be.
- Standard deviation is, like alpha, an OLS measure, giving far more weight to outliers than they deserve.
- Placing standard deviation in the denominator gives it far more weight than it should have based on its persistence, which is relatively low. In other words, the correlation of the standard deviation of a strategy or fund or portfolio between two different periods is not very high—certainly not high enough to use it as the denominator in a fraction, which has a huge impact on your measure.
- You can't meaningfully apply the measure when you're dealing with certain kinds of funds with extremely low standard deviations (e.g. short-term bond funds), as the ratio blows up sky-high.
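Two of these points are easy to check numerically. The sketch below compares the arithmetic mean of a gain-then-loss sequence with the per-period return you actually compounded, and then shows how dividing by a tiny standard deviation inflates a ratio. All the numbers are made up, and the helper names are illustrative.

```python
from statistics import mean, stdev


def compounded_mean_return(returns):
    """Per-period return that, compounded, reproduces the total growth."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth ** (1 / len(returns)) - 1


def mean_over_stdev(returns):
    """Average return divided by its sample standard deviation."""
    return mean(returns) / stdev(returns)


# A 50% gain followed by a loss of one third leaves you exactly where you started,
# yet the arithmetic mean of the two returns is about 8.3%.
round_trip = [0.50, -1 / 3]
print(f"arithmetic mean {mean(round_trip):.2%}, compounded mean {compounded_mean_return(round_trip):.2%}")

# A steady, low-return series versus a volatile, higher-return one (hypothetical).
bond_like = [0.0021, 0.0020, 0.0022, 0.0021]
stock_like = [0.05, -0.03, 0.04, -0.02]
print(f"bond-like ratio {mean_over_stdev(bond_like):.1f}, stock-like ratio {mean_over_stdev(stock_like):.2f}")
```

The bond-like series earns a small fraction of the stock-like series' average return, yet its ratio is far larger, purely because its denominator is so small.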
Here are my results:
These results were pretty consistent whether I used thirty, forty, or fifty funds.
Here median excess returns were not as correlative as simple median returns (I attribute that to the fact that comparing a large-cap and a microcap fund to the same benchmark is bound to cause mischief). But both of them were more correlative than any other measure. Now if you want high performance, you'd probably be better off choosing the fund with the worst average yearly return than the one with the best median return. Don't look to me for an explanation of why funds that outperform in one decade usually underperform in the next and vice-versa. But the two things that are pretty consistent about their performance are beta (which had a correlation of 0.47) and median returns. LAD alpha performs pretty well too, but every other measure had a zero or negative correlation to the same measure in the next period. These measures have very low persistence.
Well, yes and no. You see the term crop up a lot when folks are measuring the performance of a large group of funds. Vanguard, for instance, uses it in their papers to compare the median excess return of "go anywhere funds" to a benchmark, or their actively managed funds to those of other companies. But nobody is talking about it as a way to compare just one fund or strategy to another; unless I'm overlooking someone, nobody is suggesting that it replace alpha or the Sharpe ratio as a standard measure of portfolio performance.
So I guess that's my job. Go for it, ladies and gentlemen. I think you'll find your results will be significantly better than average.
My ten largest holdings right now: AVDL, CRNT, ALSK, ELMD, AMSWA, CVV, VHI, SMMT, KMDA, BBRG.
CAGR since 1/1/2016: 50%.
Hi Tylor,
While I've been binge-reading your blog over the past couple of days and enjoying it greatly, I see an issue with this claim:
"A 50% gain followed by a 33% loss brings you a total gain of 0%, but a mean of 8.5%" - how so? The adequate mean in this case is the geometric mean and it has to be calculated not on the raw data but by converting it to growth percentages/proportions first due to the fact geometric means cannot be computed with negative numbers. So 50% becomes 150% and -33% becomes 67% The result is then a geometric mean of 0.249688278817% which exactly matches the average return after two periods wherein you gain 50% and then lose 33% ($100 x 1.00249688278817 x 1.00249688278817 = $100.4999999999998). Here is a calculator I've coded a few years ago which handles this scenario nicely and is afaik the only publicly available one which does so (e.g. Excel flat out refuses to compute it): https://www.gigacalculator.com/calculators/geometric-mean-calculator.php?numbers=50%25+-33%25
I'll have to think if this changes anything with your argument made using the example or if it is just a poor example.
Posted by: Georgi | 04/13/2020 at 04:11 AM
P.S. Apologies for using your family name Yuval, I just got so used to typing 'Tylor' in front of the many posts I've saved for future reference :-)
Posted by: Georgi | 04/13/2020 at 04:12 AM
Georgi: your math is a bit off. The arithmetic mean of 50% and -33% is 8.5% and the geometric mean is 0% ((1.5 x 0.67)^0.5 - 1). Excel does calculate geometric mean with the GEOMEAN command.
Posted by: Yuval Taylor | 04/13/2020 at 01:30 PM
Definitely not off. $100 x 50% x -33% = $150 x -33% = $100.5. The arithmetic mean does not apply to ratios, it is the geometric mean that you need to use in any such case.
Posted by: Georgi | 04/13/2020 at 09:43 PM
Definitely not off. 50% growth followed by a 33% loss is $100 x 150% x -33% = $150 x -33% = $100.5. The arithmetic mean does not apply to ratios, it is the geometric mean that you need to use in any such case.
Posted by: Georgi | 04/13/2020 at 09:45 PM
And here is proof that Excel (latest version part of Office 365) doesn't compute geomean with negative values, percentages or proportions: https://imgur.com/a/RRXrOCS
Posted by: Georgi | 04/13/2020 at 09:49 PM
FYI using the geomean calculation you correctly provided (((1.5 x 0.67)^0.5 - 1)) results in 0.25%, not 0%: https://imgur.com/a/6wL1O9u . Sorry for the multiple comments, but there is no "edit" feature for these comments on my side.
Posted by: Georgi | 04/13/2020 at 10:17 PM
Final note. I had to come back to this since initially looking at the formula you provided I thought it is the correct formula as given in multiple sources and should usually do the job. Comparing your result of 0.250% and mine of 0.249688278871% I thought the difference, though large, was just due to a rounding error. But then I used your result to check the return and with 0.250% it actually comes out to $100.500625 (($100 x 100.25%) x 100.25%) which is a definite overestimation of the true value of $100.50. I had to see why that is and checked my code and I actually use the log>exp solution I explain in the second paragraph of "How to calculate Geometric mean?" in my calculator. It seems at least in some cases log>exp leads to more accurate results. Using my outcome you get $100.4999999999998 where the difference to $100.50 is indeed minuscule and due to errors inherent in floating number calculations, however it is many times smaller than the 0.000625 discrepancy from the canonical formula you provided. Excel seems to also be using the log>exp solution as it confirms the result from my geomean calculator: https://imgur.com/a/jIveRFA
Posted by: Georgi | 04/14/2020 at 10:06 AM
Wow. This is a lot of commenting on the fact that I wrote "33%" instead of "33.333333333%." I assumed that most people would intuit what I meant by a 33% loss negating a 50% gain.
Posted by: Yuval Taylor | 04/14/2020 at 01:48 PM
It is primarily about the incorrect average being used - the arithmetic one, rather than the geometric one. Kind of spiraled out of control, I agree :-)
Posted by: Georgi | 04/14/2020 at 10:31 PM
Excellent as usual!
Did your mutual fund data have a survivorship bias? That might explain why the single best predictor of future performance was low one-year returns.
Posted by: Chaim | 02/23/2022 at 10:34 AM