Consider the well-known "Super Bowl" indicator: Folk wisdom says that in years when the Super Bowl winner is a team from the original National Football League (now the National Football Conference), the market will have an up year. In fact, this indicator has been correct roughly 80 percent of the time--far outpacing any track record put up by Wall Street's brightest. Yet, despite the strong correlation between the Super Bowl winner and that year's market direction, no sane financial advisor would put client money at risk based solely on the results of the game. Why? Because it is a tough case to make that the outcome of the Super Bowl caused the market's performance in the ensuing year.
While the distinction between cause and correlation is pretty clear in the Super Bowl indicator, the line gets fuzzier when we consider some of the strategies being employed by exchange-traded funds. Note that word: strategies. Yes, we all know ETFs are based on underlying indices; however, many of these are actually investment strategies masquerading as indices.
To be sure, ETFs based on long-standing indices exist. The largest ETF in terms of assets, the SPDR (with some $66 billion in assets), mimics the venerable S&P 500 market-cap-weighted index. However, many ETFs--including a large portion of the 156 ETFs brought to market in 2006--follow "indices" that bear little resemblance to a traditional index. And many of these employ distinct strategies for both component selection and weighting.
For example, the Claymore/Clear Spin-Off ETF (CSD) tracks the Clear Spin-off Index. This index, according to the prospectus, is designed to "actively represent the stock of a group of companies that have recently been spun-off from larger corporations and have the opportunity to better focus on their core market segment and outperform, on a risk-adjusted basis, the Russell Mid-Cap Growth Index and other mid-cap-oriented benchmark indices." The index uses "multi-factor proprietary selection rules to seek to identify those stocks that offer the greatest potential from a risk/return perspective. . . . The 40 highest-ranking stocks are chosen (based on this methodology) and given a modified market cap weighting with a maximum weight of 5 percent."
Clearly, this is not your father's index. Indeed, at the end of the day you don't even really know how the components are selected, other than via some quant model whose description, at best, is rather murky.
Now, just because an ETF is based on a strategy does not make it a bad holding. And the Claymore/Clear Spin-Off ETF may fill a nice niche in a client portfolio. But make no mistake: An investment in this ETF is not, in the traditional sense, an investment in an index. It is an investment in a strategy.
But how do ETF providers sell these strategies? A big marketing tool for ETFs is their "back-tested" track record. Back-testing is the process by which a particular investment methodology is applied to the past to determine if the methodology would have produced superior historical returns. For example, the appeal of the Super Bowl indicator is that when back-tested over the last 30-plus years, it shows an uncanny ability to predict the market.
Of course, the back-testing methodologies employed by exchange-traded funds are more sophisticated than the Super Bowl indicator. ETF providers employ a variety of back-testing devices. One important device is data mining: the process of going through reams of historical data to find metrics and investment strategies that correlate with superior returns. Rest assured that, in coming up with their investment methodology, the creators of the Claymore/Clear Spin-Off ETF pored over tons of data and looked at the back-tested returns of many possible methodologies before finalizing the component selection and weighting process.
But just as with the Super Bowl indicator, advisors should always question whether historical results generated via back tests represent merely correlations or sustainable alpha generation.
In general, there is no problem with using back-tested strategies. (Full disclosure: My firm, Horizon Investment Services, back tests a number of strategies when producing new products.) Many back-tested methodologies have put up solid real-time numbers. However, with the proliferation of ETFs and the use of back-tested results to help differentiate ETFs in the marketplace, advisors need to approach an ETF's back-tested results with a skeptical eye. Here are some questions to consider when evaluating the back-tested returns of ETF indices:
Does the back test cover a meaningful time period? Go to any ETF-sponsored Web site, and you'll find the historical performance of the underlying index for a particular ETF. You might see historical performance going back to 1992 or even earlier. Of course, since the first ETF wasn't even born until 1993 (and more than 40 percent of the 359 ETFs in existence at the end of 2006 have not yet seen their second birthdays), what you are viewing is not an actual "live ammo" performance of the ETF index, but simply historical, back-tested performance. It is important to consider the period of the back test. Does it cover different market cycles? Does it cover market periods when different investment styles (growth versus value, large cap versus small cap) were in or out of favor? All things equal, a longer back-test period is better than a shorter one. Surprisingly, as the table above shows, several ETF indices, including a number of niche ETFs brought out in recent years, have historical back tests of six years or less.
Does the back test avoid "survivor bias"? Remember that back testing takes a methodology and applies it to the past to see how it would have performed. In order to maintain the integrity of the back test, it's essential that the historical universe you are testing was, in fact, the actual universe that existed during that time. Let's say you are applying your methodology to the year 1995. You want to make sure the universe of stocks that you are using in your back test is exactly the universe that was available in 1995. Unfortunately, some data sets may not include stocks that either were acquired or went out of business after 1995. Rather, the data set includes only stocks that "survived" since 1995 and applies the methodology only to those (1995) stocks that are still around today. Of course, this survivor bias can skew historical back-tested results dramatically. If you have questions about whether the back-tested results of a particular ETF might have survivor bias, call the ETF sponsor and ask how the back test was performed.
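To make the effect of survivor bias concrete, here is a minimal Python sketch built on an invented five-stock universe. The tickers, the returns and the still-listed flag are all hypothetical, chosen only for illustration; they do not come from any real data set or from any ETF sponsor's methodology.

```python
from statistics import mean

# Hypothetical 1995 universe: (ticker, total return since 1995, still listed today?)
# All names and numbers are invented for illustration.
universe_1995 = [
    ("AAA", 1.50, True),
    ("BBB", 0.80, True),
    ("CCC", -1.00, False),  # went out of business: total loss
    ("DDD", 0.20, True),
    ("EEE", -0.60, False),  # delisted after a steep decline
]

def average_return(stocks):
    """Equal-weighted average return of the given stocks."""
    return mean(ret for _, ret, _ in stocks)

# Sound back test: every stock that actually existed in 1995.
full_universe_return = average_return(universe_1995)

# Biased back test: only the stocks that are still around today.
survivors = [s for s in universe_1995 if s[2]]
survivors_only_return = average_return(survivors)

print(f"Full 1995 universe: {full_universe_return:.0%}")
print(f"Survivors only:     {survivors_only_return:.0%}")
```

In this toy universe, dropping the two failed stocks lifts the average return sharply, even though nothing about the selection methodology has changed. The skew comes entirely from the data set.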
Is the back-tested methodology dependent on just one or two metrics or multiple variables? Mark Hulbert of the Hulbert Financial Digest provides a great story about data mining. Hulbert writes that several years ago, David Leinweber, a visiting faculty member at Caltech's economics department, wanted to illustrate the perils of mining data for spurious correlations. Leinweber searched through all the data on a United Nations CD-ROM to find the indicator with the most statistically significant correlation with the S&P 500. His discovery--butter production in Bangladesh. If you see a back-tested methodology for an ETF index--especially for a "specialty" ETF--that is based on just one or two metrics, beware. Good methodologies generally will use multiple variables (at least three) in the selection process. The reason for multiple variables is that you don't want a stock that scores well in just one variable making the final cut.
Ideally, you want stocks that have exemplified broad-based strength by scoring well on several metrics. Also, good methodologies will generally include metrics that work across a variety of style boxes. For example, you don't want to see an index methodology that uses four value metrics or four growth metrics to pick its components. Better to see both growth and value metrics as part of the selection process. Back-tested methodologies should include metrics that generally work well across a variety of stock universes, such as large and small caps, growth and value. Over time, such metrics as price-to-cash-flow and price-earnings ratios tend to work well across a variety of stock universes. Back-testing methodologies that rely on esoteric metrics--like butter production in Bangladesh--have a lower degree of credibility.
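The data-mining trap can be sketched in a few lines of Python. This is not Leinweber's actual procedure or data. It is a simplified stand-in that assumes the "market" and 1,000 candidate "indicators" are all random noise, unrelated by construction, and then searches for the candidate that happens to correlate best with the market over a hypothetical 20-year back test. The seed, the series lengths and the return parameters are arbitrary assumptions.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable
YEARS = 20               # length of the hypothetical back test
CANDIDATES = 1000        # number of unrelated "indicators" searched

# Stand-in for annual market returns: pure random noise.
market = [rng.gauss(0.08, 0.15) for _ in range(YEARS)]

# Each candidate indicator is also pure noise, unrelated to the market.
indicators = [[rng.gauss(0, 1) for _ in range(YEARS)] for _ in range(CANDIDATES)]

correlations = [pearson(ind, market) for ind in indicators]
best = max(correlations, key=abs)  # the "discovery" a data miner would report

print(f"Strongest correlation among {CANDIDATES} random indicators: {best:+.2f}")
```

Search enough candidates and one of them will look impressive by chance alone. That is why a methodology resting on one or two metrics, particularly esoteric ones, deserves extra scrutiny.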
Is the back-tested performance driven by a couple of "outlier" performance years? When looking at historical performance of the index, make sure that its outperformance is not the result of a couple of big years. It is not unusual for ETF providers to use "mountain charts" in their brochures to display historical back-tested performance. Mountain charts are line charts that show the cumulative, compounded growth of an investment over time. The problem with mountain charts is that a couple of good years can have a huge impact on the overall chart. Make sure you look at how the index performed each year in the back test, paying special attention to the number of winning periods versus losing periods relative to the appropriate benchmark, as well as the magnitude of the down years.
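The following Python sketch shows how two big years can dominate a cumulative chart. The eight years of index and benchmark returns are invented for illustration, and the "swap out the two best years" check is just one simple way an advisor might probe the numbers, not a standard industry test.

```python
# Hypothetical annual returns, invented for illustration only.
index_returns     = [0.05, 0.60, 0.04, 0.06, 0.55, 0.03, 0.07, 0.05]
benchmark_returns = [0.08, 0.10, 0.09, 0.07, 0.12, 0.06, 0.09, 0.08]

def growth_of_one(returns):
    """Ending value of 1 unit invested and compounded through each year."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value

# What the end point of a mountain chart shows.
index_growth = growth_of_one(index_returns)
benchmark_growth = growth_of_one(benchmark_returns)

# What the year-by-year view shows: how often the index beat the benchmark.
winning_years = sum(i > b for i, b in zip(index_returns, benchmark_returns))

# Replace the index's two best years with the benchmark's return for those years.
top_two = sorted(range(len(index_returns)), key=lambda k: index_returns[k])[-2:]
adjusted = [benchmark_returns[k] if k in top_two else r
            for k, r in enumerate(index_returns)]
adjusted_growth = growth_of_one(adjusted)

print(f"Index growth of 1:      {index_growth:.2f}")
print(f"Benchmark growth of 1:  {benchmark_growth:.2f}")
print(f"Years index won:        {winning_years} of {len(index_returns)}")
print(f"Index without top two:  {adjusted_growth:.2f}")
```

With these made-up figures, the index finishes well ahead of the benchmark on a cumulative basis yet beats it in only two of eight years, and it falls behind once those two years are neutralized.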
What is the historical volatility of the underlying index? Just because a particular ETF strategy has generated outstanding back-tested returns does not make it an appropriate holding for your clients. You need to consider risk. Good back tests will provide not only back-tested returns but also back-tested volatility, typically expressed as the standard deviation of returns. If you don't see that information on the Web site, call the ETF sponsor.
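As a rough illustration of why returns alone are not enough, the Python sketch below compares two hypothetical strategies with the same average annual return but very different volatility. The return figures are invented, and the sample standard deviation of annual returns is used here as one common volatility measure; a sponsor's materials may define volatility differently.

```python
from statistics import mean, stdev

# Hypothetical back-tested annual returns, invented for illustration.
strategy_a = [0.10, 0.12, 0.08, 0.11, 0.09]    # steady
strategy_b = [0.45, -0.25, 0.40, -0.20, 0.10]  # erratic

for name, returns in (("A", strategy_a), ("B", strategy_b)):
    avg = mean(returns)   # simple (arithmetic) average annual return
    vol = stdev(returns)  # sample standard deviation of annual returns
    print(f"Strategy {name}: average {avg:.1%}, standard deviation {vol:.1%}")
```

Both strategies average the same annual return, but the second gets there with swings that many clients could not stomach.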
Given the infancy of most ETFs, advisors have very little in the way of real-time track records on which to base decisions. Thus, it's only natural that back-tested returns of an underlying index will likely receive greater weight in the decision process. The good news--when it comes to ETFs and back testing--is that much of the work is being done by individuals and organizations that have a high degree of credibility in the investment world. Indeed, academics (such as the Wharton School's Prof. Jeremy Siegel, who recently became part of the WisdomTree family of ETFs) and respected investment practitioners (such as Robert Arnott of Research Affiliates, who has developed a variety of fundamentals-based indexes for PowerShares) now populate the ETF world. This should give advisors a certain degree of confidence that the back-testing is being done properly.
Still, never forget that back-testing, after all, is an attempt to move forward using the rearview mirror. And we all know how dangerous that can be.
Chuck Carlson, CFA, is chief executive officer of Horizon Investment Services and the author of Winning With The Dow's Losers (HarperBusiness). David Wright, CFA, provided research assistance for this article.



