According to our Web statistics, my post on Angel Investing Returns was pretty popular, so I thought I'd dive a little deeper into the process of extracting information from this data set. At the end of the last post, I hinted that there might be some value in, "...analyzing subsets of the AIPP data..." Why would you want to do this? To test hypotheses about angel investing.
Now, you must be careful here. You should always construct your hypotheses before looking at the data. Otherwise, it's hard to know if this particular data is confirming your hypothesis or if you molded your hypothesis to fit this particular data. You already have the challenge of assuming that past results will predict future results. Don't add to this burden by opening yourself to charges of "data mining".
I can go ahead and play with this data all I want. I already used it to "backtest" RSCM's investment strategy. We developed it by reading research papers, analyzing other data sources, and running investment simulations. When we found the AIPP download page, it was like Christmas: a chance to test our model against new data. So I already took my shot. But if you're thinking about using the AIPP data in a serious way, you might want to stop reading unless you've written your hypotheses down already. As they say, "Spoiler alert."
But if you're just curious, you might find my three example hypothesis tests interesting. They're all based loosely on questions that arose while doing research for RSCM.
Hypothesis 1: Follow On Investments Don't Improve Returns
It's an article of faith in the angel and VC community that you should "double down on your winners" by making follow-on investments in companies that are doing well. However, basic portfolio and game theory made me skeptical. If early stage companies are riskier, they should have higher returns. Investing in later stages just mixes higher returns with lower returns, reducing the average. Now, some people think they have inside information that allows them to make better follow-on decisions and outperform the later stage average. Of course, other investors know this too. So if you follow on in some companies but not others, they will take it as a signal that the others are losers. I don't think an active angel investor could sustain much of an advantage for long.
But let's see what the AIPP data says. I took the Excel file from my last post and simply blanked out all the records with any follow-on investment entries. The resulting file with 330 records is here. The IRR was 62%, the payout multiple was 3.2x, and the hold time was 3.4 years. That's a huge edge over the full sample's 30% and 2.4x!
Now, let's not get too excited here. There's a difference between deals where there was no follow-on and deals where an investor was using a no-follow-on strategy. We don't know why an AIPP deal didn't have any follow-on. It could be that the company was so successful it didn't need more money. Of course, the fact that this screen still yields 330 out of 452 records argues somewhat against a very specific sample bias, but there could easily be more subtle issues.
Given the magnitude of the difference, I do think we can safely say that the conventional wisdom doesn't hold up. You don't need to do follow-on. However, without data on investor strategies, there's still some room for interpretation on whether a no-follow-on strategy actually improves returns.
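For readers who would rather script this kind of screen than blank out spreadsheet rows, here is a minimal Python sketch. The records and the field names (`invested`, `follow_on`, `returned`) are hypothetical, not the actual AIPP columns or values. The multiple is computed as total cash returned over total cash invested, which is an assumption about the method rather than a description of the spreadsheet.

```python
# Hypothetical records: field names and values are illustrative, not AIPP data.
records = [
    {"invested": 50_000, "follow_on": 0, "returned": 400_000},
    {"invested": 100_000, "follow_on": 50_000, "returned": 150_000},
    {"invested": 25_000, "follow_on": 0, "returned": 0},
    {"invested": 200_000, "follow_on": 100_000, "returned": 900_000},
]


def payout_multiple(rows):
    """Pooled multiple: total cash returned divided by total cash invested."""
    invested = sum(r["invested"] + r["follow_on"] for r in rows)
    returned = sum(r["returned"] for r in rows)
    return returned / invested


# The screen: keep only records with no follow-on money at all.
no_follow_on = [r for r in records if r["follow_on"] == 0]

print(len(no_follow_on), "of", len(records), "records have no follow-on")
print(f"all records:  {payout_multiple(records):.2f}x")
print(f"no follow-on: {payout_multiple(no_follow_on):.2f}x")
```

Note that the screen selects deals that happened to have no follow-on. It cannot tell you whether the investor was deliberately following a no-follow-on strategy, which is the caveat raised above.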
Hypothesis 2: Small Investments Have Better Returns than Large Ones
Another common VC mantra is that you should "put a lot of money to work" in each investment. To me, this strategy seems more like a way to reduce transaction costs than improve outcomes, which is fine, but the distinction is important. Smaller investments probably occur earlier so they should be higher risk and thus higher return. Also, if everyone is trying to get into the larger deals, smaller investments may be less competitive and thus offer greater returns.
I chose $300K as the dividing line between small and large investments, primarily because that was our original forecast of average investment for RSCM (BTW, we have revised this estimate downward based on recent trends in startup costs and valuations). The Excel file with 399 records of "small" investments is here. The IRR was 39% and the payout multiple was 4.0x. Again, a huge edge over the entire sample! Interestingly, less of an edge in IRR but more of an edge in multiple than the no-follow-on test. But smaller investments may take longer to pay out if they are also earlier. IRR really penalizes hold time.
Interesting side note. When I backtested the RSCM strategy, I keyed on investment "stage" as the indicator of risky early investments. Seeing as how this was the stated definition of "stage", I thought I was safe. Unfortunately, it turned out that almost 60% of the records had no entry for "stage". Also, many of the records that did have entries were strange. A set of 2002 "seed" investments in one software company for over $2.5M? A 2003 "late growth" investment in a software company of only $50K? My guess is that the definition wasn't clear enough to investors filling out the survey. But I had committed to my hypothesis already and went ahead with the backtest as specified. Oh well, live and learn.
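To see how strongly hold time drags on IRR, consider the simplest possible case: one check in, one payout, nothing in between. This is only a sketch under that assumption. A real portfolio IRR depends on the timing of every cash flow, so the function below will not reproduce the figures quoted above, and the hold times are made up for illustration.

```python
def annualized_return(multiple, years):
    """Annualized return for a single cash-in, single cash-out investment."""
    return multiple ** (1.0 / years) - 1.0


# Same 4.0x payout, three hypothetical hold times.
for years in (2, 4, 8):
    rate = annualized_return(4.0, years)
    print(f"4.0x over {years} years -> {rate:.0%} per year")
```

The same multiple looks far less impressive on an annualized basis as the hold time stretches out, which is one way a subset can beat another on multiple while trailing it on IRR.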
Hypothesis 3: Post-Crash Returns Are No Different than Pre-Crash Returns
As you probably remember, there was a bit of a bubble in technology startups that popped at the beginning of 2001. You might think this bubble would make angel investments from 2001 on worse. However, my guess was that returns wouldn't break that cleanly. Sure, many 1998 and some 1999 investments might have done very well. But other 1999 and most 2000 investments probably got caught in the crash. Conversely, if you invested in 2001 and 2002 when everybody else was hunkered down, you could have picked up some real bargains.
The Excel file with 168 records of investments from 2001 and later is here. 23% IRR and 1.7x payout multiple. Ouch! Was I finally wrong? Maybe. Maybe not. The first problem is that there are only 168 records. The sample may be too small. But I think the real issue is that the dataset "cut off" many of the successful post-bubble investments because it ends in 2007.
To test this explanation, I examined the original AIPP data file. I filtered it to include only investment records that had an investment date and where time didn't run backwards. That file is here. It contains 304 records of investments before 2001 and 344 records of investments in 2001 or later. My sample of exited investments contains 284 records from before 2001 and 168 records from 2001 or later. So 93% of the earlier investments have corresponding exit records and 49% of the later ones do. Note that the AIPP data includes bankruptcies as exits.
So I think we have an explanation. About half of the later investments hadn't run their course yet. Because successes take longer than failures, this sample over-represents failures. I wish I had thought of that before I ran the test! But it would be disingenuous not to publish the results now.
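The coverage arithmetic and the cutoff effect can both be sketched in a few lines of Python. The first part uses only the counts quoted above. The second part is a toy cohort with made-up exit times and multiples, included purely to illustrate how a cutoff year can hide slow-maturing winners. It is not derived from the AIPP file.

```python
# Part 1: exit coverage, using the counts quoted in the text.
invested = {"before 2001": 304, "2001 or later": 344}
exited = {"before 2001": 284, "2001 or later": 168}

coverage = {period: exited[period] / invested[period] for period in invested}
for period, share in coverage.items():
    print(f"{period}: {exited[period]} of {invested[period]} exited ({share:.0%})")

# Part 2: toy illustration of the cutoff effect (hypothetical numbers).
CUTOFF_YEAR = 2007  # the text says the dataset ends in 2007

# Each tuple: (investment year, years until exit, payout multiple).
# Failures exit quickly; the two big winners take six or seven years.
cohort = [
    (2002, 2, 0.0), (2002, 3, 0.5), (2002, 7, 6.0),
    (2003, 2, 0.0), (2003, 6, 8.0), (2004, 3, 0.2),
]


def mean_multiple(rows):
    """Equal-weighted average payout multiple."""
    return sum(multiple for _, _, multiple in rows) / len(rows)


# Only exits that happen on or before the cutoff are visible in the sample.
observed = [row for row in cohort if row[0] + row[1] <= CUTOFF_YEAR]

print(f"observed exits: {len(observed)} of {len(cohort)}")
print(f"mean multiple, observed exits only: {mean_multiple(observed):.2f}x")
print(f"mean multiple, full cohort:         {mean_multiple(cohort):.2f}x")
```

In the toy cohort, every exit visible by the cutoff is a failure or near-failure, so the observed average badly understates the cohort's eventual result. Whether the real post-2001 investments behaved this way is exactly what the data set cannot yet tell us.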
Conclusion
So I think we've answered some interesting questions about angel investing. More important, the process demonstrates why we need to collect much more data in this area. According to the Center for Venture Research, there are about 50K angel investments per year in the US. The AIPP data set has under 500 exited investments covering a decades-long span. We could do much more hypothesis testing, with several iterations of refinements, if we had a larger sample.