I recently finished a fascinating EconTalk podcast with Joshua Angrist dealing with different methods in econometrics and how slowly we have gained knowledge in the field over time. The podcast had me reflecting on four of my statistics-related experiences. All are examples of hubris or folly (primarily mine) in the use of statistics.
Disclaimer: My work is not this disgraceful all the time. I label these "learning experiences" for a reason.
1) Econometrics Competition - This might surprise most of the people who know me, but I once won an econometrics competition. Each candidate needed to submit a model, the theoretical reasoning to justify it, and their data. The product we were dealing with was somewhat of a signalling/luxury good, so my model included an exponential component as well as an attempt at instrumental variables. I somehow won. However, when the model was actually applied later in the real world, it was simply mediocre and no better than many of the other submissions. Getting the "good job" sticker sadly did not enhance my ability to predict the future.
2) Predicting tax revenue - In college I worked on a predictive tax revenue model for a nearby municipality. The municipality needed to set its budget in August of the preceding year, before it knew what tax revenue, primarily sales tax revenue, would be for the next year. My team and I were able to create a very strong predictive model. There was just one problem. Our largest prediction errors fell in quarters of the final two years, 2001 and 2002. There was this invention called the internet, and online sales started to heavily impact municipal sales tax revenue. The model still performed decently, but the world is not a statistical distribution with 11 fixed primary variables that were true for the last 15 years and will be true for the next 15 years. The world is a complex place, and what makes it complex is not just the randomness and noise, but also the unprecedented. This same lesson applied to the airball in AAA-rated mortgage-backed securities five years later.
3) Inverted demand curves - Another project I once worked on dealt with perishable goods whose prices were actively changed. I was assigned to come up with a type of elasticity and competitive response framework, but I was only provided the company's pricing data. After about 20 hours of fiddling, I discovered that the demand curve was inverted. The more the price was raised, the more quantity was sold. I had discovered the ever elusive Giffen good! I wasn't quite that naïve, but I didn't take the time to structure my thoughts and the request before diving in. Then the realization came, "Of course. We raise prices in anticipation of higher volume." This wasn't randomized data that they had created for this test. They just wanted after-the-fact justifications. Also, without competitive data and other critical pieces of information that drive sales, even a randomized experiment would likely have led to incorrect results, as the pricing effects would likely have been overwhelmed by the noise of holidays, competitive price changes, weather, advertising, etc. I wasted time with a comically obvious error because I was "thinking fast" before I was "thinking slow." Just because one's task is thinking doesn't mean one is being thoughtful.
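For anyone who wants to see how a perfectly ordinary good can look like a Giffen good, here is a minimal simulation sketch in Python. Everything in it is made up for illustration: the function names (`simulate`, `ols_slope`, `residualize`), the true slope of -2, and the rule that the pricing team nudges prices up when it expects a busy period. None of it comes from the actual project data.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def residualize(y, x):
    """Return the part of y left over after a linear fit on x."""
    slope = ols_slope(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return [b - (mean_y + slope * (a - mean_x)) for a, b in zip(x, y)]


def simulate(n=5000, true_slope=-2.0, seed=0):
    """Hypothetical market: demand truly falls with price, but prices
    are raised when a demand shock (holiday, weather) is anticipated."""
    rng = random.Random(seed)
    prices, quantities, shocks = [], [], []
    for _ in range(n):
        shock = rng.gauss(0, 10)                       # anticipated demand
        price = 10 + 0.3 * shock + rng.gauss(0, 0.5)   # price reacts to it
        quantity = 100 + true_slope * price + shock + rng.gauss(0, 2)
        prices.append(price)
        quantities.append(quantity)
        shocks.append(shock)
    return prices, quantities, shocks


prices, quantities, shocks = simulate()

# Naive regression of quantity on price: the slope comes out positive.
naive = ols_slope(prices, quantities)

# Strip the anticipated demand out of both series first, then regress
# what is left. This recovers a slope close to the true negative value.
controlled = ols_slope(residualize(prices, shocks),
                       residualize(quantities, shocks))

print(f"naive slope: {naive:.2f}, controlled slope: {controlled:.2f}")
```

The catch, of course, is that the second regression needs the demand shock as a column in the data. With only the company's pricing data, that column does not exist, which is roughly the position I was in.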
4) Willful Ignorance - I was performing a project for a company that was about to be acquired. During this work I discovered that the revenue from the existing customer base was on a downward trajectory and the rate of new customer acquisition was obviously slowing (very negative second derivative) in every existing market in which the company participated. The company was making up for this revenue by periodically adding a few medium-sized markets to the mix, but with every round these new markets were less and less favorable for the company. I backed this analysis with some statistical tools and presented it to some members of senior management. I was immediately shut down with objections like, "well, there's no seasonality control here" and others that did not hold any water whatsoever. Shortly thereafter I was reassigned and then pushed out. It wasn't until afterward that I realized the obvious, "They were selling the company, you nincompoop. Of course they don't want to provide ammunition to their buyers."
Statistics is a tool. Outside of a laboratory, and even sometimes within a laboratory, it's a very imperfect tool, and sometimes an irrelevant tool. The future is complex and filled with new challenges and people with their own agendas. If you don't stop and look around once in a while, you could miss it.