(The following is Part 2 of a series on how the strategic enrollment management industry is putting higher education at risk. Be sure to also read the informative Series Introduction as well as Part 1, Treating Adolescent Decision-Making as Linear.)
Not Adequately Testing Models in Real World Scenarios
Simply put, prediction is a method for using known information about the past to estimate a similar outcome in an unknown future. In practice, this involves fitting a mathematical equation to a set of historical data, and then applying that same equation to data for which the outcome is still unknown.
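As a toy illustration of that fit-then-apply idea (not the authors' actual method, and with invented enrollment numbers), the "mathematical equation" could be as simple as a least-squares line fit to past cycles and then evaluated at a future cycle:

```python
# Toy sketch: fit a line to hypothetical historical data, then apply
# the same equation to an input whose outcome is still unknown.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

cycles   = [2015, 2016, 2017]    # known past
enrolled = [1000, 1050, 1100]    # invented counts, for illustration only
slope, intercept = fit_line(cycles, enrolled)

# Apply the fitted equation to a future cycle whose outcome is unknown.
prediction_2018 = slope * 2018 + intercept   # -> 1150.0
```

Real enrollment models use many more variables than a single trend line, but the two-step shape is the same: estimate the equation on the past, evaluate it on the future.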
The challenge, of course, is that the outcome in the future is likely to be similar, but not identical, to the outcomes of the past. Therefore, it is necessary to build a model that is closely aligned with the historical data but not too closely aligned. This process is called “specifying” or “fitting” a model, and a clothing metaphor is apt.
A model that is “underfit” to your data is like a baggy sweatshirt; its fit is too general to show off a great body. A model that is “overfit” is like a neoprene wetsuit, leaving very little room for adjustment (and less to the imagination). The former will not take full advantage of the predictive power of the data, and the latter will take so much advantage that it falls apart when applied to data it hasn’t seen before.
So how do we solve this problem? Glad you asked.
In the world of enrollment prediction, the body of historical data is called the training set, because the modeler uses it to “train” the model on what to expect when it comes time to make predictions on new data. But how do you choose how much data to use for model specification, and how do you know how well the model performs?
The method to which we are partial starts with no less than three years of historical data, which is then subdivided into a training set and a testing set. The testing set is considered a “holdout” because it is completely firewalled from the data used to build the model. Once trained, the model gets a dry run on the holdout set, which gives us a clear measure of its accuracy and helps us make the model even better.
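A minimal sketch of that split and dry run might look like the following. The records, field names, and the trivial “base rate” model are all invented for illustration; the real approach involves far richer data and models:

```python
# Sketch: firewall the most recent cycle as a holdout set.
# Records and field names here are invented for illustration.
records = (
    [{"cycle": 2015, "enrolled": i % 3 == 0} for i in range(100)]
    + [{"cycle": 2016, "enrolled": i % 3 == 0} for i in range(100)]
    + [{"cycle": 2017, "enrolled": i % 4 == 0} for i in range(100)]
)

def chronological_split(records, holdout_cycle):
    """Train on every cycle before the holdout; never let them mix."""
    train   = [r for r in records if r["cycle"] < holdout_cycle]
    holdout = [r for r in records if r["cycle"] == holdout_cycle]
    return train, holdout

train, holdout = chronological_split(records, 2017)

# "Train" a deliberately trivial model: predict the training base rate.
base_rate = sum(r["enrolled"] for r in train) / len(train)

# Dry run: compare the prediction against the firewalled holdout cycle.
holdout_rate = sum(r["enrolled"] for r in holdout) / len(holdout)
error = abs(base_rate - holdout_rate)
```

The key design choice is that the split is chronological: the holdout cycle plays the role of the unknown future, so the measured error resembles what the model will face on real new data.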
Here again, the nerd battle rages over the ideal way to build a more accurate model. Traditional causal inference prediction methods often utilize a few years of data but use the entire historical training set to build the model. Any testing that is conducted is done through a method called resampling, in which data are randomly selected from the training set itself and tested against the model. The problem is that resampling tends to overstate how well the model will perform in the future.
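The contrast can be sketched in a few lines. With resampling, the test rows are drawn from the same cycles the model trained on; with an out-of-time holdout, they never are. The record layout below is invented for illustration:

```python
import random

# Hypothetical applicant records spread across three admissions cycles.
records = [{"cycle": c, "id": i} for c in (2015, 2016, 2017) for i in range(100)]

# Resampling: randomly split the pool, ignoring time. The test rows end
# up coming from the very same cycles the model was trained on.
random.seed(0)
shuffled = records[:]
random.shuffle(shuffled)
test, train = shuffled[:60], shuffled[60:]
shared_cycles = {r["cycle"] for r in test} & {r["cycle"] for r in train}  # non-empty

# Out-of-time holdout: the test cycle never overlaps the training
# cycles, mimicking a genuine "game-day" prediction.
train_oot = [r for r in records if r["cycle"] < 2017]
holdout   = [r for r in records if r["cycle"] == 2017]
overlap = {r["cycle"] for r in holdout} & {r["cycle"] for r in train_oot}  # empty
```

Because resampled test rows share cycles with the training rows, accuracy measured that way reflects conditions the model has already seen, which is exactly why it tends to flatter the model.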
Traditional causal inference models that currently dominate the enrollment prediction space are simply inadequate at generating the most accurate models, primarily because they are not tested in game-day scenarios. Their predictions are assessed against the same data used to construct them, which sets high expectations and leads to underperformance against those expectations.
Tomorrow, in Part 3 of our blog series on How Enrollment Predictions Are Driving Colleges Out Of Business, we will discuss the futility of “Trying to Forecast October’s Weather on January 1.” We predict you’ll find it interesting.
By Thom Golden, Ph.D., Vice President of Data Science; Brad Weiner, Ph.D., Director of Data Science; and Pete Barwis, Ph.D., Senior Data Scientist, Capture Higher Ed