Ockham’s Spatula


This post was originally published at Towards Data Science

The science and the art of model deployment.

Model building is like climbing a mountain. It’s what you spend so much time planning for. It’s what everybody wants to talk about. It’s what gives you that euphoric feeling of accomplishment when you’re finished. But just as mountain climbers have to descend, model builders have to deploy. You have to put your model in a form that will be usable (and palatable) to users.

Sometimes things don’t work out the way you hoped.

I had a client, a very skilled engineer, who wanted a model to predict how many workers he would need to hire during the year. His organization produced three lines of products, most of which were customized for individual customers. A few years earlier, he had gone to great effort and expense to develop a model to predict his manpower needs. He collected data on how many of each type of product he had produced over the past five years and, from that data, had his managers estimate how long it took to make each product and complete the most common customizations. Then he had his sales force estimate the number of orders they expected the following year. He reasoned that multiplying the time it took to produce each product by the number of expected orders, and then adding up the results, would give him the number of man-hours he would need. It was a classic bottom-up modeling approach.
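The arithmetic behind a bottom-up model like this one is simple enough to sketch in a few lines. What follows is only an illustration of the idea, not the client's actual model: the product names, hours per unit, order counts, and productive hours per worker are all made-up assumptions.

```python
# Hypothetical figures for illustration only; none come from the client's data.
HOURS_PER_UNIT = {"product_a": 120.0, "product_b": 80.0, "product_c": 200.0}
EXPECTED_ORDERS = {"product_a": 50, "product_b": 100, "product_c": 20}

# Assumed productive hours per worker per year, after leave and admin time.
PRODUCTIVE_HOURS_PER_WORKER = 1800.0


def bottom_up_hours(hours_per_unit, expected_orders):
    """Sum (hours to make one unit) x (forecast orders) over all products."""
    return sum(hours_per_unit[p] * expected_orders[p] for p in expected_orders)


def workers_needed(total_hours, hours_per_worker=PRODUCTIVE_HOURS_PER_WORKER):
    """Convert total man-hours into an equivalent head count."""
    return total_hours / hours_per_worker


total = bottom_up_hours(HOURS_PER_UNIT, EXPECTED_ORDERS)
print("man-hours:", total)
print("workers:", round(workers_needed(total), 1))
```

Every number fed into this calculation is somebody's estimate, so the output can be no better than the least reliable forecast that goes into it.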

But … It Didn’t Work

The model had a problem, though. It didn’t work. Even after tinkering with the manufacturing times and correcting for employee leave, administrative functions, and inefficiency, the model still wasn’t very accurate. Moreover, it took his administrative assistant several weeks each year to collect the projected sales data to input into the model. Some of the sales force estimated more sales than they expected to try to impress the boss. Others estimated fewer sales so that they would have a better chance of making whatever goal might be given to them. A few avoided giving the administrative assistant any forecasts at all, so she just used numbers from the previous year.

Using a statistical modeling approach, I found that his historical staffing was highly correlated with just one factor: the number of units of one of the products he had produced in the prior year. It made sense to me. His historical staffing levels were appropriate because he had hired staff as he needed them, albeit somewhat after his backlog reached a crisis. His business had also been growing at a fairly steady rate. So long as conditions in his market did not change, predicting future staffing needs was straightforward. He didn’t need to rely on projections from his psychologically fragile sales force.
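A one-predictor model of this kind can be sketched as an ordinary least-squares line that relates each year's staffing to the prior year's unit count. This is a minimal sketch under my own assumptions, not the model I actually delivered: the numbers are invented and deliberately tidy, and the helper names fit_line and predict are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Apply the fitted line to a new predictor value."""
    return intercept + slope * x_new


# Hypothetical five-year history (made-up numbers, not the client's data).
units = [100, 110, 120, 130, 140]  # units of one product line, years 1 to 5
staff = [38, 40, 43, 46, 49]       # staffing level, years 1 to 5

# Pair each year's staffing with the PRIOR year's units (a one-year lag).
lagged_units = units[:-1]  # years 1 to 4
later_staff = staff[1:]    # years 2 to 5

slope, intercept = fit_line(lagged_units, later_staff)

# Forecast year 6 staffing from the year 5 unit count.
forecast = predict(slope, intercept, units[-1])
print(round(slope, 3), round(intercept, 3), round(forecast, 1))
```

The only input needed for next year's forecast is a number the organization already has, which is what made the approach so much cheaper to run than the bottom-up model.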

Failure Was Not In the Numbers

But my model proved to be quite unsettling to many. The manager of the product line that was used as the basis of the model claimed the model proved his division merited a greater share of the organization’s resources, and bigger bonuses for him and his staff. Managers of the two product lines that were not included in the model claimed the model was too simplistic because it ignored their contributions.

At that point, the client had a complex model that he liked but that didn’t work and a simple model that worked but that nobody liked. He probably would have continued to use the complex model if it hadn’t taken so much work to gather the input data. Valid or not, the simple model had no credibility with his managers. He could calculate a forecast with the model but was reluctant to favor the model over the intuitions of the managers. So given his two flawed alternatives, the client decided to move manpower forecasting to the back burner until the next crisis would again bring it to a boil.

Lessons Learned

I wish I could say that this was an isolated case, but it’s more of a rule than an exception, especially with technically oriented clients who are most comfortable working from the bottom details up to the prediction. Domain expertise is necessary to any modeling effort. It’s needed to guide the selection of model inputs and outputs. But it shouldn’t dictate how a model is developed. That is the realm of the model-builder.

I once developed a model for a client to predict the relative risks associated with real estate that they managed. The managers wanted a quick-and-dirty way to set priorities for conducting more thorough risk evaluations of the properties. I based my model on information that would be readily available to the client. They could evaluate a property for a few hundred dollars and decide in a day or two whether further evaluation was needed immediately or whether it could be deferred.

Enter the Experts

When the model-development project was done, the model was turned over to the operations group for implementation. The first thing the operations manager did was invite experts he worked with to refine the model. Very quickly, the refinements became expansions. The model went from quick-and-dirty to comprehensive-and-protracted. It cost the operations group an average of $50,000 (this was in 1980, so it would be about $168,000 today) and took over six months to evaluate each property. The priorities set by the refined model were virtually identical to the priorities set by the quick-and-dirty model.
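One way to check whether two models set "virtually identical" priorities is to rank the properties under each model and compute Spearman's rank correlation between the two rankings. The sketch below assumes five hypothetical properties with invented scores and no ties; it is not the comparison that was run on the project.

```python
def ranks(scores):
    """Return ranks where 1 is the highest score. Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(a, b):
    """Spearman rank correlation for two score lists without ties."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical risk scores for the same five properties (made-up numbers).
quick_scores = [0.9, 0.4, 0.7, 0.2, 0.5]  # quick-and-dirty model
refined_scores = [88, 45, 71, 20, 41]     # refined model, different scale

print(ranks(quick_scores))
print(ranks(refined_scores))
print(spearman_rho(quick_scores, refined_scores))
```

A value close to 1 means the two models would send evaluators to the properties in nearly the same order, whatever the difference in the effort needed to produce the scores.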

Was one of these models good and the other bad? Not exactly. There’s an important distinction to be made. Scientists, engineers, and many other professionals are taught that, all else being equal, simple is best. It’s Ockham’s razor. A simple model that predicts the same answers as a more complicated model should be considered to be better. It’s more efficient. But sometimes you, as the modeler, have to be more flexible.

Lessons Learned

The operations manager wasn’t comfortable with a simple model. He needed to be confident in the results, which, for him, required adding every theoretical possibility his experts could think of. He didn’t want to ignore any sources of risk, even if they were rare or unlikely. That made for a very inefficient model, but if you don’t have confidence in a model and don’t use it, it’s not the tool you need.

Here’s a side note. The project described was one of the first big modeling efforts I ran at the beginning of my career. Almost forty years later, the exact same thing happened in the last big modeling effort I ran in my career. In 1980, I went apoplectic. The client changed the rules. What was he doing? Yes, I was young and stupid. In 2019, I managed to stay with the model (which I couldn’t in 1980). I walked the client through every change he and his committee of experts thought would help (I had an all-possible-regressions software option that I didn’t have in 1980), and showed him how the changes actually made the model performance worse. Of course, he won in the end. He knew I was about to retire, so he waited me out. That’s fine, he’s the one who needs to live with it. I only need to write about it.

These cases illustrate how there’s more to modeling than just the technical details. There are also artistic and psychological aspects to be mastered. Textbooks describe methods to find the best model components but not necessarily the ones that will work for the model’s users. Sometimes you have to be flexible. Think of Ockham’s Razor as more of a spatula than a cleaver.

Models are like sausages: they look good on the outside, but you don’t know what might be on the inside.

Like sausages, models need to look good on the outside, especially if there are things on the inside that might make most users choke. You have to package the model. First, it can’t look so intimidating that users break out in a sweat when they see it. Leave the equations to the technical reviewers; hide them from the naive users. USDA inspectors have to look inside sausages, but you don’t. Second, put the model in a form that can be used easily. That inch-thick report may be great documentation, but it’ll garner more dust than users. If your users are familiar with Excel, program the model as a spreadsheet. If you know a computer language, put the model in a standalone application. A model is only as successful as the use to which it is put.
