This post was originally published by at Towards Data Science
Starting out as a data scientist, I struggled to understand the value economics brings. Now that I understand that data science is far more than knowing how to code, I’ve been able to identify the value that economists such as myself can bring to data science and machine learning. This article is an effort to help economists explain the value they can bring to machine learning roles, as well as help non-economists in data science understand what an economist can bring to the table.
If you’ve ever taken an economics class, you may have heard economics defined like this:
“the branch of knowledge concerned with the production, consumption, and transfer of wealth.”
If you haven’t, you likely associate economics with labels such as “finance”, “GDP”, and “stock markets”. The field, however, is far broader than these terms let on, which is something I love about economics. It touches history, geography, business, politics, psychology, marketing, and dozens of other subjects. If you want to dive into the possibilities, check out the Freakonomics podcast. Economic analysis often includes consideration of the constraints and probabilities surrounding outcomes and effects.
The breadth may lead you to believe that economists think they know everything (and in some cases, you’d be right); that they have a solution to any problem. I’d suggest though that what they really have is a problem-solving framework. The framework includes cost-benefit analysis, cost and output optimization, impact studies, game theory analysis, and — running through each of these — econometrics. Throughout the article, I’ll refer back to a classic data science problem of predicting home prices (a variant of which is also a problem that economists can solve).
My use of the word “economist” refers broadly to individuals who have completed a graduate degree (Master’s or Doctorate) in some facet of economics, or who have formally worked as economists. In my personal experience, many of the skills I’ll describe below aren’t sufficiently developed at the undergraduate level.
Econometrics is a form of statistical analysis that lets economists infer the marginal effect of some action on an outcome. Here’s an example built on the home price problem mentioned earlier: suppose I’m thinking about adding an extra room to my house. An economist with data from homes in the neighborhood, city, or larger geographic area may be able to determine that, holding all other factors constant (such as location, square footage, etc.), building one more room (this is the ‘marginal’ part) on the house will increase its value by ‘X’ dollars (this is the inference).
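To make the "holding all other factors constant" idea concrete, here is a minimal sketch of a multiple regression in plain Python. Everything in it is an assumption made for illustration: the eight homes, the pricing rule used to generate their prices (a $50,000 base, $100 per square foot, $15,000 per room), and the helper names `solve_linear_system` and `ols`. The prices are generated without noise so that the recovered coefficients can be checked against the rule; real data would be noisy and would call for standard errors as well.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(features, target):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical homes: (square footage, number of rooms).
homes = [(1200, 2), (1500, 3), (1500, 2), (2000, 4),
         (2200, 3), (2600, 5), (1800, 3), (3000, 4)]

# Made-up pricing rule, used only to generate example data.
prices = [50_000 + 100 * sqft + 15_000 * rooms for sqft, rooms in homes]

intercept, per_sqft, per_room = ols(homes, prices)

# per_room is the marginal effect of one more room, holding square footage fixed.
print(f"Estimated value of one extra room: ${per_room:,.0f}")
```

The coefficient on rooms is the ‘X’ from the example above: the estimated change in price from one more room when square footage does not change.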
You might already be imagining situations in which this could be helpful. But if not, here are a couple of examples to get the gears turning:
- If a company has $1000 in advertisement funds, which advertisement channels should be used to maximize value of those funds?
- An insurance company may base its rates on the probability that each client gets into a car accident. What factors contribute most to an increase in that probability (and to what degree)?
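The advertising question can be sketched as a marginal-returns problem. The snippet below is only an illustration under stated assumptions: the three channel names and their square-root response curves are hypothetical, and in practice those curves would have to be estimated from data. It spends the $1000 in $100 increments, giving each increment to whichever channel currently offers the largest additional return.

```python
import math


def allocate_budget(channels, budget, step=100):
    """Greedy allocation: each increment goes to the channel with the largest marginal gain."""
    spend = {name: 0 for name in channels}
    for _ in range(budget // step):
        best = max(
            channels,
            key=lambda name: channels[name](spend[name] + step) - channels[name](spend[name]),
        )
        spend[best] += step
    return spend


# Hypothetical response curves: revenue as a function of dollars spent.
# The square root encodes diminishing returns; the multipliers are made up.
channels = {
    "search": lambda dollars: 40 * math.sqrt(dollars),
    "social": lambda dollars: 30 * math.sqrt(dollars),
    "email": lambda dollars: 10 * math.sqrt(dollars),
}

allocation = allocate_budget(channels, budget=1000)
print(allocation)
```

Because each curve flattens as spending grows, the best channel does not receive the whole budget. The greedy rule works here because every curve has diminishing returns; with other shapes it would not be guaranteed to find the best split.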
While econometrics typically focuses on the marginal effects that contribute to an outcome, machine learning methods typically focus on predicting the outcome itself (in our running example, the price of the home). Both econometrics and machine learning use tools such as regressions, decision trees, and other algorithms, but they view the results through different lenses. Each lens has its own use case, but the shared toolkit lets economists transition into machine learning much more easily than people without a background in inferential statistics.
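As a rough illustration of the two lenses, the sketch below fits a single regression line and then reads it both ways. The five homes and their prices are invented for the example, and `fit_line` is a hypothetical helper, not a library function.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented example data: square footage and sale price.
sqft = [1000, 1500, 2000, 2500, 3000]
price = [205_000, 248_000, 302_000, 347_000, 398_000]

intercept, slope = fit_line(sqft, price)

# Econometrics lens: what is one more square foot worth?
print(f"Marginal effect: ${slope:,.2f} per square foot")

# Machine learning lens: what will this particular home sell for?
predicted = intercept + slope * 1800
print(f"Predicted price of an 1,800 sq ft home: ${predicted:,.0f}")
```

The model is the same in both cases. The difference lies in which number you care about and, as a result, how you judge whether the model is good.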
When we discuss bias, we’re talking about a systematic difference between the predicted values and the actual values. In the case of our house price example, bias might show up when:
- Your dataset consists of only very large or very small homes, but you want to predict prices for any home
- You fail to include important variables in the dataset, such as zip code (“location, location, location”, right?)
- You throw every possible variable into your model in hopes that it will account for anything important in the dataset
Economists, especially when running statistical models, are trained to give careful consideration to potential bias in data, theories, and outcomes. Bias can be introduced at any point in the data life cycle and can lead to overstated or understated results, or even results that are entirely misleading. Formal training in bias gives an economist two things:
1) Experience considering reasons why a current study/dataset/algorithm might be problematic, as well as the tools to quantitatively detect that bias
2) A healthy skepticism and an ability to question results instead of taking them at face value
An ability to test for and identify bias early in the analytics process enables economists to choose and implement machine learning algorithms more quickly than they could through trial and error. For a good visual representation of some types of bias, check out this article.
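The missing zip code case from the list above can be shown with a small sketch of omitted-variable bias. The data is made up: six homes priced by an invented rule of $100 per square foot plus a $50,000 premium for a desirable zip code, where the homes in the desirable zip code also happen to be larger. The helper `slope` is hypothetical.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Invented homes: (square footage, 1 if in the desirable zip code else 0).
homes = [(1000, 0), (1200, 0), (1400, 0), (2000, 1), (2200, 1), (2400, 1)]

# Invented pricing rule: $100 per square foot plus a $50,000 location premium.
prices = [100 * sqft + 50_000 * premium for sqft, premium in homes]

# Ignoring zip code: the size coefficient absorbs part of the location premium.
naive_slope = slope([sqft for sqft, _ in homes], prices)

# Holding zip code constant: estimate the size effect within each zip code.
within_slopes = {}
for flag in (0, 1):
    xs = [sqft for sqft, premium in homes if premium == flag]
    ys = [p for (_, premium), p in zip(homes, prices) if premium == flag]
    within_slopes[flag] = slope(xs, ys)

print(f"Ignoring zip code: ${naive_slope:,.2f} per square foot")
print(f"Within each zip code: {within_slopes}")
```

Leaving out location makes a square foot look worth roughly 45% more than the rule that generated the data says it is. That is the kind of inflated result a bias check is meant to catch.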
Economics is intertwined with business. It follows that economists, while not MBAs, often have a reasonable handle on how businesses generally function, why they make decisions, and the types of decisions that are broadly necessary. After all, economics is a study of incentives (for both businesses and consumers). This is especially the case for subjects such as price, cost, revenue, and profit optimization, as well as market studies. Economists working as data scientists thus have domain knowledge with which they can be particularly effective in related applications. Economists working on applications unrelated to these subjects still have the tools to develop a valid business case for a model’s usage.
In our housing example, an economist may say, “It’s great that we want to create a model that predicts housing prices, but what value does it bring to our business? Does knowing the price of a home increase our ability to identify ‘good deals’ and increase our profits? Does it draw more customers to our website? Or reduce costs in current evaluation methods?” An economist’s awareness of these questions allows them to have a more educated discussion with management about the value of model development and to maintain focus on the things that matter most to the business.
Economists are trained to consider the consequences of a business’s actions. These consequences are often referred to as “externalities” and may be positive or negative in nature. This training is especially critical when considering the implications of machine learning models. There’s a famous case in which Target was reportedly able to predict whether a woman was pregnant based on her purchase history, and even to accurately predict the due date. Economists are typically interested in understanding what effects such a model, once released to the public, would have on the company’s business. Some of these effects we know about, but with a technology as new as AI, industry is still working to identify its impact. While some may implement models without considering the externalities, an economist’s careful consideration may lead to significant savings (both in money and in reputation).
Mathematical Modeling of Complex Systems
Economists are often faced with the problem of mathematically describing how complex systems work. These models may seek to answer questions about the supply and demand of goods, or even why businesses are making certain decisions and what decisions they may make next. In almost every case, these systems operate under constraints that the math must also account for. The grasp of mathematics developed by modeling these systems gives economists a level of comfort with the math of machine learning. They’re accustomed to taking an abstract concept and creating a set of rules that may govern that concept. These skills are directly transferable to algorithm selection in machine learning processes.
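A textbook-style sketch of this kind of modeling is a market with linear demand and supply curves, plus an optional price ceiling as the constraint. The function name `market_outcome` and every coefficient below are assumptions chosen for illustration, not estimates from real data.

```python
def market_outcome(a, b, c, d, price_cap=None):
    """
    Market with linear demand Qd = a - b * P and linear supply Qs = c + d * P.

    Returns (price, quantity traded). If price_cap is set below the
    unconstrained equilibrium price, the cap binds and the quantity traded
    is limited by the smaller of demand and supply at the capped price.
    """
    price = (a - c) / (b + d)  # price at which Qd equals Qs
    if price_cap is not None and price_cap < price:
        price = price_cap
    quantity = min(a - b * price, c + d * price)
    return price, quantity


# Made-up coefficients: Qd = 100 - 2P, Qs = 10 + P.
free_price, free_quantity = market_outcome(100, 2, 10, 1)
capped_price, capped_quantity = market_outcome(100, 2, 10, 1, price_cap=20)

print(f"No constraint: price {free_price}, quantity {free_quantity}")
print(f"Price capped at 20: price {capped_price}, quantity {capped_quantity}")
```

With these numbers the unconstrained market clears at a price of 30 and a quantity of 40. Under the cap, buyers want 60 units but sellers offer only 30, so the constraint changes the outcome rather than just the price.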
While many people looking at machine learning and data science roles focus solely on the coding portion of data science, this can be a dangerous practice. Economists have a unique set of skills that allows them to apply statistics in much the same way data scientists do, while also being able to evaluate bias, weigh externalities, and comprehend the math behind algorithms. Additionally, their experience analyzing businesses provides an ability to question and discuss the value of models relative to business goals.
If you have any questions about the connections between economics and data science, leave them in the comments, and I’ll try to answer them!