This post was originally published by Elena Marocco at Medium [AI]
Top data scientists come from a wide variety of backgrounds. Some study computer science and excel from day one at programming elegant models to analyse the data. Others study statistics and leverage their knowledge of using data to respond to a well-structured question. Nowadays, of course, many data scientists study in dedicated data science programs and develop a cross-section of all of these skills. Some, however, are like me and studied math.
When I started university, I knew math was the right degree program for me. I’ve always enjoyed math: its pure logic and the satisfaction that comes from solving an equation. Plus, it came easily to me. My mind’s always been well-adapted to math, and I was curious to explore more complex math topics.
I also knew, however, that theory was not where I wanted to focus my career. When it came time to write my master’s thesis, an opportunity to apply math to a data science problem presented itself. I was paired with a company, Evo Pricing, and challenged to use my knowledge of math to build a model that would more accurately forecast sales. I was hooked, and I’ve worked for Evo as a data scientist ever since.
Studying math gave me a broad skill set that helped me quickly adapt to practical work as a data scientist, yet there were some profound gaps that I had to work hard to bridge to succeed. I’m passionate about my career as a data scientist, and I think my mathematics background gave me a strong foundation for what I do every day. In hindsight, though, there are a few things I would have changed about my university experience. Aspiring data scientists can learn from my experience and graduate even better prepared for the real world.
At my university, math majors have an incredibly theory-centred course of studies. We don’t just learn calculations; we spend quite a large portion of our studies focused on proving theorems. These proofs may sound like something that would have no practical application outside academia, but the reality is the opposite.
When you complete a proof, you are using deductive reasoning to establish a logical certainty. A “reasonable expectation” or lower burden of proof is not enough. You must instead demonstrate that your statement holds true not just in a few key examples, but rather in all possible cases.
This process obviously teaches you the concepts behind the calculations, but more importantly, it shows you how to reason through a problem. I can apply the principles of deductive reasoning to any set of data and find a reasonable and well-supported solution. As a data scientist, this ability is critical to success. While many data scientists struggle to see the big picture in the numbers, math majors do not feel daunted by this challenge. It’s how we are trained from the start of our academic career.
Of course, there are limits to what formal mathematical proofs can teach you. When I graduated, I could practically have rattled off the √2 irrationality proof in my sleep. I was under-prepared, however, to expand my reasoning to problems with more tangible applications.
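For readers who have not seen it, the standard textbook argument for the irrationality of √2 can be sketched as a proof by contradiction. The symbols p, q and k below are just the conventional choices, and this is the classic version rather than anything specific to one course.

```latex
\textbf{Claim.} $\sqrt{2}$ is irrational.

\textbf{Proof (sketch).} Suppose, for contradiction, that
\[
  \sqrt{2} = \frac{p}{q}, \qquad p, q \in \mathbb{Z}_{>0}, \quad \gcd(p, q) = 1 .
\]
Squaring both sides gives
\[
  p^{2} = 2q^{2},
\]
so $p^{2}$ is even, and therefore $p$ is even. Write $p = 2k$ for some integer $k$. Then
\[
  (2k)^{2} = 2q^{2} \;\Longrightarrow\; q^{2} = 2k^{2},
\]
so $q^{2}$ is even, and therefore $q$ is even as well. Both $p$ and $q$ are then
divisible by $2$, contradicting $\gcd(p, q) = 1$. Hence no such fraction exists. \qed
```

The point of the exercise is that the conclusion covers every possible fraction at once, not just the ones someone happened to test.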
When I first started my career as a data scientist, I sometimes struggled to frame the problem most efficiently and to figure out which data would best provide me with insights into the question the clients wanted solved. I had to learn the subtle differences between statistics and pure math as I went. In hindsight, I wish I had studied more statistics courses as an undergraduate. I would have had an easier time understanding the reasoning required in the business world — and how to develop a model aimed at solving a practical problem, not a theoretical equation.
Nowadays, I may benefit from the knowledge that came from discovering how to prove the irrationality theorem without words, but I doubt I could reproduce the math as quickly. What I can do is dexterously assess a broad question and create a mathematical model to analyse the data and find an optimal solution. That ultimately serves me better as a data scientist.
Above all else, math majors learn to complete incredibly complex calculations. We become comfortable using the full array of calculations and functions. I’m still most comfortable facing down a page of numbers, looking for patterns and assessing relationships between the data.
In university, we often do this with the aid of technology, yet as a data scientist, I have even more tools available to help me drill into the numbers. No matter what kind of company you work in as a data scientist, you will have to be comfortable working with numbers day-in and day-out. The same is true when you study math. Everything comes down to the numbers. In that way, my career has not changed much since I was a student.
Data science goes beyond the numbers, though. You need to understand how to create algorithms to manipulate them and analyse patterns in the data. This requires coding. While I may have learned the theory underlying algorithms and knew how to write and edit basic code, I had never studied any coding language in particular depth.
This knowledge gap was a struggle to overcome in my earliest days as a data scientist. If I could go back and change anything about my university experience now, I would add more programming courses. Nowadays, many math degrees require more programming courses as a part of the curriculum, yet I think this is still often insufficient. If you have any interest in becoming a data scientist (or having any career outside academia after studying math), you must dedicate yourself to learning to code well.
Precision is of the utmost importance in mathematics. There is no such thing as “close enough” when it comes to solving an equation. A single flaw in logic can invalidate your entire proof or ruin your entire calculation. I learned to be as precise as I can in every analysis and never to cut corners. Anything less could lead to failure.
This desire for precision serves me well as a data scientist. We never stop working towards more accurate and more easily applied recommendations for our clients. While some may be content with a model delivering above-average accuracy and a significant lift in performance, my math background means I’m never satisfied with “good enough”. I’m always trying to achieve more precision. Whether it is improving how we collect and use data or conduct our analyses, I am always searching for that extra accuracy.
Unfortunately, problems in data science cannot always be solved with 100% certainty. While mathematical proofs have an elegance and ultimately a clean solution, the kinds of issues you try to resolve as a data scientist rarely have an entirely dependable answer.
For example, my earliest work at Evo was developing a model to get the exact number of products into stores at the exact moment they are needed. We developed an incredibly accurate model (with store manager input it averages 94% inventory efficiency, a 25 percentage point improvement over the traditional replenishment system), yet it will never achieve perfect allocation with 100% forecast accuracy. It’s simply impossible with the tools that exist today.
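A gain quoted in percentage points is easy to misread as a relative percentage gain. The short sketch below works through the arithmetic for the figures above, assuming the 25 points are measured on the same inventory-efficiency scale as the 94%. The function names are hypothetical and purely illustrative; they do not come from Evo's tooling.

```python
def baseline_from_pp_gain(current_pct: float, gain_pp: float) -> float:
    """Baseline implied by a current value and a gain in percentage points."""
    return current_pct - gain_pp


def relative_gain_pct(current_pct: float, gain_pp: float) -> float:
    """The same gain expressed as a percentage of the implied baseline."""
    baseline = baseline_from_pp_gain(current_pct, gain_pp)
    if baseline <= 0:
        raise ValueError("implied baseline must be positive")
    return 100.0 * gain_pp / baseline


if __name__ == "__main__":
    # Figures quoted in the text: 94% efficiency, 25 percentage points gained.
    baseline = baseline_from_pp_gain(94.0, 25.0)
    relative = relative_gain_pct(94.0, 25.0)
    print(f"implied baseline: {baseline:.0f}%")
    print(f"relative improvement: {relative:.1f}%")
```

Under that assumption, the traditional system would sit at about 69% efficiency, so the same improvement read as a relative change is roughly 36%, not 25%.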
It takes an adjustment in the proof-driven mindset to become more comfortable in the imperfect business world. My math degree did not prepare me for the kinds of inferences that we must make to create a practical recommendation for our clients. Studying math may not have fully prepared me for the reality of how we calculate success in the real world, but it did hone my instincts well. We can always work towards perfection, even if we may not achieve it.
If you are studying math, you are building a strong foundation for a successful career as a data scientist. At Evo, a large number of my teammates come from a math background, including Evo’s CEO, Fabrizio Fantini, whose PhD thesis in mathematics later became the basis for the work we do. Studying math made me a better data scientist, and I don’t regret my degree for a second.
Still, you must be careful not to make the same mistakes that I did. Enrich your math degree with the programming, statistics and generally less theoretical courses I lacked. You’ll be ahead of the curve when you apply for data science jobs in the real world, and you’ll set yourself up to go as far as you can dream in our field.