Keys to success when adopting a pre-existing Data Science project

This post was originally published by Rose Day at Towards Data Science

I recently adopted an extensive collection of notebooks that together support the creation of analytics. It sounded like a daunting project to take on, but the more I read into the code, the more I realized it wasn’t all that bad. The notebooks looked overwhelming, but the code was relatively simple once broken down into smaller, more manageable chunks.

As I adopted this set of notebooks, the first thing I did was read them. I spent half a week looking through every notebook, reading every line of code at least three times, and breaking down how one notebook flowed into another. I wanted to make sure I understood the inputs the code required, the outputs it was expected to generate, and the codebase’s overall architecture.

As most of the original developers were gone, I leaned on the one person I could, asking as many questions as possible. These questions and answers cleared up any misunderstandings or confusion I had and allowed me to better architect the code’s future state in my head. I began to see the broader picture of the code and how it could be used moving forward.

As you adopt someone else’s project, it is good to keep in mind that they may have had only one use-case in mind when they first developed the code. Knowing this initial use-case can help you understand how to translate the code into a more generalized solution that allows for code reuse and expansion. You may not be a software developer, but you can start thinking like one. Sit back and consider the areas that appear multiple times in the code. Can you create a function or class from them? As you work through the design of the code, document everything. It may seem tedious at times, but documentation can be the key to onboarding new teammates and making sure they can quickly pick up any code and use it.
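
To make the idea of turning repeated code into a documented function concrete, here is a minimal sketch. It is not taken from the notebooks I adopted; the function name `summarize_by_group`, the `region` and `revenue` fields, and the sample data are all hypothetical. Imagine the same group-and-average loop had been copied into several notebook cells with only the column names changed.

```python
from statistics import mean


def summarize_by_group(records, group_key, value_key):
    """Return the mean of `value_key` for each distinct `group_key`.

    Hypothetical helper, written for illustration only.

    Parameters
    ----------
    records : list of dict
        One dict per row, e.g. {"region": "east", "revenue": 100.0}.
    group_key : str
        Field whose values define the groups.
    value_key : str
        Numeric field to average within each group.

    Returns
    -------
    dict
        Maps each group value to the mean of its `value_key` entries.
    """
    grouped = {}
    for row in records:
        # Collect every value under the group it belongs to.
        grouped.setdefault(row[group_key], []).append(row[value_key])
    return {group: mean(values) for group, values in grouped.items()}


# Made-up sample data standing in for one notebook's input.
sales = [
    {"region": "east", "revenue": 100.0},
    {"region": "east", "revenue": 300.0},
    {"region": "west", "revenue": 50.0},
]

summary = summarize_by_group(sales, "region", "revenue")
print(summary)
```

Because the field names are parameters rather than hard-coded, the same function can serve a second use-case without another copy of the loop, and the docstring gives a new teammate the inputs and outputs at a glance.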
