This post was originally published by Yin Zhang at Towards Data Science
Conducting a data science/analytics project always takes time and has never been easy. A successful and comprehensive analytics project goes well beyond coding: it involves careful planning and a large amount of communication.
What is the Life Cycle of an Analytics Project?
To complete a data science/analytics project, you may have to go through five major phases: understanding the problem and designing the project, collecting data, running the analysis, presenting the results, and documenting and reflecting on the work.
I have a strong feeling that running an analytics project is pretty similar to building a house. First, the architect meets his/her client, understands their needs and comes up with an actionable blueprint (Understand and Plan).
Then you need to collect building materials such as cement, steel and bricks. You have to learn the properties of your building materials and choose the right ones for construction. Otherwise, you may end up with a house that collapses easily. This is like the data collection process, where you have to do some exploratory data analysis (EDA) or feature engineering to understand the data and find the right data to solve your analytics problem, or else you may not get solid results from your analysis!
With the building materials and blueprint at hand, you can start building the house (Run Analysis). After construction is finished, a home inspection and quality checks are required to ensure safety. Similarly, we need to document the methodologies, conclusions and limitations of our analytics project.
Understand & Plan
If I were asked to name the most critical phase of the whole cycle, I would say Understanding and Planning without any hesitation, because the main purpose of data science and analysis is not to build a project with fancy technology but to solve real problems. The success of an analytics project therefore depends heavily on how well you understand the situation, define the problem and translate the business questions into analytics questions. From that standpoint, it’s always worth spending time thinking about the broader context of your analytics project.
- Ask Questions
Usually an analytics project starts from a kick-off meeting where you meet with business partners. They will provide some context and briefly talk about what they are looking for. Asking smart questions will always lead you to have a better understanding of the pain points and requirements from your business stakeholders. Here are some sample questions you can ask.
- Analytics Plans
Before diving into the analysis, come up with an analytics plan and set up a follow-up meeting to recap the questions and reinforce expectations.
The plan provides a high-level overview, gives a clear picture of the next steps and draws the link between the technical actions and the bigger picture on the business side. Here are some key elements in my Analytics Plans:
Don’t make data science and technology a mystery. Try to include a brief description of the methodologies in layman’s terms, outline the use cases and scenarios, and summarize the strengths and limitations. Avoid complicated formulas or functions and focus on how each methodology can help solve the problem.
It may take a few weeks to finish an analytics project, sometimes even longer. The longer a project takes, the greater the chance of something throwing a spanner in the works. Therefore, before your project gets rolling, the first thing to do is unpack all the steps you have to complete to finish the project. Then estimate the time frame for each of those tasks and mark some milestones.
Rather than doing just one final presentation, I recommend setting up some check-in points in the middle of the project to engage with your stakeholders and get their feedback, so that you can make timely adjustments.
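To make the "unpack, estimate, mark milestones" step concrete, here is a minimal sketch in Python. The task names, the day estimates and the 20% buffer are all assumptions made up for illustration; swap in your own numbers.

```python
import math
from datetime import date, timedelta

# Hypothetical tasks with duration estimates in calendar days.
# Both the names and the numbers are illustrative only.
TASKS = [
    ("Understand and plan", 5),
    ("Collect data", 10),
    ("Run analysis", 15),
    ("Present results", 3),
    ("Document and reflect", 4),
]


def build_timeline(start, tasks, buffer_pct=20):
    """Schedule tasks back to back and return (name, start, end) tuples.

    Each estimate is padded by buffer_pct percent (rounded up to whole
    days) to leave room for surprises. The end date of each task doubles
    as a milestone.
    """
    timeline = []
    cursor = start
    for name, days in tasks:
        padded = math.ceil(days * (100 + buffer_pct) / 100)
        end = cursor + timedelta(days=padded)
        timeline.append((name, cursor, end))
        cursor = end
    return timeline


if __name__ == "__main__":
    for name, begin, end in build_timeline(date(2024, 1, 1), TASKS):
        print(f"{name:<22} {begin} -> {end}")
```

The buffer is deliberately crude. The point is simply to have dates written down that you can revisit at each check-in.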
How to Organize the Project and Track the Progress
- Leverage Project Management Tools
When you are working on a complicated analytics project and need to collaborate with multiple teams such as engineering, product and business, a simple to-do list or an Excel-based tracker will fall short.
Instead, you can leverage dedicated project management tools and software. If you search Google for "Project Management Tools", many tools will grab your attention, such as Asana, Trello, JIRA and Monday.com. I bet you will find the right tool based on the size of your team and the way you prefer to work. But Notion wins my heart simply because it is an all-in-one workspace that blends several work apps into one. I really hate having multiple tools for different purposes, so you can imagine that "All-In-One" is the most effective marketing message to convert me.
It is a fantastic option for me to do planning, tracking, knowledge sharing and blogging in one place. Believe it or not, it also provides templates for travel planning and for tracking the progress of your job applications.
- Sprint Planning
You have a powerful tool with you. It’s like your personal assistant. But how do you ensure collaboration and keep the wheels turning? Let’s talk about sprint planning.
For those who are not familiar with the concept, a sprint is a short, time-boxed period in which a team works to complete a set amount of work.
Sprints make projects more manageable, allow teams to ship high-quality work faster and more frequently, and give them more flexibility to adapt to change.
The first step in sprint planning is to decide the sprint length. Even though there is no hard rule as to how long each sprint should be, it has to be long enough for tasks to be completed, yet short enough that the requirements and the goal stay the same throughout.
In the sprint planning meeting, the whole team aligns on the goals for the coming sprint and plans the work that will contribute to those goals. The tasks are itemized, prioritized, assigned to team members and logged on the board for the coming sprint.
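If it helps to see the "itemize, prioritize, assign" flow in code, below is a small Python sketch. It is only one possible way to model a sprint board, not how Notion or any other tool works internally. The `Task` fields and the rule of filling each owner's capacity in priority order are my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    priority: int       # 1 = highest priority (assumed convention)
    estimate_days: int  # rough effort estimate
    owner: str          # team member the task is assigned to


def plan_sprint(backlog, capacity_days):
    """Fill the sprint board in priority order.

    capacity_days maps each owner to the days they have available in the
    sprint. A task goes on the board only if its owner still has enough
    capacity; otherwise it stays in the backlog for a later sprint.
    Returns (board, leftover).
    """
    remaining = dict(capacity_days)
    board, leftover = [], []
    for task in sorted(backlog, key=lambda t: t.priority):
        if remaining.get(task.owner, 0) >= task.estimate_days:
            remaining[task.owner] -= task.estimate_days
            board.append(task)
        else:
            leftover.append(task)
    return board, leftover
```

For example, with a hypothetical two-person team you would call `plan_sprint(backlog, {"Ana": 5, "Ben": 8})` and log the returned board in your project management tool.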
- Daily Stand-up
Sprint planning defines what goals will be achieved and what tasks will be delivered to drive the project forward, whereas a 5 to 10 minute daily stand-up meeting helps align planning with execution. More specifically, it gets the team on the same page, clarifies priorities and surfaces common blockers early.
To keep it short and quick, here are three cornerstone questions that everyone will answer:
- What did you do yesterday?
- What will you do today?
- What blockers stand in your way?
Don’t Overlook the Importance of Documentation and Reflection
I have been in the data analytics/data science area for just over six years. One thing I wish I’d known when I started my career is that documentation and reflection are just as important as the analysis itself. Every analytics project you complete could go on your resume as a highlight and then become a talking point in interviews. Summarizing and framing your analytics work once it’s done is therefore very helpful for refreshing your memory, consolidating all the materials and structuring your stories.
- Document and Frame Your Analysis
You can follow the Situation, Problem, Solution and Next Steps (SPSN) framework.
First, describe the current status and pain point, then lay out the problem and highlight the business impact of the project.
The most critical part is the Solution, where you outline the methodologies at a high level and provide step-by-step details about the data and analysis. Then document the results, insights and actionable recommendations generated from your analysis.
Note that there’s no perfect data science project. Be open to talking about the caveats and limitations of your project. It’s great to call out the questions your analysis can address, and it’s equally valuable to learn what kinds of questions it can’t answer.
The last part is the next steps. Potential next steps include testing new methodologies to improve accuracy, adding new data sources, or automating the whole process so that results are refreshed every day.
Meanwhile, don’t forget to do some self-reflection by creating a story grid and mapping examples and circumstances you’ve encountered in each of your analytics projects to the five categories below.
I hope this helps with the whole process of your analytics/data science project!
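If you like keeping write-ups in a consistent shape, the SPSN outline can be captured in a tiny template. The sketch below is just one way to do it in Python. The class name `ProjectSummary`, its fields and the plain-text layout are assumptions for illustration, not part of any standard.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectSummary:
    situation: str  # current status and pain point
    problem: str    # the problem and its business impact
    solution: str   # methodology, data, results and recommendations
    limitations: list = field(default_factory=list)
    next_steps: list = field(default_factory=list)

    def render(self):
        """Return the summary as plain text, one SPSN section per block."""
        lines = [
            f"Situation: {self.situation}",
            f"Problem: {self.problem}",
            f"Solution: {self.solution}",
            "Limitations:",
        ]
        lines.extend(f"  - {item}" for item in self.limitations)
        lines.append("Next Steps:")
        lines.extend(f"  - {item}" for item in self.next_steps)
        return "\n".join(lines)
```

Filling in one of these right after each project, while the details are still fresh, gives you a ready-made story for your resume or an interview.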