The Data Scientist with no data!


This post was originally published by Agron Fazliu at Towards Data Science

The story of today’s Data Scientist somewhat resembles Florence Nightingale’s work, especially in lacking data while desperately needing to understand the situation of soldiers. Not all of today’s Data Scientists save lives, but many actually do!

Many of today’s Data Scientists would most probably welcome more data, data streams, data sources, data infrastructure, and data engineering support. Yet they are asked to produce better insights, models, inferences, and visualizations, and to provide consulting on decisions.

Very often Data Scientists are frustrated not because a model fails to work in production but because data sources are not available. In the age of GDPR, this will only get worse.

There is a common denominator for this situation: a lack of understanding of data science. Decision makers do not understand that Data Science is not only software programming (yet!) and that models are not only source code pipelined into production.

Many companies want to become data-driven, modern, and agile. So they hire a Data Scientist and, perhaps, a Data Engineer. This is done before collecting data, preparing data infrastructure, enabling data engineering, and knowing what data-driven business questions and strategy lie ahead. The spiral of stress ends up on the Data Scientist, whose hands are tied while expectations continue to grow. A simple Google search for “why data scientists are” will suggest completing the sentence with the word “leaving”. There is hope of seeing the day when this suggestion changes to something positive.

source: timoelliott.com

For aspiring data-driven companies, the suggested steps for a successful data-driven transformation are:

  1. Mature, or embrace, the DevOps transformation (DevOps: To do or not to do? Focus on culture first!). This matters for culture, pipelines, and data management!
  2. Roll out a GDPR-compliant data lake for storage and experimentation!
  3. Integrate data engineering capabilities: hire a Data Engineer or DataOps!
  4. Collect data with compliance and GDPR in mind. This is obvious, but new too!
  5. Hire a Data Scientist!