Integrating data science — What is data science?

Gidi Shperber
Aug 28, 2019

Welcome to the series “How to integrate data science in your organization”.

This series offers bite-size articles on integrating data science into any kind of organization.

I’m at least as eager as you are to jump in and start this journey. However, I believe this series can’t be complete without first explaining what data science is, at least as I see it. And, fortunately enough, there is no formal definition.

Most articles about data science are written with data scientists in mind, but this one is aimed at organizations and managers.

Trying to define data science

Data science is basically about producing business value from your data, in any way you can.

Data science, as a job title, emerged somewhere around 2007–2009 at two companies, through two people: Jeff Hammerbacher at Facebook and DJ Patil at LinkedIn. Both led teams that had previously been titled “data analysis”, “business analytics”, “BI”, or “statisticians”. However, they worked on tasks that sat at the core of these companies, for example the “People You May Know” feature on LinkedIn, so they felt these titles did not describe what they were doing. To this day, there is still something of a grey area between these roles and data scientists.

What are the big differences?

  1. First and foremost, the role definition. Data scientists are not there to provide dashboards for management, fancy reports, or ad hoc analysis (although they might do that as well); their main job is to provide business value through data pipelines.
  2. Actionability: whether a data scientist delivers research, a feature, or a model, the output should be actionable and measurable, and in many cases they should be able to deliver the final product.
  3. Data scientists have to meet a pretty high bar: a high level of math and algorithms, strong software development skills, and good business understanding and communication skills.

You get the picture.

An example

Say you work at a trucking company, and you see that tomorrow is going to be rainy, so you cancel the rides of trucks that are exposed to rain to keep the cargo safe. Did you use data science? Yes! A very simple variant of it, though.

You check your drivers’ miles per month every day and give a bonus to the most motivated ones? Data science! Still pretty basic.

Now you go even further and run an optimization algorithm that calculates the distribution of bonuses that gives you the highest mileage (which you profit from)? That’s better.
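
To make that last step concrete, here is a minimal Python sketch of such a bonus optimization. Everything in it is an assumption for illustration: the square-root response curve (each extra dollar of bonus buys fewer extra miles), the per-driver responsiveness numbers, the driver names, and the fixed budget are all made up, and a real project would estimate them from historical data. The sketch hands out the budget in small steps, always to the driver whose mileage is expected to grow the most from the next step.

```python
import heapq
import math


def extra_miles(responsiveness, bonus):
    """Hypothetical response curve: extra miles a driver adds for a given bonus.

    The square root encodes diminishing returns; it is an assumption,
    not something measured from real data.
    """
    return responsiveness * math.sqrt(bonus)


def allocate_bonuses(responsiveness_by_driver, budget, step=100):
    """Split `budget` between drivers in increments of `step`.

    Each increment goes to the driver with the largest expected gain in
    miles from receiving it (a greedy rule, which suits curves with
    diminishing returns like the one assumed above).
    """
    allocation = {driver: 0 for driver in responsiveness_by_driver}
    # heapq is a min-heap, so gains are stored negated to pop the largest first.
    heap = [
        (-extra_miles(r, step), driver)
        for driver, r in responsiveness_by_driver.items()
    ]
    heapq.heapify(heap)

    remaining = budget
    while remaining >= step and heap:
        _, driver = heapq.heappop(heap)
        allocation[driver] += step
        remaining -= step
        r = responsiveness_by_driver[driver]
        current = allocation[driver]
        next_gain = extra_miles(r, current + step) - extra_miles(r, current)
        heapq.heappush(heap, (-next_gain, driver))
    return allocation


# Made-up drivers and responsiveness values, for illustration only.
drivers = {"dana": 3.0, "eli": 1.0}
plan = allocate_bonuses(drivers, budget=1000)
total = sum(extra_miles(drivers[d], b) for d, b in plan.items())
print(plan, round(total, 1))
```

The greedy rule is a reasonable choice here only because the assumed curve has diminishing returns. With a different response curve, or with extra constraints such as a minimum bonus per driver, you would likely reach for a proper optimization library instead.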

Sub-fields

We can be happy with our tiny example, but there are countless other use cases, some simple and some very complex. We will mention only a few to give you a taste:

  1. Computer vision
  2. Text analysis
  3. Recommendation engines
  4. Time series analysis

And more.

Dependencies

All of the above comes at a cost. You can’t put a statistician in front of an Excel sheet and expect these kinds of results. Data science requires cross-organizational commitment, highly skilled personnel, intensive data collection, cleaning and verification, and constant testing of results. It is not something you set up in one day.

Skills

Trying to address the business side, we almost forgot the most important ingredient: the data scientist. A data scientist is currently a strange creature, since she acquired her skills mostly without formal data science training. Data science was “declared” around 2008, and schools started to respond by launching data science BSc and MSc programs around 2015. Therefore, data scientists who started their academic education before 2015 (currently 99% of data scientists, by my estimate) have no “data science” degree on their CV. What do they have? Mostly MSc degrees in computer science, statistics, math, and other fields.

But academic background is not enough. The important part is the skills: a balanced mix of math, programming, and business expertise is required. I will elaborate on this later in the series.

Conclusion

We’ve seen that data science is simple by definition, but complex to execute.

So now it’s time for the double-your-revenue question: how do you apply data science in your organization? For this question and more, stay tuned.

I hope you’ve enjoyed reading this review! Feel free to follow me and check out our website: www.Shibumi-ai.com

