What does a Product Data Scientist do?

Anastasia
12 min read · Aug 22, 2021


Every day I wake up, make myself some coffee, open the laptop and … start working I guess?

What do I do?

I am a (Product) Data Scientist working at a big tech company that primarily releases digital products (a.k.a. a pretty big and famous music streaming service). All opinions are my own and do not reflect the views of my employer.

My title is actually Data Scientist, but I prefer to add the prefix ‘Product’. It highlights an important part of my role: I work with product development teams who build features, usually for digital products.

Let me explain.

Product development teams usually have people in a few different roles working together to build and improve the product. Not all companies have all of these roles. In some companies the roles have different names, and in others some roles are not embedded within the team. This is a generalization based on my previous experience. And of course, it is not an accurate representation of the ratio of different roles in the team. Often there are more Developers than Designers, and sometimes teams share a Data Scientist with other teams.

Typical roles in a Product Development team.

All these people work towards the goal of improving the product: they identify problems or opportunities, decide how to try solving them, experiment, analyze, and make a decision.

Different roles within the team take ownership of each of those stages.

Roles’ responsibilities for each of the stages of the cycle.

Product Data Scientists take ownership of the A/B test analysis stage and consult the team on the other stages.

Roles’ responsibilities for each of the stages of the cycle.

This doesn’t explain how Product DSes spend their time, right?

I tried splitting the types of tasks we actually spend our time on into 3 categories.

Knowledge

Whether you are a Junior DS just out of uni or a senior researcher who knows the ways of the company, you will spend quite a lot of time in this area, acquiring and sharing knowledge. This is the part that’s the easiest to skip or deprioritize: it doesn’t immediately give a tangible outcome, doesn’t show your colleagues that you are productive, doesn’t impact the KPIs, and doesn’t unblock other people in finishing their projects.

But it is the most important investment into your value as an employee, an expert, and a team member.

Depending on the size of the company, state of the product, and culture of your team — you may spend from 0 to 50% of your time learning something or teaching others. This will fluctuate a lot during the tenure and change based on your goals and priorities.

Acquiring

A great sign of good company culture is when you feel like you have time and energy after your regular day-to-day tasks to learn something new. When not only your manager tells you it’s great if you do that, but also your teammates consistently take some time to learn a new skill or try out a new unproven method.

Usually, most new skills you’ll learn and improve on the job, through the projects you are working on. But 5%-30% of new skills can be acquired through attending workshops, conferences, doing courses, reading books, and working on projects which do not deliver immediate value to the team and company.

Learning about the business goals, mission, plans, and challenges of the other areas in the company is also crucial, especially if you are interested in taking on new projects, being proactive, and getting promoted. This expands your horizons and allows you to understand the business needs faster and better.

Sharing

As corny as it is, “sharing is caring ❤️” applies in the corporate world too. It helps break down silos within the company, avoids the same work being done in different areas of the company simultaneously, and generally contributes to a fun and interesting company culture.

Whether you are sharing the results of your research with a product team or organizing an internal conference for the data function, both ways of sharing knowledge are an important step in building a ‘data-informed’ company culture. Spending time on creating better explanatory content, running workshops on new tools and concepts, sitting down with junior members to review their analysis, and mentoring colleagues is not only useful for the company. It is also how you learn and become genuinely happier by helping others.

Analytics

The core work of any DS, Analyst, ML engineer, or Researcher. Looking at and making sense of data. Preparing data, creating data, analyzing data, using data, and building tools and products around data.

If the other two areas are generic enough and do appear in many other roles, this one is why you were hired.

I like to see it as a flow of different stages you need to go through to understand what data you have and how you can use it. Depending on the role, you’ll have different responsibilities in the data lifecycle stages.

This is the typical data lifecycle in a company with fairly advanced data infrastructure. Of course, this differs a lot depending on the company and the role responsibilities can vary.

As a Product Data Scientist, my core responsibilities are understanding how the data is created, stored, aggregated, and cleaned, and how it can be used to inform decision making. I find it crucial to include the first steps, which seemingly fit a more Data Engineering role but provide incredible insight into the trustworthiness and reliability of your sources. So a considerable share of my working hours, especially when I have just started at a company, is spent on all things ‘Data Munging’.

Data Munging

I’d say most of my time is spent here, learning and relearning how the data was created, is stored, aggregated into metrics, and used for analysis.

“Don’t ever stop questioning your data” is the rule I pretty much live by.

Find the most reliable tables for the key metrics in your company, meticulously learn how they were calculated; I even go as far as always checking the source code for those tables.

Question your peers as to why things were done in a certain way, and always compare your own calculations to those. Things change, code gets deprecated, libraries need to be updated, and all those things can impact your top-line metrics.
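
To make the "compare your own calculations" habit concrete, here is a minimal sketch in plain Python. It recomputes daily active users from raw events and flags the days where the number disagrees with what a "trusted" table reports. The function names, the event rows, and the 1% tolerance are all made up for illustration; they are assumptions and not taken from any real pipeline.

```python
from collections import defaultdict


def daily_active_users(events):
    """Count distinct users per day from (day, user_id) event rows."""
    users_by_day = defaultdict(set)
    for day, user_id in events:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}


def find_mismatches(recomputed, reported, rel_tol=0.01):
    """Return the days where the two sources differ by more than rel_tol."""
    mismatches = {}
    for day in sorted(set(recomputed) | set(reported)):
        mine = recomputed.get(day, 0)
        theirs = reported.get(day, 0)
        baseline = max(mine, theirs)
        # Relative difference against the larger value; skip days with no data at all.
        if baseline and abs(mine - theirs) / baseline > rel_tol:
            mismatches[day] = (mine, theirs)
    return mismatches


# Hypothetical raw events (day, user_id) and a hypothetical reported metric.
raw_events = [
    ("2021-08-01", "u1"), ("2021-08-01", "u2"), ("2021-08-01", "u1"),
    ("2021-08-02", "u1"), ("2021-08-02", "u3"), ("2021-08-02", "u4"),
]
reported_dau = {"2021-08-01": 2, "2021-08-02": 2}

recomputed_dau = daily_active_users(raw_events)
mismatches = find_mismatches(recomputed_dau, reported_dau)
print(mismatches)
```

A flagged day is not necessarily a bug in the table. It may just as well be a definition you did not know about, which is exactly the kind of thing worth asking your peers.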

The way you calculated something a year ago may be wrong or misleading today. Or there may be a better, faster, more precise way.

Cleaning, aggregating, and pipelining your data from raw to more usable sources is also a big chunk of a DS’s work. It takes a great deal of business understanding, and probably quite a bit of experience in a specific company, to understand how you should prepare your data sources so that they are the most usable for you on a frequent basis.

Statistical Analysis

This is probably what most people think 100% of a Data Scientist’s work should be, because this work reflects the most specific hard skills required for the job.

That means knowing statistics, data analysis techniques, and modeling, and now more and more often building ML models and doing AI research.

The reality is that most companies won’t have an immediate and major need for someone who uses advanced statistical concepts on a day-to-day basis. But they do want to hire someone who could do it when needed. Personally, I think that’s why there is such a big gap between the expectations of the freshly graduated DSes and the reality of work requirements.

The Data Scientist role is 90% preparation. Without the prep work, the results can be unreliable.

Most of my work is focused on understanding hypothesis testing and being able to design, run, and evaluate A/B tests. Yes, all the p-value questions asked so often in interviews come into play here. A deep understanding of those concepts is required.
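
As a rough illustration of the "design" part, here is a minimal sketch of a sample size estimate for a conversion-rate A/B test, using the standard normal-approximation formula for comparing two proportions. The function name, the 5% baseline, the one-percentage-point lift, and the default significance level and power are assumptions chosen for the example, not values from any real experiment.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided test of two proportions.

    min_detectable_lift is an absolute difference (0.01 means one percentage point).
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical scenario: 5% baseline conversion, detect a lift to 6%.
n_per_group = sample_size_per_group(baseline_rate=0.05, min_detectable_lift=0.01)
print(n_per_group)
```

Halving the lift you want to detect roughly quadruples the required sample, which is often where the conversation with the product team about what is realistic to test begins.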

However, I rarely have a real need to do significance testing by hand. Most companies that run more than 50–100 A/B tests a year have automated the analysis part. Honestly, it is the easiest part to automate. Running a statistical method on a dataset in R or Python is easy. Knowing how, when, and under which conditions you can run an A/B test is the hard part, and it depends much more on the business realities than on knowing advanced statistics.
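
To show how little code the "easy part" takes, here is a minimal sketch of a two-sided z-test for a difference in conversion rates, written with the Python standard library only. The function name and the counts are hypothetical. A real automated pipeline would typically add more on top (other metric types, corrections for multiple comparisons, guardrail metrics), which this sketch does not attempt.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates, using the pooled rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) vs. treatment (B), 10,000 users each.
z, p = two_proportion_z_test(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The dozen lines above are the automatable part. Whether the two groups were really comparable, whether the test ran long enough, and whether conversion is even the right metric are questions the code cannot answer.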

Occam’s razor or the law of parsimony applies well here. The easiest solution is often the best one.

The more fun and in-depth work comes into play when there is a business need for data analysis and research. This is where more advanced knowledge of statistical methods is often required.

It means answering questions like:

  1. What are the behavioral patterns of different types of our users? (Usually answered working with User Researchers)
  2. What impacted certain metrics of our product and how much?
  3. How can we understand the more long-term trends in how our product is used?
  4. How can we predict the future of certain metrics?
  5. How do we know what our users like or dislike?
  6. How can we tie behavior to metrics change?
  7. How much do external events affect our product performance?
  8. What are the opportunities that we could address to improve certain metrics?

And so on. Most of the perceived “real” DS work lies within those questions.

Ad hoc

Probably the least fun and most energy-draining part of the work is ad hoc questions for specific data points or visualizations. This usually happens more often in companies with less developed data infrastructure, where stakeholders have limited access to dashboards with a comfortable UI and pre-aggregated datasets. It can take a considerable amount of time, especially during reporting periods, if I haven’t first invested time in building the dashboards.

Peer-review

Another sign of great company culture is a consistent, respectful, and convenient peer-review process. Not only of pull requests to critical production repositories but also everyone’s contribution to conducting readable and reproducible research.

When you can ask your colleagues to review your code or analysis methodology without any fear of judgment, and they know they won’t be penalized for spending time on it, give you thoughtful and respectful feedback, and help when needed, I think you’ve made it. It’s a great place to stay.

Meetings

The most laughed at and potentially dreaded area of work, especially now with Zoom fatigue and all. Everyone jokes that they have too many meetings and that they should be canceled, and yet everyone’s calendars are constantly filled up with different types of meetings.

For me, as a Product DS, there are generally 4 types of meetings. And of course, this is intertwined with the other 2 areas. Both in analytics and even more so in knowledge — many things are solved through meetings.

Analytics team meetings

“Analytics team” is a generic name that just means the team within your function that you belong to: Product Insights, Business Performance Unit, Data Analytics, Data Science, etc.

Pretty much a team with your functional peers (a.k.a the same function as you) working in the same(ish) area. Whether you have this team or not depends mostly on the size of the company.

Depending on how closely you work with each other, you might have the following meetings (I have most of those in various cadences):

  • Standup — short and sweet (at least it should be) frequent updates about the things you are working on and plans ahead. (2–5 times a week)
  • Planning — less frequent and longer meetings where you define what your team and you will work on in the next month, quarter, year. (weekly, biweekly, monthly, quarterly)
  • Retro — look back on what went well, what could be improved, and how to improve it. (biweekly or monthly)
  • Knowledge sharing — usually sharing the results of your work and asking for feedback/collaboration. (monthly or when needed)
  • 1on1s — most often with your manager to sync on your progress and needs, or do your yearly performance review. (weekly, biweekly, twice a year during the performance development process)
  • Workshops — sharing more in-depth knowledge of some specific tool or area. Usually hands-on. (when needed)
  • Socializing — (fika in Sweden) breaks where you get to chat about whatever and get to know your colleagues better. (weekly or when needed)
  • Ad hoc — anything from the announcement of reorganizations to celebrating someone’s birthday.
  • Offsites — rare (virtual or not) get-togethers to have a longer period of reflection on how your team is doing. Longer planning sessions, longer retros, some icebreaker exercises which are usually finished by dinner and drinks in a nice restaurant, but now — you just need to close your laptop lid.

Product team meetings

The multifunctional team, with different roles being responsible for the different parts of the product lifecycle. As shown above in more detail 👆.

  • Standup — same as in your analytics team, except that DSes usually don’t need to join every standup. You simply won’t have much to update on every day, and most of your work is done separately from the team anyway.
  • Planning — what the team will build in the near future. Important for a DS to join to see what you will need to focus on as well. Also, you can inform the team of the important data availability and quality work they might need to do for the features.
  • Workshops/Brainstorms — usually to decide how the team could address a hypothesis through building new features or adjusting existing ones.
  • Retro — same as above except the need to join this depends on how embedded you are in the team’s work.
  • Feature testing — making sure the feature the team built is working right. I usually join those meetings to test data tracking and make sure it’s on point and that we will be able to draw conclusions from what we see.
  • Feature release — pushing that button! Also starting an A/B test.
  • Analysis sharing — Sharing the results of an A/B test, or a deep-dive analysis. Very important to ensure that the team understands the metrics, and can draw conclusions.
  • Ad hoc and 1on1s — randomly or infrequently appearing meetings.

Analytics function meetings

If your company is big enough to have multiple “analytics” teams in different product areas, then you get to experience some meetings with a bigger audience.

  • All-hands
  • Knowledge sharing — same as above, just more people. You also get to learn what people you don’t interact with on a regular basis actually do.
  • Workshops — also same as before, but with more people.
  • Sync on the work in progress — sort of standup, but longer and less frequent. You get to know what people are working on right now and not when they finished it (like in the Knowledge sharing section).

Company-wide meetings

  • All-hands/Announcements/CEO-calls — Depending on the size of your company this could multiply into many meetings of various frequencies. Ask your CEO some questions (for example why vanilla coke was removed from the fridges of one of the offices; yup, that actually happened) and see announcements a few days prior to an official press release.
  • Company offsites/Conferences — In the times of working from offices, this is where you get to fly to another city/office, get together with all the colleagues, listen to announcements from the audience in some theatre hall, go to workshops, learn some new things, maybe present something, and potentially party with all of the consequences of that. Now it’s a multiple-day-long virtual meeting. But hey, at least you don’t have to share a hotel room with a random colleague.

Summary, I guess.

Knowledge work in a complex system with many pieces of a puzzle cannot be reduced to a simple one-liner.

HR doesn’t just hire people. Artists don’t only draw. Developers don’t only write code. Data Scientists don’t only do statistics and ML.

There is a lot of work and learning done in the background. Shadow work. Busy work. Communication, collaboration, explanation, preservation, etc.

Some of this work you don’t even know you’ll have to do when you start at a new place. Even if your role is the same as before. And a lot of it depends on you. Whether you fall into the work that’s given to you or carve your own path to do the work you love.

The whole chart for reference.

Here’s a video I made on the same topic if you prefer listening to reading.

Thanks!


Anastasia

Data Scientist who believes everyone should know how to use and analyse data. Working in fintech, living in Stockholm.