Unfolding the universe of possibilities..

Every load time is a step closer to discovery.

5 Questions Every Data Scientist Should Hardcode into Their Brain

Am I solving the right problem?

Photo by Tingey Injury Law Firm on Unsplash

Despite all the math and programming, data science is more than just analyzing data and building models. When you boil it down, the key objective of data science is to solve problems.

The trouble, however, is that at the outset of most data science projects, we rarely have a well-defined problem. In these situations, the role of the data scientist isn’t to have all the answers but to ask the right questions.

In this article, I’ll break down 5 questions every data scientist should hardcode into their brain to make problem discovery second nature.

Hammer time

When I began my data science journey in grad school, I had a naive view of the discipline. Namely, I was hyper-focused on learning tools and technologies (e.g. LSTM, SHAP, VAE, SOM, SQL, etc.)

While a technical foundation is necessary to be a successful data scientist, focusing too much on tools creates the “Hammer Problem” (i.e. when you have a really nice hammer, everything looks like a nail).

This often leads to projects which are intellectually stimulating yet practically useless.

Problems > Tech

My perspective didn’t fully mature until I graduated and joined the data science team at a large enterprise, where I was able to learn from those years (if not decades) ahead of me.

The key lesson was the importance of focusing on problems rather than technologies. What this means is gaining a (sufficiently) deep understanding of the business problem before writing a single line of code.

Since, as data scientists, we typically don’t solve our own problems, we gain this understanding through conversations with clients and stakeholders. Getting this right is important because, if you don’t, you can end up spending a lot of time (and money) solving the wrong problem. This is where “problem discovery” questions come in.

5 Problem Discovery Questions

6 months ago, I left my corporate data science job to become an independent AI consultant (to fund my entrepreneurial ventures). Since then, I’ve developed an obsession with cracking these early-stage “discovery” conversations.

My approach to getting better at this has been twofold. First, interview seasoned data freelancers about their best practices (I talked to 10). Second, do as many discovery calls as possible (I did about 25).

The questions listed here are the culmination of all my previously mentioned experiences. While it’s by no means a complete list, these are questions I find myself asking over and over again.

1) What problem are you trying to solve?

While (in theory) this should be the only question on this list, that (unfortunately) is not how things work out in practice. Many times, clients aren’t clear on the problem they need to solve (if they were, they probably wouldn’t be talking to a consultant). And even if they are, I usually need to catch up to understand the business context better.

This question helps in both cases because (ideally) the client’s answer prompts follow-up questions, allowing me to dig deeper into their world. For instance, they might say, “We tried creating a custom chatbot with OpenAI, but it didn’t provide good results.”

To which I might ask, “What was the chatbot used for?” or “What makes you say the results weren’t good?”.

2) Why…?

A natural follow-up question to “what” is “why.” This is one of the most powerful questions you can ask a client. However, it can also be one of the most difficult to ask.

Why” questions have a tendency to make people defensive, which is why having multiple ways of phrasing this question can be helpful. Here are a few examples:

Why is this important to your business (your team)?Why do you want to solve this now?What does solving this mean for your business?How does this fit into the larger goals of the business?Why do you want to use AI to solve this problem?

This question (or any of its variants) is an extremely effective way to get context from the client, which should (again) spark follow-up questions.

To continue the example from before, the client might say, “We have several support tickets that we want to categorize into 3 levels of prioritization automatically, and we thought an AI chatbot was a good way to solve that problem.” This gives much more context to the “We tried making a custom chatbot” response from before.

What are we doing?” and “Why are we doing it? are the two most fundamental questions in business. So, getting good at asking “what” and “why” can take you (very) far.

3) What’s your dream outcome?

I like this question because it (effectively) combines the “what” and “why” questions. This allows clients to speak to their vision for the project in a way that may not come through when asked directly.

To learn something new, I often need to take a few passes before it finally clicks. Similarly, I find that to really get to the root of a client’s problem, I need to ask “what” and “why a few times in different ways throughout the conversation.

This is reminiscent of Toyota’s “5 Why’s” approach to getting to the root cause of a problem. While this was developed in a manufacturing context, this is something that readily applies to problem-solving in data science.

Two related questions here are: What does success look like? and How would we measure it? These are a bit more pragmatic than a “dream outcome” but are helpful for transitioning from asking “what and why” to “how?

4) What have you tried so far?

This question starts on the path toward a solution. It does this by bringing out more of the project’s technical details.

For instance, this (typically) gives me a good idea of who will be writing the code. If they’ve already built some basic POCs, then the client (and their team) will probably be doing most of the heavy lifting. If they are starting from scratch, it might be me or sub-contractors from my network.

In this 2nd scenario, where the client has built nothing so far, one can ask a few other questions.

What is the existing solution?How do you solve this problem now?What have others done to solve a similar problem?

5) Why me?

I got this question from master negotiator Chris Voss. Who frames it as an effective way to reveal people’s motivations.

Often, this sparks additional context about what led them to you and how they see you fitting into the project, which is helpful in defining the next steps.

Sometimes, however, people don’t have good answers to this question, which may indicate that they don’t actually want to work with you and are holding back their true intentions (e.g. they want free consulting or a competing bid).

A Key Lesson

A key lesson for me these past months was to learn these questions (i.e. hardcode them into my brain) but then forget about them.

The goal is to get to where these questions naturally form in your mind in the flow of conversation. While this requires a fair share of awkward moments, it’s something that can only develop through practice (don’t worry, I’m still practicing too).

What has helped me is going into these conversations with the goal of learning. This means being curious, asking questions, and listening (much) more than talking.

Conclusion

While technical skills are required to do data science, without a clear understanding of the problem, these skills cannot provide much value. This is why developing the communication skills necessary for effectively identifying and understanding a client’s business problem is essential.

Here, I shared 5 fundamental questions for problem discovery. While this isn’t a complete list, I hope it is helpful to data folks taking on more client-facing roles.

If you have anything to add to the list, please drop those a comment 😁.

5 Reasons Why Every Data Scientist Should Consider Freelancing

Resources

Connect: My website | Book a call | Ask me anything

Socials: YouTube 🎥 | LinkedInTwitter

Support: Buy me a coffee ☕️

The Data Entrepreneurs

5 Questions Every Data Scientist Should Hardcode into Their Brain was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Leave a Comment