5 Steps to Managing Remote Data Science Teams

Introduction

As enterprises begin to reopen post-pandemic, most are recognizing the reality of managing a remote or distributed workforce. Across the globe, businesses are reevaluating how they operate in this emerging new normal. Every business function must assess how it will: (a) maintain operational continuity, (b) promote business performance, and (c) nurture creativity and innovation. This is even more relevant to enterprise data and analytics leaders, whose teams sit at the intersection of these three ingredients for success.

We begin by reflecting on what it means to be a data scientist or machine learning engineer. A Google search for “data scientist” will point you to a number of resources that speak to the unique blend of competencies these professionals are expected to possess. It should come as no surprise, then, that managing a virtualized data science team must account for nurturing and developing these competencies. This article showcases five essential elements of successfully managing virtualized data science and machine learning teams.

5 Habits of Successful Remote Data Science Teams

Habit #1: Cultivate Domain Expertise

While visionary leadership at the helm is important, it is equally desirable to have capable, competent domain experts leading the grassroots data transformation efforts across the enterprise. A dependency on a single leader is simply not sustainable or scalable, and increasingly becomes a challenge in the context of a virtualized workforce. As a first step in this direction, leaders must align on their strategic objectives. My colleague Kathleen Bot has done extensive work in this space helping enterprises secure organizational alignment.

An essential component of sustaining that alignment is embedding subject matter experts (SMEs) within business functions that are strategically prioritized for transformation. The SMEs work closely with their respective business partners while leveraging the scale provided by their primary home: the data/analytics center of excellence (DA CoE). In a virtualized setup, such a federated CoE is all the more important. The embedded SMEs develop domain expertise and partnerships within their respective business functions, thus cultivating a critical skill set for data science success.

Habit #2: Establish a Technology Strategy

Data science teams are often at the forefront of evaluating, piloting, or deploying some of the latest technologies in the space. Within a virtualized team setup, it is far too easy for this enthusiasm to result in a scattered toolset that does not appear to be driven by purpose. Aside from perceptions, teams run the risk of incurring maintenance and support costs related to every new technology that is onboarded. When costs “bubble to the top”, firms have been known to tap into consultancies to help them rationalize their technology strategy.

Now more than ever, virtualized data science teams should be guided by a purposeful platform strategy. Achieving this is easier said than done, since technology is advancing at a rapid pace. Hadoop, Spark, and cloud services are perfect examples of how technology evolution accelerated market consolidation for some while opening doors for others. A technology strategy should be guided by the principles of flexibility, scalability, portability, and support for the end-to-end data science pipeline. This will serve virtualized teams well in the long term, while also managing the maintenance and support costs of the related data analytics services. In the absence of a technology strategy, virtualized teams will not be able to clearly define their technology-related professional development goals, leading to employee dissatisfaction.

Habit #3: Promote a Culture of Operationalization

Too often we see customers approach data science as a portfolio of science experiments. Without an understanding of what it will take to convert a proof of concept to an operational solution, enterprises will struggle to justify their investments in data and analytics.

Begin with defining operationalization success criteria for each use case and adopt an agile approach (e.g. SAFe) to manage how use cases move through your innovation funnel. At each stage gate, ensure you have a clear line of sight into what it will take from that point forward to operationalize the use case. As a catalyst to this agile framework, pair up your data science resources working on each use case. Pairing up resources achieves several outcomes aside from improving the use case operationalization success rate.

First, it serves as a novel method to train and coach junior resources. Second, it encourages the sharing of best practices and lessons learned. Third, it can formalize the best practices into solution delivery accelerators and reusable code kernels that will reduce future project delivery costs. Fourth, it introduces every team member to the full lifecycle of a data science project: from data ingestion and preparation, to data exploration and feature engineering, to model training and deployment, to operationalization and ML DevOps. We often find that the use case operationalization success rate is directly tied to a team’s ability to tackle the end-to-end pipeline. Developing such competencies takes time and requires at least a few success stories that cover the gamut of data science pipelines.
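The stage-gate idea described in this habit can be made concrete with a small tracking structure. The following is only a minimal sketch, not a prescribed tool: the lifecycle stage names come from the text above, while the `UseCase` class, the `try_advance` method, the example use case, and its success criteria are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Lifecycle stages named in the article; the gate logic below is hypothetical.
STAGES = [
    "data ingestion and preparation",
    "data exploration and feature engineering",
    "model training and deployment",
    "operationalization and ML DevOps",
]


@dataclass
class UseCase:
    name: str
    stage_index: int = 0
    # Maps a stage name to its success criteria and whether each one is met.
    criteria: Dict[str, Dict[str, bool]] = field(default_factory=dict)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def unmet_criteria(self) -> List[str]:
        """Return the criteria still blocking the current stage gate."""
        current = self.criteria.get(self.stage, {})
        return [name for name, met in current.items() if not met]

    def try_advance(self) -> bool:
        """Move to the next stage only if every criterion at this gate is met."""
        at_last_stage = self.stage_index == len(STAGES) - 1
        if self.unmet_criteria() or at_last_stage:
            return False
        self.stage_index += 1
        return True


# Hypothetical use case with two illustrative criteria at the first gate.
forecast = UseCase(
    name="demand forecast",
    criteria={
        "data ingestion and preparation": {
            "data owner identified": True,
            "refresh schedule agreed": False,
        },
    },
)

blocked = forecast.try_advance()   # gate stays closed: one criterion is unmet
forecast.criteria[forecast.stage]["refresh schedule agreed"] = True
advanced = forecast.try_advance()  # gate opens once all criteria are met
print(blocked, advanced, forecast.stage)
```

Keeping the criteria explicit per stage is one way to maintain the "clear line of sight" at each gate: anyone on a distributed team can see what is still blocking a use case without waiting for a status meeting.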

Habit #4: Market the Team’s Contributions

A virtualized team elevates the need for marketing its contributions and successes. Technical brown bag sessions are one way to get your team to learn more about what their peers are engaged with. They also provide an excellent vehicle to promote ideation as well as foster camaraderie in the age of telecommuting. Ideally, these sessions should be conducted in an informal, casual style, with the goal of providing an alternative to the water-cooler or hallway conversations colleagues might otherwise engage in. They also help junior team members get comfortable communicating and presenting their ideas in a video conference environment.

Beyond internally focused sessions, we also recommend external partnership-focused communications. These sessions typically bring the broader enterprise together and throw a spotlight on specific data-driven analytics products or services developed by the team. Besides serving as an excellent marketing vehicle, they also help position senior team members for the critical domain lead functions discussed in Habit #1. Finally, there is nothing better than getting business partners to co-present with your team at such events, thereby securing an unbiased advocate of your team’s capabilities. Here is a relevant example of the latter that showcases recent work of Inspired Intellect’s data science team working closely with a technology partner.

Habit #5: Sponsor Innovative Research

Last but not least, a virtualized workforce across the enterprise (and not just within the data science/machine learning function) may mean that not every advanced analytic use case or opportunity is delivered directly to you. Occasionally, you may have to seek out business sponsors by demonstrating relevant proofs of concept. While we always emphasize that research initiatives should be motivated by delivering business value, when appropriately designed they can also serve as an educational medium for the team.

The team at Inspired Intellect engaged in a similar exercise recently. They recognized the opportunity to employ public domain COVID-19 infection and mortality data to help supply chain dependent customers get a better understanding of how the pandemic was influencing their verticals. That effort not only showcased the team’s talent for sourcing and extracting value from external data assets, but also brought to the forefront the importance of alternative data sources in a world where internal assets were yielding anomalous predictions. In other times, customers’ internal historical data assets would have sufficed to support business decisions. However, that is far from true in the new normal. If the team had waited for customers to reach out to them, they would have been behind the curve in understanding what it takes to deliver such capabilities. Are you thinking out of the box about how your team can catapult your enterprise ahead of your competition, or are you waiting for your business partners to reach out to you? Worse, are you still waiting for your competitors to lead the way?

Conclusion

The post-pandemic environment presents unique challenges for businesses working remotely, particularly data science and machine learning teams. However, the new normal also brings several positive aspects, specifically in the context of developing the unique talent of data science and machine learning teams. Last but not least, the post-pandemic world may in fact accelerate digital and data transformations within an enterprise. Ensure that your team has the domain expertise, people, processes, and technologies in place to help your enterprise achieve scale in its transformation journey.

About the Author

Brian Monteiro serves as Chief Scientist for Inspired Intellect, a data and analytics technology consulting practice headquartered in Frisco (DFW metroplex), Texas. Brian has over 25 years of professional experience in leading, delivering, developing, and innovating with data and advanced analytics. Follow Brian on LinkedIn at https://www.linkedin.com/in/monteirobrian/.