Agile Data Engine - Blog

Quality management in DataOps – balancing between time to market, performance, and quality

Written by Tevje Olin | Apr 26, 2024 7:31:33 AM

Lean development and Total Quality Management are two key principles that are essential to achieving success in DataOps. But how do they work together? And how do they apply to data development? 

Getting rid of excess waste in the development process is the core idea of Lean. However, there is a delicate balance between the efficiency and the quality of development. Since well-defined, data-based metrics are the foundation of informed decision-making, we should apply the same standards to data development as well.

In this blog post, we will explore the intersection of Lean and Total Quality Management in DataOps and how they can help organizations streamline their development processes and ensure high-quality data products.

About this DataOps blog post series

To get an overview of the DataOps methodology and its key principles, tools, and best practices, we set out to create a series of blog posts focusing on different aspects of DataOps. 

We'll publish the blog series over the coming weeks. If you don't want to wait, you can also download the whole story as a whitepaper.

Combining quality management and lean

Total Quality Management is the methodology most often cited for organizing quality management in DevOps or DataOps. Its idea is to involve everyone in quality improvement and to make that improvement a continuous part of the product's life cycle.

We will not go through the details of Total Quality Management in this post. Still, we want to point out a couple of key principles that should be applied to the quality management of data development and operations, as well as to the quality management of the actual data product:

  • As in Lean, products and processes should be continuously improved and waste eliminated from them
  • The optimization should be done on the whole process from design to delivery rather than just individual parts of the process
  • The decisions on the optimization should be based on data and metrics (how can you even expect to optimize something without understanding it and without proper facts?)

These principles overlap with Lean principles, and from the Lean point of view, we should focus on value by prioritizing the backlog.

We claim that quality management and Lean go tightly hand in hand. In DataOps, we can analyze quality management and Lean from two perspectives:

  • What is the quality and how lean is the development process of the data products
  • How lean are the data pipelines, and what is the data quality of the product

Streamlining the development

From the development perspective, the idea is to continuously measure and analyze the process and make improvements and adjustments accordingly. It would be rather ironic if a data development process were not itself analyzed and optimized based on data and metrics. However, the optimization should not focus only on the throughput of the team. As noted above, quality has to be weighed as well.

It is not the easiest task to optimize the throughput and quality in a scalable cloud environment where the resources scale and the costs follow. As the engineering work also has a price tag, it is a delicate balance between time to market, performance, and scalability of the data pipelines and products – not forgetting about the Ops part with the maintenance costs. In the end, what we would like to see is a productive DataOps team and a well-working data platform delivering value with predictable run and development costs. The more standardized development patterns you have in place, the more predictable development and running costs you will have on your platform!

One way to streamline or speed up development is to automate as many parts of the development process as possible. Pattern-based development methodologies and architecture usually make automation more effective. However, you should always pay attention to the effort-gain ratio before you start automating a task; if you save one day of work by spending ten days on the automation, is it worth it?
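The effort-gain question above can be turned into simple arithmetic. The following is a minimal sketch, not a definitive model: the function names `break_even_runs` and `net_gain_days` are hypothetical, and it assumes that the automation saves the same amount of work every time the task is run and that maintenance of the automation costs nothing.

```python
import math


def break_even_runs(automation_days: float, saved_days_per_run: float) -> int:
    """Return how many runs of the task are needed before the automation pays off."""
    if saved_days_per_run <= 0:
        raise ValueError("automation must save some time on every run")
    return math.ceil(automation_days / saved_days_per_run)


def net_gain_days(automation_days: float, saved_days_per_run: float, expected_runs: int) -> float:
    """Return the days saved in total (negative when the automation costs more than it saves)."""
    return saved_days_per_run * expected_runs - automation_days


# The example from the text: ten days of automation work saves one day per run.
runs_needed = break_even_runs(automation_days=10, saved_days_per_run=1)
gain_after_four_runs = net_gain_days(automation_days=10, saved_days_per_run=1, expected_runs=4)
```

Under these assumptions, the ten-day automation is worth it only if the task is expected to be repeated at least ten times; after four runs the team is still six days behind.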

Balancing with the data quality

From the pipeline or product perspective, the focus should be on architecturally consistent, robust, and trustworthy data pipelines and products that perform within the requirements. And then there is data quality.

Data quality can, of course, be measured and monitored as part of DataOps, but data quality issues rarely have anything to do with data development. Mostly, the quality issues of the business applications become visible in reports or analytics built on top of the data platform.

Some data quality issues can be addressed on the data platform, but you should always keep in mind that the correct place to fix data quality issues is where the data is produced. The further downstream you go, the higher the cost and the smaller the impact of the fix. From the operations perspective, these and all other types of fixes should be done in a way that repairs the whole issue, not just the symptom.
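To illustrate what measuring data quality on the platform could look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the function `quality_metrics`, the sample rows, and the column names `order_id`, `customer_id`, and `amount` are hypothetical and are not taken from any specific tool.

```python
from collections import Counter


def quality_metrics(rows, key, required):
    """Compute basic quality metrics for a batch of rows (a list of dicts)."""
    total = len(rows)
    # Count how many extra rows share a key value that should be unique.
    key_counts = Counter(row.get(key) for row in rows)
    duplicate_keys = sum(count - 1 for count in key_counts.values() if count > 1)
    # Share of rows in which each required column is missing or None.
    null_rates = {
        col: (sum(1 for row in rows if row.get(col) is None) / total if total else 0.0)
        for col in required
    }
    return {"row_count": total, "duplicate_keys": duplicate_keys, "null_rates": null_rates}


# Hypothetical sample batch with one duplicated key and two missing values.
rows = [
    {"order_id": 1, "customer_id": "A", "amount": 10.0},
    {"order_id": 2, "customer_id": None, "amount": 5.5},
    {"order_id": 2, "customer_id": "B", "amount": 5.5},
    {"order_id": 3, "customer_id": "C", "amount": None},
]
metrics = quality_metrics(rows, key="order_id", required=["customer_id", "amount"])
```

Metrics like these make an issue visible and traceable, but in line with the point above, the findings are best reported back to the source system rather than patched downstream.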

You can read more about data quality management in our whitepaper: Data Quality Monitoring with Agile Data Engine.

Summary

To achieve a harmonious balance between quality management and Lean principles, organizations should adopt a holistic approach. By integrating Total Quality Management and Lean methodologies into DataOps practices, we can foster a culture of continuous improvement and waste elimination throughout the product's life cycle. By embracing data-driven decision-making and prioritizing value in the backlog, we can streamline development processes and enhance the efficiency, effectiveness, and predictability of data products.