Lean development and Total Quality Management are two methodologies essential to success in DataOps. But how do they work together? And how do they apply to data development?
The core idea of Lean is to eliminate excess waste from the development process. However, there is a delicate balance between efficiency and quality of development. Well-defined, data-based metrics are the foundation of informed decision-making, and we should apply the same standard to data development itself.
In this blog post, we will explore the intersection of Lean and Total Quality Management in DataOps and how they can help organizations streamline their development processes and ensure high-quality data products.
To give an overview of the DataOps methodology and its key principles, tools, and best practices, we have created a series of blog posts, each focusing on a different aspect of DataOps.
We'll publish the blog series over the coming weeks. If you don't want to wait, you can also download the whole story as a whitepaper.
Combining quality management and Lean
Total Quality Management is the methodology most often referred to for arranging quality management in DevOps or DataOps. The idea of Total Quality Management is to involve everyone in the quality improvement process and to make it a continuous part of the product's life cycle.
We will not go through the details of Total Quality Management in this post. Still, we want to point out a couple of its key principles that apply to the quality management of data development and operations as well as to the quality management of the actual data products: continuous improvement, involving everyone in the process, and basing decisions on data.
These principles overlap with Lean principles, and from the Lean point of view, we should focus on value by prioritizing the backlog.
We claim that quality management and Lean go hand in hand. In DataOps, we can analyze quality management and Lean from two perspectives:
From the development perspective, the idea is to continuously measure and analyze the process and make improvements and adjustments accordingly. When we talk about the data development process, it is rather ironic if the development process itself is not analyzed and optimized based on data and metrics. However, optimization should not weigh only the team's throughput; as noted, quality matters just as much at a higher level.
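To make this concrete, here is a minimal sketch of computing such process metrics from backlog data. The work items, field names, and dates are hypothetical; in practice, you would pull these from your backlog or CI/CD tooling.

```python
from datetime import datetime
from statistics import median

# Hypothetical work items exported from a backlog tool; the field
# names (started, deployed) are illustrative assumptions.
work_items = [
    {"id": "DE-101", "started": "2024-03-01", "deployed": "2024-03-05"},
    {"id": "DE-102", "started": "2024-03-02", "deployed": "2024-03-12"},
    {"id": "DE-103", "started": "2024-03-04", "deployed": "2024-03-08"},
]

def lead_time_days(item: dict) -> int:
    """Days from starting the work until the change is deployed."""
    started = datetime.fromisoformat(item["started"])
    deployed = datetime.fromisoformat(item["deployed"])
    return (deployed - started).days

lead_times = [lead_time_days(item) for item in work_items]
print(f"median lead time: {median(lead_times)} days")
print(f"throughput: {len(work_items)} items in the period")
```

Even simple metrics like these, tracked over time, give the team data to back its process improvements instead of gut feeling.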
Optimizing both throughput and quality is no easy task in a scalable cloud environment where resources scale and costs follow. As the engineering work also has a price tag, there is a delicate balance between time to market and the performance and scalability of the data pipelines and products – not forgetting the Ops part and its maintenance costs. In the end, what we would like to see is a productive DataOps team and a well-working data platform delivering value with predictable run and development costs. The more standardized development patterns you have in place, the more predictable the development and running costs of your platform will be!
One way to streamline or speed up development is to automate as many parts of the development process as possible. Pattern-based development methodologies and architecture usually make automation more effective. However, you should always pay attention to the effort-gain ratio before automating a task: if you spend ten days automating a task to save one day of work, is it worth it? The answer depends on how often the task recurs.
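As a minimal back-of-the-envelope sketch of that break-even reasoning (all figures are assumptions, not recommendations):

```python
# Automation pays off once the cumulative time saved exceeds the
# one-off effort of building (and maintaining) the automation.

def break_even_runs(automation_days: float, saved_days_per_run: float) -> float:
    """Number of task repetitions needed before automation pays for itself."""
    return automation_days / saved_days_per_run

# Ten days of automation work, one day saved per manual run:
runs = break_even_runs(automation_days=10, saved_days_per_run=1)
print(f"pays off after {runs:.0f} runs")  # pays off after 10 runs
```

With these assumed numbers, automation is only worth it if the task will be repeated at least ten times over the lifetime of the platform.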
From the pipeline or product perspective, the focus should be on architecturally consistent, robust, and trustworthy data pipelines and products that perform within the requirements. And then there is data quality.
Data quality can, of course, be measured and monitored as part of DataOps, but data quality issues rarely have anything to do with data development. Mostly, quality issues in the business applications become visible in reports or analytics built on top of the data platform.
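As an illustration of the measuring-and-monitoring side, here is a minimal sketch of an automated completeness check. The table, column names, and threshold are hypothetical, and a real setup would typically use a dedicated data quality tool and alerting rather than a print statement:

```python
import pandas as pd

# Hypothetical extract of a customer table loaded from the data platform.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "email": ["a@example.com", None, "c@example.com", "d@example.com"],
    "country": ["FI", "SE", None, "FI"],
})

# Simple completeness check: flag columns whose null rate exceeds a threshold.
MAX_NULL_RATE = 0.10  # assumed tolerance per column

null_rates = customers.isna().mean()
violations = null_rates[null_rates > MAX_NULL_RATE]

if not violations.empty:
    # In a real pipeline this would raise an alert, not just print.
    print("data quality violations:")
    print(violations)
```

A check like this tells you that a problem exists, but not where to fix it – which brings us to the next point.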
Some data quality issues can be addressed on the data platform, but you should always keep in mind that the correct place to fix a data quality issue is where the data is produced: the further downstream you go, the higher the cost and the smaller the impact of the fix. From the operations perspective, these and all other types of fixes should repair the root cause of the issue, not just the symptom.
You can read more about data quality management in our whitepaper: Data Quality Monitoring with Agile Data Engine.
To achieve a harmonious balance between quality management and Lean principles, organizations should adopt a holistic approach. By integrating Total Quality Management and Lean methodologies into DataOps practices, we can foster a culture of continuous improvement and waste elimination throughout the product's life cycle. By embracing data-driven decision-making and prioritizing value in the backlog, we can streamline development processes and enhance the efficiency, effectiveness, and predictability of data products.