Practices
The three principles of DataOps Management come together in two main practices:
1. Data as a Product thinking
2. DataOps
Practice 1:
Data products
In many companies, the development and governance culture is still, deep down, based on project thinking. In project thinking, the success of work is measured by whether the specified outputs were delivered, not by how well the solution fulfills business needs now and in the future.
It is critical to change the mindset from completing a development backlog item to building something that can also solve future business problems without a constant need to rebuild everything. Rather than focusing on the output, product thinking focuses on the outcome.
The other mindset change needed in the data industry is to shift the focus from data pipelines to data products.
Thinking about and managing data as a product is the answer to both of these mindset changes.
Treat data users as valued customers
Data as a product is about thinking and doing things in a customer-oriented way. Consumers of the data should be treated as customers, whether they are external or internal. This is a simple but very powerful mindset change.
Valuable and adaptable
A data product must be valuable to its users, both on its own and in combination with other data products. It must solve a set of specific business needs and problems, and it must also be able to adapt to future business needs. Although data products do not necessarily need to be monetized, it is useful to consider whether the customer would be ready to pay for the product.
Trustworthy and secure
Lack of trust from users is one of the top reasons why data platforms fail. To be trustworthy, a data product needs to be accurate, reliable, and unbiased. Besides the quality of the data content, the data product should have clear documentation and processes for maintaining and updating it over time. The necessary access control and authorization should be in place, and the product should adhere to the required privacy regulations, such as GDPR. As part of the documentation, it is important to codify customer expectations and the elements of trust into a service level agreement. This creates a continuous mechanism for maintaining customer trust.
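As an illustration of what codifying expectations into a service level agreement could look like, here is a minimal Python sketch. The class name, field names, and thresholds are hypothetical and chosen only for this example; a real agreement would cover whatever the customers of the data product actually care about.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataProductSLA:
    """Hypothetical SLA terms; fields and thresholds are illustrative only."""
    max_staleness: timedelta   # how old the most recent load is allowed to be
    min_completeness: float    # required share of non-null rows, between 0 and 1


def check_sla(sla, last_loaded_at, non_null_rows, total_rows, now=None):
    """Return the names of the violated SLA terms (empty list if compliant)."""
    now = now or datetime.now(timezone.utc)
    violations = []

    # Freshness: the latest load must be recent enough.
    if now - last_loaded_at > sla.max_staleness:
        violations.append("freshness")

    # Completeness: an empty table is treated as 0% complete.
    completeness = non_null_rows / total_rows if total_rows else 0.0
    if completeness < sla.min_completeness:
        violations.append("completeness")

    return violations


if __name__ == "__main__":
    sla = DataProductSLA(max_staleness=timedelta(hours=24), min_completeness=0.99)
    loaded = datetime.now(timezone.utc) - timedelta(hours=30)
    print(check_sla(sla, loaded, non_null_rows=950, total_rows=1000))
```

Running a check like this on every load, and publishing the result to the users, is one way to make the agreement something that is verified continuously rather than only written down.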
Discoverable and understandable
If users do not even know that a data product exists, or cannot find it easily, it cannot create any value. It must also be understandable, which requires good design and documentation. The documentation must describe the data content and the semantics of the data, as well as the technical syntax. Discoverability and understandability are major factors in the user experience of a data product.
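One way to picture this is a small machine-readable descriptor that carries both the technical syntax and the business semantics of each column, plus a simple keyword search over a catalog of such descriptors. The following Python sketch is only an assumption about how this might be modeled; all class names, fields, and sample values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    dtype: str     # technical syntax, e.g. the database type
    meaning: str   # business semantics in plain language


@dataclass
class DataProductDescriptor:
    """Hypothetical catalog entry describing one data product."""
    name: str
    description: str
    owner: str
    tags: list[str] = field(default_factory=list)
    columns: list[Column] = field(default_factory=list)


def search(catalog, term):
    """Return names of products whose documentation mentions the term."""
    term = term.lower()
    hits = []
    for product in catalog:
        # Search names, descriptions, tags, and column documentation alike.
        haystack = [product.name, product.description, *product.tags]
        for column in product.columns:
            haystack.extend([column.name, column.meaning])
        if any(term in text.lower() for text in haystack):
            hits.append(product.name)
    return hits


if __name__ == "__main__":
    catalog = [
        DataProductDescriptor(
            name="customer_orders",
            description="Confirmed orders per customer",
            owner="sales-data-team",
            tags=["sales"],
            columns=[Column("order_total", "DECIMAL(12,2)",
                            "Order value in euros including VAT")],
        ),
    ]
    print(search(catalog, "vat"))
```

Because the column meanings are part of the searchable text, a user looking for a business term can find the product even without knowing its technical name.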
Continuously managed and improved
Data products rarely stay unchanged over their lifecycle. New data arrives all the time, the source data changes, and business needs evolve. To ensure quality and a good user experience while everything changes, it is critical to measure these aspects continuously and to enable continuous improvement of data products.
Practice 2:
DataOps
DataOps is at the heart of Agile Data Engine and our philosophy. DataOps is a hybrid approach that combines ways of working and technologies. It is still an emerging discipline and also somewhat hyped.
Several definitions are available, and they differ somewhat depending on the perspective and motive of the defining party. Here are a few good ones:
Gartner:
DataOps is a collaborative data management practice focused on improving the communication, integration, observability and automation of data flows between data managers and data consumers across an organization.
Forbes:
DataOps is a methodology that enhances the quality and reduces the cycle time of data analytics through better communication, collaboration, and automation of processes. It emphasizes continuous integration and delivery, automating data pipeline processes, and implementing policies and processes to ensure high-quality data.
Julian Ereth, DataOps - towards a working definition:
[DataOps is a] Set of practices, processes and technologies that combines an integrated and process-oriented perspective on data with automation and methods from agile software engineering to improve quality, speed and collaboration and promote a culture of continuous improvement.