Agile Data Engine - Blog

Find Simplicity: Navigating the Maze of Schema Changes in Cloud Data Warehouse

Written by Christoph Papenfuss | Apr 26, 2024 7:33:38 AM

The devil is in the details 

What is the famous saying? “The devil is in the details.” There is a lot of truth in this idiom. How often have we started something seemingly simple only to find out that it is complex. I vividly remember spontaneously deciding to fix the brakes of my son’s bicycle. Despite never having worked with those types of brakes before, I was confident in my skills. What could go wrong? Having watched a few videos, I finally got started. 20 minutes should do. Half-way through the process I realized that my toolset wasn’t complete. This completely derailed my flow, and I ended spending two hours trying make use of the tools that I had on hand. Little forethought, big agony.

Who is afraid of schema changes?  

Schema changes in a complex cloud data warehouse also seem simple on the surface. How difficult can it be to add/ delete/ modify a column? Based on our experience, these type of things can end up creating a lot of hassle for any data team regardless of its experience level. You have to be especially careful, if you are operating in an environment with little automation. It is wise to go through a short checklist before you get started: 

 

  • Ensure that your team is intimately familiar with the data model. Table design, load patterns & dependencies need to be clearly understood.
  • Carefully analyze & understand your data loads. A few simple mistakes in your change strategy can lead to potentially costly reloading of data.
  • Set time aside to develop the required SQL statements – even simple schema changes can easily require long & complicated code. You don’t want to be rushed & interrupted.
  • Communicate & coordinate with the other data engineers. You do not want to cross paths and interfere with the development work of your colleagues. A data model is typically shared across a plethora of personas & roles.
  • Find the right time to deploy the updates to the different environments. Some changes might even require downtime in some cases. You do not want to interrupt the ongoing work.
  • Leverage tight version control. You absolutely have to understand the various code iterations you are going through. This becomes critical in case things go wrong.
  • Last but not least, don’t forget to document everything that you are doing. Let’s face it – documentation is hardly ever fun. But it’s essential to keep the investment in your cloud data warehouse safe & secure.

 

A better way to make schema changes

Agile Data Engine aims to dramatically simplify and streamline standard maintenance tasks such as schema changes. Take a look at the following video to get a glimpse of what our platform can do for you and your team.

 

 

Benefits of Agile Data Engine in schema changes

Schema changes don’t have to be scary. The Agile Data Engine platform provides guardrails around the change process. It allows your team to focus on the logical level of the proposed changes rather than having to worry about the technical details. Our credo is: “Code less & think more”. In addition, you are able to benefit from the continues integration & deployment features (CI/ CD) which in effect allow you to deploy any kind of change without disrupting the work of your colleagues. Documentation is automatically created, and version are safely stored. All this offers you tremendous benefits: 

  • Significant reduction of errors
  • Improved communication & collaboration
  • Significant time savings
  • Quick results for the business
  • Less downtime
  • Hassle-free versioning
  • More time to focus on the important things

 

Bonus Benefits when doing cloud data warehouse schema changes

There is one bonus feature that we don’t speak about in the video: the SQL code can be generated across multiple different cloud data warehouse providers. You can easily try the code in different environments such as Snowflake, Databricks or Big Query amongst others. Imagine you had to do this manually – the differences between SQL dialects are subtle and getting the code right could take a lot of debugging. And let’s not even speak about keeping all of this in sync…..

 

Those pesky details…

Schema & data model changes do seem oh so simple on the surface but they are tricky. Don’t be fooled by the nitty gritty details that are often hiding in the shade of the more glamorous tasks.

 

And what about the brakes of my son’s bike? Well, it did not end well and I had to pay a handsome amount to a local repair shop. Lesson learned.

Please get in touch with our passionate Agile Data Engine team, if you want to learn more.