
How to design a dbt model from scratch

Taylor Brownlow, Towards Data Science

When I was researching the Ultimate Guide to dbt, I was shocked by the lack of material around actually building models from scratch. I don’t mean the exact steps to take in the tool; those are covered in innumerable blogs and tutorials. I mean: how do you know the right design? How do you make sure your stakeholders will use that model? How can you make sure it will be trusted and understood?

When we deploy new models without taking these steps, there can be significant consequences:

If we repeat this process over and over, trust between data and business teams begins to deteriorate as each side gets progressively more exhausted from this feedback frenzy, and that trust can be very challenging to build back up.

This underscores the importance of thinking carefully about how we design models, not just on our own in dbt, but collectively with all of our stakeholders, to make sure the model is accurate and effective, and that we don’t waste our time building each model 4–5 times before it’s useful.

This article is the result of research and experiments into how best to design and implement a dbt model. It won’t have any commands to execute in dbt, but it will talk through how to think about your model and how to structure your workflow to make sure you’re not wasting your time.

Lucky for me, I’m not the first to think about this problem. Many other fields have faced similar challenges and have created their own frameworks and processes that I can leverage when thinking about how to approach data modeling. For example:

Agile principles discourage software engineers from a waterfall development approach, which is antithetical to an environment of rapidly changing requirements [1].
Instead, Agile embraces rapid iteration and acknowledges the competitive advantage of being able to respond to changing requirements quickly.

Design principles similarly acknowledge the need to be deliberate about how you work with multiple stakeholders on a design project [2]. The framework prioritizes people and encourages feedback at each stage of development so the best solution can be found as quickly as possible.

Even the data modeling godfather Ralph Kimball nods to the importance of getting quality input from stakeholders early in a modeling process in his 4-step process to data modeling [3]. Step 1 is to learn as much about the business process as you can before you even think about building a model.

However, the most influential source I found when thinking about this problem was the Systems Engineering Heuristics, a set of truisms about working on a complex problem with many stakeholders [4]:

These sources helped shape the following process for designing data models from scratch. I wanted to build a process that was true to those principles, that was repeatable, and that would actually make sure my models were built well the first time.

Here’s what I came up with:

We’ll walk through each step in more detail below.

The following examples will show screenshots from count.co, a data canvas, where I am Head of Product. It’s important to note, however, that this process is tool-agnostic. You can follow along with the example in the screenshots.

Objective: Understand the business process you are modeling.
Players: You, business stakeholders
Activities:

Objective: Map out possible approaches to building your model.
Players: You
Activities:

Objective: Map out how you’re going to get to the agreed-upon final table. Include code and explanation.
Players: You, members of the data team, stakeholders
Activities:

Objective: Deploy the model in dbt.
Players: You
Activities:

Objective: Let stakeholders know the table is now available and how to interact with it.
Players: You, business stakeholders
Activities:

Try this process the next time you start building a dbt model from scratch. It will be a big change for both you and your stakeholders, but it has proven to significantly decrease the time taken to deploy new models and improve the overall uptake of those models.

The simple act of bringing more people into your data modeling process and demonstrating transparency helps to promote trust and deliver valuable data models quickly.

And if you do give this a spin, please drop me a comment and let me know how it went and any ideas for improvements! These things have to be constantly iterated after all…

[1] Agile Manifesto. (2001). Principles behind the Agile Manifesto. Retrieved July 1, 2023, from https://agilemanifesto.org/principles.html

[2] Design Council. (2004). Framework for Innovation. Retrieved July 1, 2023, from https://www.designcouncil.org.uk/our-resources/framework-for-innovation/

[3] Holistics. Kimball’s Dimensional Data Modeling. Retrieved July 1, 2023, from https://www.holistics.io/books/setup-analytics/kimball-s-dimensional-data-modeling/

[4] Peter Brook. “Systems Engineering Heuristics.” in SEBoK Editorial Board. 2023. The Guide to the Systems Engineering Body of Knowledge (SEBoK), v. 2.8, R.J. Cloutier (Editor in Chief). Hoboken, NJ: The Trustees of the Stevens Institute of Technology. Accessed [DATE]. www.sebokwiki.org.
BKCASE is managed and maintained by the Stevens Institute of Technology Systems Engineering Research Center, the International Council on Systems Engineering, and the Institute of Electrical and Electronics Engineers Systems Council.



This post first appeared on VedVyas Articles.
