Set up DBT models & orchestrate DBT runs

1. Set up new Mage project
Read the setup guide to initialize a new project and start the Mage tool locally.
For the rest of this tutorial, we’ll use the project name demo_project.
2. Set up DBT project
Install DBT
Skip this step if you’re using Mage in Docker. The Mage Docker image already has dbt installed.
- Open Mage and go to the terminal page: http://localhost:6789/terminal
- Add dbt-postgres to your project’s dependencies file (requirements.txt) by typing the following into the terminal in your browser:

    echo dbt-postgres > demo_project/requirements.txt

- Install your project’s dependencies using pip by typing the following (an optional check that the install worked is sketched after this list):

    pip install -r demo_project/requirements.txt
For more information on installing DBT, read their documentation.
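If you want to confirm from Python that the adapter was installed into the environment Mage uses, here is a minimal check; it only assumes the dbt-postgres package added to requirements.txt above, and uses the standard-library importlib.metadata module:

    # Optional sanity check: print the installed dbt-postgres version.
    from importlib.metadata import version

    print(version('dbt-postgres'))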
Create DBT project
- Open Mage and go to the terminal page: http://localhost:6789/terminal
- Initialize your DBT project using the init command (for this tutorial, we’ll use the DBT project name demo):

    cd demo_project/dbt
    dbt init demo -s
    touch demo/profiles.yml
For more information on creating a DBT project, read their documentation.
3. Create standard pipeline
- Go to the Mage dashboard, click the + New pipeline button, and select the option labeled Standard (batch).
- Click the Pipeline settings icon in the left pane, change the pipeline’s name to dbt demo pipeline, then click the Save pipeline settings button.
4. Create DBT profile for database connections
- On the left side of the page in the file browser, expand the folder demo_project/dbt/demo/.
- Click the file named profiles.yml.
- Paste the following credentials in that file (an optional connection check is sketched after this list):

    demo:
      target: dev
      outputs:
        dev:
          dbname: mage/demo
          host: db.bit.io
          password: v2_43Yy3_5LtqyXUE3ew6PrqH2y8zQyH
          port: 5432
          schema: dbt_demo
          type: postgres
          user: demo

- Save the profiles.yml file by pressing Command (⌘) + S.
- Close the file by pressing the X button on the right side of the file named dbt/demo/profiles.yml.
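Before building any models against this profile, you can optionally sanity-check the credentials from a Python scratchpad or shell in the same environment. This is a minimal sketch, not part of the pipeline; it assumes psycopg2 is importable, which it normally is once dbt-postgres is installed, and it reuses the values from profiles.yml above:

    # Optional: verify the profiles.yml credentials by opening a connection
    # and running a trivial query. Not part of the tutorial pipeline.
    import psycopg2

    conn = psycopg2.connect(
        dbname='mage/demo',
        host='db.bit.io',
        password='v2_43Yy3_5LtqyXUE3ew6PrqH2y8zQyH',
        port=5432,
        user='demo',
    )
    with conn, conn.cursor() as cursor:
        cursor.execute('SELECT 1')
        print(cursor.fetchone())  # (1,) means the connection and credentials work
    conn.close()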
5. Add data loader block to pipeline
- Click the + Data loader button, select Python, then click API.
- In the Data loader block name popup dialog, change the block’s name to load data, then click the Save and add block button.
- Paste the following code in that block (an optional block test is sketched after the code):
    import io

    import pandas as pd
    import requests
    from pandas import DataFrame


    @data_loader
    def load_data_from_api(**kwargs) -> DataFrame:
        # Download the demo CSV and load it into a DataFrame.
        url = 'https://raw.githubusercontent.com/mage-ai/datasets/master/restaurant_user_transactions.csv'
        response = requests.get(url)
        df = pd.read_csv(io.StringIO(response.text), sep=',')
        # Replace spaces in column names with underscores so they are valid SQL identifiers.
        df.columns = ['_'.join(col.split(' ')) for col in df.columns]
        return df
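If you want the pipeline to fail early when the API returns something unexpected, you can also append a test function to the same block. The import guard below follows the pattern Mage’s generated block templates use; the specific assertions are illustrative and not part of the original tutorial:

    # Optional: a block-level test appended to the load data block (a sketch).
    if 'test' not in globals():
        from mage_ai.data_preparation.decorators import test


    @test
    def test_output(df, *args) -> None:
        # Fail the block if the API returned nothing usable.
        assert df is not None, 'The output is undefined'
        assert len(df.columns) > 0, 'Expected at least one column'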
6. Add DBT model block to pipeline
- Under the data loader block you just added, click the DBT model button, then click the option Single model.
- In the file browser that pops up, click the file named my_second_dbt_model.sql under the folder demo/models/example/.
  - This will add 2 DBT blocks to your pipeline: 1 for the DBT model named my_first_dbt_model and the 2nd for the DBT model named my_second_dbt_model.
  - The model named my_first_dbt_model was added to the pipeline because my_second_dbt_model references it.
6a. Edit DBT model my_first_dbt_model
- In the DBT block named my_first_dbt_model, next to the label Target at the top, choose dev in the dropdown list. You can also check Manually enter target and enter dev in the input field.
- Paste the following SQL into the DBT model named my_first_dbt_model:

    WITH source_data AS (
        SELECT 1 AS id
        UNION ALL
        SELECT 2 AS id
    )

    SELECT *
    FROM source_data

- Run the DBT model block by pressing the play button on the top right of the block or by pressing Command + Enter.
- You should see a preview of the query execution logs. To see the query results, click the Expand table link at the bottom right corner.
- After previewing the results, in the top right corner of the block, click on the triple dots to reveal a dropdown menu.
- Under the dropdown menu, click the option Run model. This will execute the dbt run command and create the table in your data source.
6b. Edit DBT model my_second_dbt_model
- In the DBT block named my_second_dbt_model, next to the label Target at the top, choose dev in the dropdown list. You can also check Manually enter target and enter dev in the input field.
- Paste the following SQL into the DBT model named my_second_dbt_model:

    SELECT
        a.*
        , b.*
    FROM {{ ref('my_first_dbt_model') }} AS a
    LEFT JOIN {{ source('mage_demo', 'dbt_demo_pipeline_load_data') }} AS b
    ON 1 = 1
    WHERE a.id = 1

  When a DBT model depends on an upstream block that isn’t a DBT model, a source for that block is automatically added to the demo_project/dbt/demo/models/example/mage_sources.yml file. Read more about DBT sources in their documentation.
- Run the DBT model block by pressing the play button on the top right of the block or by pressing Command + Enter.
- You should see a preview of the query execution logs. To see the query results, click the Expand table link at the bottom right corner.
7. Add test for DBT model
- On the right side of the screen, click the tab labeled Terminal.
- Create a new DBT test file by running the following command:

    touch demo_project/dbt/demo/tests/test_my_second_dbt_model.sql

- On the left side of the page in the file browser, expand the folder demo_project/dbt/demo/tests/ and click the file named test_my_second_dbt_model.sql. If you don’t see it, refresh the page.
- Paste the following SQL in the file:

    SELECT id
    FROM {{ ref('my_second_dbt_model') }}
    GROUP BY id
    HAVING (id = 0)

  A singular test like this passes when the query returns no rows and fails if it returns any rows.
- Read more about DBT tests in their documentation. If you want to run this test on its own before triggering the pipeline, see the sketch after this list.
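One way to run just the new test in isolation is to invoke the dbt CLI from Python (for example, in a scratchpad block). This is a sketch under a few assumptions: the dbt CLI is on the PATH (it is once dbt-postgres is installed), and the working directory of the Python process is the one that contains demo_project; adjust cwd if yours differs. The --profiles-dir flag points dbt at the profiles.yml created earlier inside the project folder:

    # Optional: run only the new singular test before executing the whole pipeline.
    import subprocess

    subprocess.run(
        [
            'dbt', 'test',
            '--select', 'test_my_second_dbt_model',
            '--profiles-dir', '.',  # profiles.yml lives inside the dbt project in this tutorial
        ],
        cwd='demo_project/dbt/demo',  # assumption: run from the directory containing demo_project
        check=True,
    )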
8. Execute pipeline end-to-end
- Click the name of the pipeline in the header breadcrumbs to go back to the detail page.
- Create a new trigger with the type Schedule and the frequency once. For more details, follow these steps.
- After your trigger is created, click the Start trigger button at the top of the page.
- The pipeline will eventually fail because a DBT test failed: the join in my_second_dbt_model repeats the same id on every row, so the unique test on that column cannot pass. This means everything is working as expected.
- Open the file demo_project/dbt/demo/models/example/schema.yml and remove the tests named unique under both models. Your file should look like this:

    version: 2

    models:
      - name: my_first_dbt_model
        description: "A starter dbt model"
        columns:
          - name: id
            description: "The primary key for this table"
            tests:
              - not_null

      - name: my_second_dbt_model
        description: "A starter dbt model"
        columns:
          - name: id
            description: "The primary key for this table"
            tests:
              - not_null

- Click on the Failed button next to the pipeline run and click Retry run. It should complete successfully after a few minutes.
Congratulations! You’ve created a data pipeline that orchestrates your DBT models.
Support
If you get stuck, run into problems, or just want someone to walk you through these steps, please join our Slack.