Run dbt-spark against a pySpark session.
The following procedure demonstrates how to run dbt-spark with a PySpark session.
- Build a Mage Docker image with Spark, following the instructions in Build Mage docker image with Spark environment.
- Run the following command in your terminal to start Mage using Docker:
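A sketch of the command, assuming the image built in the previous step was tagged `mage_spark` and the project is named `demo_project`:

```bash
# Start Mage on port 6789, mounting the current directory as the project source
docker run -it --name mage_spark \
  -e SPARK_MASTER_HOST='local' \
  -p 6789:6789 \
  -v $(pwd):/home/src \
  mage_spark \
  /app/run_app.sh mage start demo_project
```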
- Create a new pipeline named `dbt_spark`, and add a Scratchpad to test the connection with PySpark using the following code:
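A minimal sketch for the Scratchpad, creating a local Spark session and running a trivial query:

```python
from pyspark.sql import SparkSession

# Create (or reuse) a local Spark session inside the Mage container
spark = SparkSession.builder.master('local').getOrCreate()

# Run a trivial query to confirm the session works
spark.sql('select 1').show()
```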
When run, it should return results similar to the following:
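For the `select 1` query in the sketch above, the output of `show()` looks like:

```
+---+
|  1|
+---+
|  1|
+---+
```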
- Click the Terminal icon on the right side of the Mage UI, and create a dbt project `spark_demo` with the following commands:
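A sketch of those commands, assuming dbt projects live under `demo_project/dbt` (matching the folder path used in the next step):

```bash
cd demo_project/dbt
dbt init spark_demo             # scaffold a new dbt project named spark_demo
touch spark_demo/profiles.yml   # create the profiles file edited in the next step
```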
- On the left side of the page in the file browser, expand the folder `demo_project/dbt/spark_demo/`. Click the file named `profiles.yml`, and add the following settings to this file:
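A sketch of the profile, using dbt-spark's `session` connection method so dbt runs against the PySpark session in the same process; the `schema` value `default` is an assumption:

```yaml
spark_demo:
  target: dev
  outputs:
    dev:
      type: spark
      method: session
      schema: default
      host: NA  # unused by the session method, but required by dbt-core
```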
- Save the `profiles.yml` file by pressing Command (⌘) + S, then close the file by clicking the X button on the right side of the file name `dbt/spark_demo/profiles.yml`.
- Click the button dbt model, and choose the option New model. Enter `model_1` as the Model name, and `spark_demo/models/example` as the folder location.
- In the dbt block named `model_1`, next to the label Target at the top, choose `dev` in the dropdown list. You can also check Manually enter target, and enter `dev` in the input field.
- Paste the following SQL into the dbt block `model_1`:
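A minimal placeholder model for this sketch; the column name `id` is an assumption:

```sql
-- A trivial model to verify dbt can execute against the Spark session
select 1 as id
```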
Click the Compile & preview button to execute this new model, which should generate results similar to the following:
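With the placeholder model above, the preview shows a single row:

```
id
1
```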