Add credentials
- Create a new pipeline or open an existing pipeline.
- Expand the left side of your screen to view the file browser.
- Scroll down and click on a file named `io_config.yaml`.
- Enter the following keys and values under the key named `default` (you can have multiple profiles; add them under whichever is relevant to you); see the sketch after this list.
- Set `SPARK_HOST` to the Spark master URL (for example, `spark://host:7077`, `local[*]`, or another valid `SparkSession.builder.master(...)` value). You can also set `SPARK_METHOD` (e.g. `session`) and `SPARK_SCHEMA` to control how the session is created and which default database/schema is used.
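A minimal sketch of what the `default` profile might look like, using only the keys named above; the host value is illustrative, not a requirement:

```yaml
default:
  SPARK_METHOD: session      # how the session is created
  SPARK_HOST: 'local[*]'     # or e.g. spark://host:7077
  SPARK_SCHEMA: default      # default database/schema for queries and exports
```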
Using Python block
- Create a new pipeline or open an existing pipeline.
- Add a data loader, transformer, or data exporter block (the code snippet below is for a data loader).
- Select `Generic (no template)`.
- Enter the code snippet shown after this list (note: change the `config_profile` from `default` if you have a different profile).
- Run the block.
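A sketch of the data loader block described above, assuming a `Spark` io class at `mage_ai.io.spark` that follows the same `with_config(...).load(...)` pattern as Mage's other io classes (e.g. Postgres); the import path and the query are assumptions, so adjust them to your Mage version:

```python
from os import path

from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.spark import Spark  # assumption: Spark io class; adjust to your Mage version
from mage_ai.settings.repo import get_repo_path

if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_data_from_spark(*args, **kwargs):
    """Run a query against the Spark session configured in io_config.yaml."""
    query = 'SELECT 1'  # placeholder query for illustration
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'  # change this if you use a different profile

    return Spark.with_config(
        ConfigFileLoader(config_path, config_profile),
    ).load(query)
```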
Export a dataframe to Spark
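The steps mirror the data loader above: add a data exporter block, select `Generic (no template)`, and use the same config-driven pattern. A minimal sketch under the same assumptions about the `Spark` io class; the table name is hypothetical and the `export` signature follows the pattern of Mage's other io classes, so it may differ in your version:

```python
from os import path

from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.spark import Spark  # assumption: same Spark io class as in the loader above
from mage_ai.settings.repo import get_repo_path

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data_to_spark(df, *args, **kwargs):
    """Write the upstream dataframe to a Spark table."""
    table_name = 'my_table'  # hypothetical table name for illustration
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    Spark.with_config(
        ConfigFileLoader(config_path, config_profile),
    ).export(df, table_name)
```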
Notes
- Spark runs in-process; ensure PySpark and any required cluster dependencies are installed in your Mage environment.
- For local development, `SPARK_HOST: local` typically creates a session with `SparkSession.builder.master('local').getOrCreate()`.
- Use `SPARK_SCHEMA` to set the default database/schema for queries and exports.
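For reference, a sketch of the equivalent PySpark calls; the `setCurrentDatabase` line is one plausible way the `SPARK_SCHEMA` setting could take effect, an assumption rather than Mage's confirmed internals:

```python
from pyspark.sql import SparkSession

# What SPARK_HOST: local amounts to, per the note above.
spark = SparkSession.builder.master('local').getOrCreate()

# Assumption: applying SPARK_SCHEMA as the default database could look like this.
spark.catalog.setCurrentDatabase('default')
```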