Data exporter
After completing data transformations, use data exporter blocks to load the processed data, or to store a trained machine learning model, in an external data storage system.
Mage natively supports integration with a variety of data storage systems. However, these integrations often require specific configurations in both the exporter block and the io_config.yml file to ensure seamless operation.
The io_config.yml file typically includes connection details such as host, port, database name, username, and password.
Meanwhile, the exporter block needs to be configured with the appropriate export parameters, such as target table names,
schema details, and conflict resolution strategies.
Technical Details
- Data Exporter Blocks:
  - These blocks are designed to facilitate the movement of transformed data or trained models to external systems.
  - Configuration parameters might include destination paths, file formats, table names, schemas, and update strategies.
  - Most data exporters include a config_profile parameter set to 'default' by default. This parameter can be customized to use different configuration profiles if you have multiple profiles or have renamed them.
- Supported Data Storage Systems:
  - Mage supports a wide range of storage systems including PostgreSQL, MySQL, AWS S3, Google Cloud Storage, Azure Blob Storage, and many more.
  - Each system may have unique requirements and configurations to ensure compatibility and optimal performance.
- Configuration in io_config.yml:
  - This file serves as the central configuration hub for defining connection parameters.
  - Typical parameters include:
    - host: The server address of the storage system.
    - port: The port number for the connection.
    - database: The name of the target database or data storage container.
    - username and password: Authentication credentials.
    - Additional parameters as required by specific storage systems (e.g., SSL settings, API tokens).
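The relationship between configuration profiles and connection parameters can be sketched in plain Python. This is only an illustration under stated assumptions: the dictionary stands in for the parsed contents of io_config.yml, the `dev` profile, the sample values, and the helper names `select_profile` and `missing_parameters` are hypothetical, and real key names depend on the storage system.

```python
# Stand-in for the parsed contents of io_config.yml (hypothetical values).
# Real key names are system-specific; these mirror the generic list above.
IO_CONFIG = {
    'default': {
        'host': 'db.example.com',
        'port': 5432,
        'database': 'analytics',
        'username': 'mage',
        'password': 'change-me',  # In a real project, store this as a Secret
    },
    'dev': {
        'host': 'localhost',
        'port': 5432,
        'database': 'analytics_dev',
        'username': 'mage',
        # 'password' intentionally left out to show the check below
    },
}

REQUIRED_PARAMETERS = ('host', 'port', 'database', 'username', 'password')


def select_profile(config: dict, config_profile: str = 'default') -> dict:
    """Return the connection settings for one named profile."""
    if config_profile not in config:
        raise KeyError(f'Unknown profile: {config_profile!r}')
    return config[config_profile]


def missing_parameters(settings: dict) -> list:
    """List the required parameters that a profile does not define."""
    return [name for name in REQUIRED_PARAMETERS if name not in settings]


if __name__ == '__main__':
    print(missing_parameters(select_profile(IO_CONFIG)))         # []
    print(missing_parameters(select_profile(IO_CONFIG, 'dev')))  # ['password']
```

A check like this makes it easy to spot an incomplete profile before an exporter block tries to connect with it.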
Examples
Data Warehouse
Configure the io_config.yml file to connect your Mage pipeline to a Snowflake data warehouse. Some fields are optional, but depending on how your Snowflake data warehouse is configured, you may need to enter all of the connection information in the .yml file.
It’s recommended to store sensitive information as Secrets. See the general Secrets documentation for more information.
Enter the following information in the Data Exporter block:
- table_name: the name of the destination table
- database: the name of the destination database
- schema: the name of the destination schema
All other information is handled in the io_config.yml file.
Example Code:
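A minimal sketch of a Snowflake exporter block, assuming the `mage_ai` package and its `Snowflake` and `ConfigFileLoader` classes are available in your project. The destination names are placeholders, the `if_exists='replace'` policy is an assumed choice, and the config file name is an assumption (Mage block templates commonly use io_config.yaml; use whatever name your project has). The fallback decorator exists only so the sketch can be read and imported outside a Mage environment.

```python
from os import path

try:
    # Inside a Mage project the decorator is provided by Mage.
    from mage_ai.data_preparation.decorators import data_exporter
except ImportError:
    # Fallback so the sketch can be imported outside a Mage environment.
    def data_exporter(function):
        return function

# Placeholder destination names; replace with your own.
TABLE_NAME = 'your_table_name'
DATABASE = 'your_database_name'
SCHEMA = 'your_schema_name'
CONFIG_FILE = 'io_config.yaml'  # Assumption: adjust to your project's file name
CONFIG_PROFILE = 'default'


@data_exporter
def export_data_to_snowflake(df, **kwargs) -> None:
    # Imported lazily so the module loads even without mage_ai installed.
    from mage_ai.io.config import ConfigFileLoader
    from mage_ai.io.snowflake import Snowflake
    from mage_ai.settings.repo import get_repo_path

    config_path = path.join(get_repo_path(), CONFIG_FILE)

    # Connection details come from the selected profile in the config file.
    with Snowflake.with_config(ConfigFileLoader(config_path, CONFIG_PROFILE)) as loader:
        loader.export(
            df,
            TABLE_NAME,
            DATABASE,
            SCHEMA,
            if_exists='replace',  # Assumed policy when the table already exists
        )
```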
Data Lake
Configure the io_config.yml file to connect your Mage pipeline to Azure Blob Storage. Configure the required Secrets and enter them into the io_config.yml file.
If you need more information on entering secrets, see this documentation.
Enter the following information in the Data Exporter block:
- container_name: the name of the destination container
- blob_path: the destination blob path
All other information is handled in the io_config.yml file.
Example Code:
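A minimal sketch of an Azure Blob Storage exporter block, assuming the `mage_ai` package exposes `AzureBlobStorage` in `mage_ai.io.azure_blob_storage` as its block templates do. The container name and blob path are placeholders, and the config file name is an assumption to adjust for your project. The fallback decorator only lets the sketch load outside a Mage environment.

```python
from os import path

try:
    # Inside a Mage project the decorator is provided by Mage.
    from mage_ai.data_preparation.decorators import data_exporter
except ImportError:
    # Fallback so the sketch can be imported outside a Mage environment.
    def data_exporter(function):
        return function

# Placeholder destination; replace with your own container and blob path.
CONTAINER_NAME = 'your_container_name'
BLOB_PATH = 'your_blob_path'
CONFIG_FILE = 'io_config.yaml'  # Assumption: adjust to your project's file name
CONFIG_PROFILE = 'default'


@data_exporter
def export_data_to_azure_blob_storage(df, **kwargs) -> None:
    # Imported lazily so the module loads even without mage_ai installed.
    from mage_ai.io.azure_blob_storage import AzureBlobStorage
    from mage_ai.io.config import ConfigFileLoader
    from mage_ai.settings.repo import get_repo_path

    config_path = path.join(get_repo_path(), CONFIG_FILE)

    # Credentials are read from the selected profile in the config file.
    AzureBlobStorage.with_config(
        ConfigFileLoader(config_path, CONFIG_PROFILE)
    ).export(
        df,
        CONTAINER_NAME,
        BLOB_PATH,
    )
```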
Database
Configure the io_config.yml file to connect your Mage pipeline to a PostgreSQL database. Configure the required Secrets and enter them into the io_config.yml file.
If you need more information on entering secrets, see this documentation.
If exporting from Docker to an external machine, use host.docker.internal as the value of POSTGRES_HOST.
Enter the following information in the Data Exporter block:
- schema_name: the name of the destination schema
- table_name: the name of the destination table
All other information is handled in the io_config.yml file.
Example Code:
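A minimal sketch of a PostgreSQL exporter block, assuming the `mage_ai` package and its `Postgres` and `ConfigFileLoader` classes are available. The schema and table names are placeholders, the `index=False` and `if_exists='replace'` settings are assumed choices, and the config file name is an assumption to adjust for your project. The fallback decorator only lets the sketch load outside a Mage environment.

```python
from os import path

try:
    # Inside a Mage project the decorator is provided by Mage.
    from mage_ai.data_preparation.decorators import data_exporter
except ImportError:
    # Fallback so the sketch can be imported outside a Mage environment.
    def data_exporter(function):
        return function

# Placeholder destination names; replace with your own.
SCHEMA_NAME = 'your_schema_name'
TABLE_NAME = 'your_table_name'
CONFIG_FILE = 'io_config.yaml'  # Assumption: adjust to your project's file name
CONFIG_PROFILE = 'default'


@data_exporter
def export_data_to_postgres(df, **kwargs) -> None:
    # Imported lazily so the module loads even without mage_ai installed.
    from mage_ai.io.config import ConfigFileLoader
    from mage_ai.io.postgres import Postgres
    from mage_ai.settings.repo import get_repo_path

    config_path = path.join(get_repo_path(), CONFIG_FILE)

    # Host, port, and credentials come from the selected config profile.
    with Postgres.with_config(ConfigFileLoader(config_path, CONFIG_PROFILE)) as loader:
        loader.export(
            df,
            SCHEMA_NAME,
            TABLE_NAME,
            index=False,          # Do not write the DataFrame index as a column
            if_exists='replace',  # Assumed policy when the table already exists
        )
```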
Delta Lake
Unlike other data exporters, Delta Lake exporters are not currently configured through the io_config.yml file.
They contain the necessary configuration within the exporter block itself. Let’s break that down.
Storage Options
- 'AWS_ACCESS_KEY_ID': Your AWS access key ID.
- 'AWS_SECRET_ACCESS_KEY': Your AWS secret access key.
- 'AWS_REGION': The AWS region where your S3 bucket is located.
- 'AWS_S3_ALLOW_UNSAFE_RENAME': This option allows unsafe rename operations on S3, which might be necessary for some workflows.
Remember that Secrets can be stored in Mage’s internal Secrets Manager or in .yaml files, or synced directly with cloud secret managers.
Additional Configurations
- uri: The S3 URI where the Delta table is stored.
Example Code:
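A minimal sketch of a Delta Lake exporter block that uses the storage options and uri described above, assuming the `deltalake` package and its `write_deltalake` function are installed. Reading credentials from environment variables, the helper name `build_storage_options`, the placeholder URI, and the `mode='append'` setting are all assumptions made for illustration; in a Mage project you would typically pull these values from Secrets instead.

```python
import os

try:
    # Inside a Mage project the decorator is provided by Mage.
    from mage_ai.data_preparation.decorators import data_exporter
except ImportError:
    # Fallback so the sketch can be imported outside a Mage environment.
    def data_exporter(function):
        return function

# Placeholder S3 location of the Delta table; replace with your own.
URI = 's3://your-bucket/your-key'


def build_storage_options(env=None) -> dict:
    """Collect the S3 storage options from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    return {
        'AWS_ACCESS_KEY_ID': env.get('AWS_ACCESS_KEY_ID', ''),
        'AWS_SECRET_ACCESS_KEY': env.get('AWS_SECRET_ACCESS_KEY', ''),
        'AWS_REGION': env.get('AWS_REGION', ''),
        'AWS_S3_ALLOW_UNSAFE_RENAME': 'true',
    }


@data_exporter
def export_data_to_delta_lake(df, **kwargs) -> None:
    # Imported lazily so the module loads even without deltalake installed.
    from deltalake import write_deltalake

    write_deltalake(
        URI,
        data=df,
        mode='append',  # Assumed mode: add rows to the existing table
        storage_options=build_storage_options(),
    )
```

Keeping the credentials out of the block body, as the helper does here, avoids committing them to the pipeline code.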
By correctly configuring these components, you can streamline loading data into your chosen storage system, whether it is a relational database, a data lake, or a machine learning model repository.