Compute resources

Follow the instructions in this doc to deploy the Mage tool to a production environment. When running Mage in production, you can customize the compute resources in the following ways:

1. Customize the compute resource of the Mage web service

The Mage web service is responsible for running the Mage web backend, the scheduler service, and local block executions. You can customize the CPU and memory of the Mage web service by updating the Terraform variables and then running terraform apply.
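As a sketch, assuming the Terraform template you deployed with exposes CPU and memory variables (the variable names below are hypothetical; check the template’s variables.tf for the exact names):

# Variable names are hypothetical; look up the exact names in your
# template's variables.tf before applying.
terraform apply \
  -var="ecs_task_cpu=2048" \
  -var="ecs_task_memory=4096"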

2. Customize the compute resource of the Mage executor

Mage provides multiple executors to execute blocks.

Local Python executor

Local Python executors run within the same container as the Mage web service, so you can customize their compute resources in the same way described in the Customize the compute resource of the Mage web service section above.

Kubernetes executor

If your Mage app is running in a Kubernetes cluster, you can execute blocks in separate Kubernetes pods with the Kubernetes executor.

To configure a pipeline block to use the Kubernetes executor, update the executor_type of the block to k8s in the pipeline’s metadata.yaml:

blocks:
- uuid: example_data_loader
  type: data_loader
  upstream_blocks: []
  downstream_blocks: []
  executor_type: k8s
  ...

By default, Mage launches executor pods in the default Kubernetes namespace. You can customize the namespace by setting the KUBE_NAMESPACE environment variable.
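For example, if you deploy Mage with a Kubernetes manifest, one way to set the variable is on the web service container spec (the namespace name below is just an illustration):

# In the Mage web service container spec; "mage-pipelines" is an example name.
env:
  - name: KUBE_NAMESPACE
    value: mage-pipelines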

Customizing compute resources for the Kubernetes executor is coming soon.

AWS ECS executor

You can choose to launch separate AWS ECS tasks to execute blocks by setting the block’s executor_type to ecs in the pipeline’s metadata.yaml file, as shown below.
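Following the same pattern as the Kubernetes example above, in the pipeline’s metadata.yaml:

blocks:
- uuid: example_data_loader
  type: data_loader
  upstream_blocks: []
  downstream_blocks: []
  executor_type: ecs
  ...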

To customize the compute resources of the ECS executor, update the cpu and memory values in the ecs_config section of the project’s metadata.yaml file.

Example config:

ecs_config:
  cpu: 1024
  memory: 2048
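In ECS, cpu is measured in CPU units (1024 units = 1 vCPU) and memory in MiB, so the example above allocates 1 vCPU and 2 GiB of memory to each executor task.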

GCP Cloud Run executor

If your Mage app is deployed on GCP, you can choose to launch separate GCP Cloud Run jobs to execute blocks.

To configure a pipeline to use the GCP Cloud Run executor:

  1. Update the project’s metadata.yaml:

gcp_cloud_run_config:
  path_to_credentials_json_file: "/path/to/credentials_json_file"

  2. Update the executor_type of the block to gcp_cloud_run in the pipeline’s metadata.yaml:
blocks:
- uuid: example_data_loader
  type: data_loader
  upstream_blocks: []
  downstream_blocks: []
  executor_type: gcp_cloud_run
  ...
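Note that the file referenced by path_to_credentials_json_file is a GCP service account key, and the service account it belongs to needs IAM permissions to create and run Cloud Run jobs.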

Customizing compute resources for the GCP Cloud Run executor is coming soon.

PySpark executor

If the pipeline type is “pyspark”, Mage uses PySpark executors for pipeline and block executions. You can customize the compute resources of the PySpark executor by updating the instance types in the emr_config section of the project’s metadata.yaml file.

Example config:

emr_config:
  ec2_key_name: "xxxxx"
  master_instance_type: "r5.2xlarge"
  slave_instance_type: "r5.2xlarge"
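The master_instance_type and slave_instance_type map to the master and core nodes of the EMR cluster. Since Spark executors run on the core nodes, choosing a larger slave_instance_type gives each executor more CPU and memory to work with.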