Compute resources
Follow the instructions in this doc to deploy the Mage tool to a production environment. When running the Mage tool in production, you can customize the compute resources in the following ways:
1. Customize the compute resource of the Mage web service
The Mage web service is responsible for running the Mage web backend, the scheduler service,
and local block executions. You can customize the CPU and memory of the Mage web
service by updating the Terraform variables and then running `terraform apply`:
- AWS: Update the `ecs_task_cpu` and `ecs_task_memory` variables in the `mage-ai-terraform-templates/aws/variables.tf` file.
- GCP: Update the `container_cpu` and `container_memory` variables in the `mage-ai-terraform-templates/gcp/variables.tf` file.
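For example, on AWS the two variables might look like the following after an update. This is an illustrative sketch only; the variable names come from the Terraform templates above, but the values here are examples, and CPU/memory must be one of the combinations your container platform supports (e.g. valid Fargate pairings):

```hcl
# mage-ai-terraform-templates/aws/variables.tf (fragment, illustrative values)
variable "ecs_task_cpu" {
  default = 1024 # 1 vCPU
}

variable "ecs_task_memory" {
  default = 2048 # 2 GB
}
```

After changing the values, run `terraform apply` to roll out the new task size.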
2. Set executor type and customize the compute resource of the Mage executor
Mage provides multiple executors to execute blocks. Here are the available executor types:
- Block executor
  - `local_python`
  - `ecs`
  - `gcp_cloud_run`
  - `azure_container_instance`
  - `k8s`
- Pipeline executor
  - `local_python`
  - `ecs`
  - `k8s`
Mage uses the `local_python` executor type by default. If you want to specify another executor type as the default executor type for blocks, set the environment variable `DEFAULT_EXECUTOR_TYPE` to one of the executor types mentioned above. If you want to use the `local_python` executor when `DEFAULT_EXECUTOR_TYPE` is set to another executor type, set the block's `executor_type` to `local_python_force`.
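The resolution rules above can be sketched in a few lines of Python. This is an illustrative model of the documented behavior, not Mage's actual implementation; the function name `resolve_executor_type` is made up for this example:

```python
import os

def resolve_executor_type(block_executor_type=None):
    """Model of the documented rules: a block with no executor_type falls
    back to DEFAULT_EXECUTOR_TYPE (or local_python if unset), and
    local_python_force always resolves to the local Python executor."""
    if block_executor_type == 'local_python_force':
        return 'local_python'
    default = os.environ.get('DEFAULT_EXECUTOR_TYPE', 'local_python')
    return block_executor_type or default

# Start from a clean environment so the example is deterministic.
os.environ.pop('DEFAULT_EXECUTOR_TYPE', None)
print(resolve_executor_type())  # local_python

# With DEFAULT_EXECUTOR_TYPE=k8s, unspecified blocks use k8s...
os.environ['DEFAULT_EXECUTOR_TYPE'] = 'k8s'
print(resolve_executor_type())  # k8s

# ...but local_python_force still pins a block to local execution.
print(resolve_executor_type('local_python_force'))  # local_python
```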
Local python executor
Local Python executors run within the same container as the Mage scheduler service. You can customize the compute resource in the same way described in the Customize the compute resource of the Mage web service section.
Kubernetes executor
If your Mage app is running in a Kubernetes cluster, you can execute blocks in separate Kubernetes pods with the Kubernetes executor.
To configure a pipeline block to use the Kubernetes executor, update the `executor_type` of the block to `k8s` in the pipeline's metadata.yaml:
```yaml
blocks:
- uuid: example_data_loader
  type: data_loader
  upstream_blocks: []
  downstream_blocks: []
  executor_type: k8s
  ...
```
By default, Mage uses `default` as the Kubernetes namespace. You can customize the namespace by setting the `KUBE_NAMESPACE` environment variable.
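If the Mage app itself runs as a Kubernetes Deployment, one natural place to set this variable is in its container spec. This is a sketch under that assumption; the container name and namespace value are placeholders:

```yaml
# Fragment of the Mage app's Deployment spec (illustrative)
containers:
- name: mage-server          # placeholder container name
  image: mageai/mageai:latest
  env:
  - name: KUBE_NAMESPACE
    value: mage-jobs         # executor pods will be created in this namespace
```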
There are two ways to customize the Kubernetes executor config:
- Add the `executor_config` at block level in the pipeline's metadata.yaml file. Example config:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: k8s
    executor_config:
      namespace: default
      resource_limits:
        cpu: 1000m
        memory: 2048Mi
      resource_requests:
        cpu: 500m
        memory: 1024Mi
  ```
- Add the `k8s_executor_config` to the project's metadata.yaml. This `k8s_executor_config` will apply to all the blocks that use the k8s executor in this project. Example config:
  ```yaml
  k8s_executor_config:
    job_name_prefix: data-prep
    namespace: default
    resource_limits:
      cpu: 1000m
      memory: 2048Mi
    resource_requests:
      cpu: 500m
      memory: 1024Mi
    service_account_name: default
  ```
- The Kubernetes job name is in this format: `mage-{job_name_prefix}-block-{block_run_id}`. The default `job_name_prefix` is `data-prep`. You can customize it in the k8s executor config.
- If you want to use GPU resources in your k8s executor, you can configure the GPU resource in the `k8s_executor_config`. Please make sure the GPU driver is installed and running on your nodes to use the GPUs. Example config:
  ```yaml
  k8s_executor_config:
    resource_limits:
      gpu-vendor.example/example-gpu: 1 # requesting 1 GPU
  ```
- To further customize the container config of the Kubernetes executor, you can specify the `container_config` in the k8s executor config. Here is an example:
  ```yaml
  k8s_executor_config:
    container_config:
      image: mageai/mageai:0.9.7
      env:
      - name: USER_CODE_PATH
        value: /home/src/k8s_project
  ```
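The job name format above can be expressed as a small Python helper. This is an illustration of the documented format, not Mage's internal code; the function name is made up for this example:

```python
def k8s_job_name(block_run_id, job_name_prefix='data-prep'):
    """Build a job name in the documented format
    mage-{job_name_prefix}-block-{block_run_id}; the prefix
    defaults to data-prep, matching the k8s executor config."""
    return f'mage-{job_name_prefix}-block-{block_run_id}'

print(k8s_job_name(123))                           # mage-data-prep-block-123
print(k8s_job_name(123, job_name_prefix='etl'))    # mage-etl-block-123
```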
AWS ECS executor
You can choose to launch separate AWS ECS tasks to execute blocks by setting the block's `executor_type` to `ecs` in the pipeline's metadata.yaml file.
There are two ways to customize the compute resource of the ECS executor:
- Update `cpu` and `memory` in the `ecs_config` in the project's metadata.yaml file. Example config:
  ```yaml
  ecs_config:
    cpu: 1024
    memory: 2048
  ```
- Add the `executor_config` at block level in the pipeline's metadata.yaml file. Example config:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: ecs
    executor_config:
      cpu: 1024
      memory: 2048
  ```
To run the whole pipeline in one ECS executor, set the `executor_type` at pipeline level and set `run_pipeline_in_one_process` to true. `executor_config` can also be set at pipeline level. Here is an example pipeline metadata.yaml:
```yaml
blocks:
- ...
- ...
executor_type: ecs
run_pipeline_in_one_process: true
name: example_pipeline
...
```
The default wait timeout for the ECS task is 10 minutes. To customize the timeout, specify the `wait_timeout` field (in seconds) in `ecs_config`. Here is an example:
```yaml
ecs_config:
  cpu: 1024
  memory: 2048
  wait_timeout: 1200
```
Required IAM permissions for using ECS executor:
```json
[
  "ecs:DescribeTasks",
  "ecs:ListTasks",
  "ecs:RunTask"
]
```
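These actions could be granted with an IAM policy statement along the lines of the following sketch. The `Resource` is left broad here for brevity; in practice you would scope it to your cluster and task definition ARNs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeTasks",
        "ecs:ListTasks",
        "ecs:RunTask"
      ],
      "Resource": "*"
    }
  ]
}
```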
GCP Cloud Run executor
If your Mage app is deployed on GCP, you can choose to launch separate GCP Cloud Run jobs to execute blocks.
How to configure a pipeline to use the GCP Cloud Run executor:
- Update the project's metadata.yaml:
  ```yaml
  gcp_cloud_run_config:
    path_to_credentials_json_file: "/path/to/credentials_json_file"
    project_id: project_id
    timeout_seconds: 600
  ```
- Update the `executor_type` of the block to `gcp_cloud_run` in the pipeline's metadata.yaml:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: gcp_cloud_run
    ...
  ```
Customizing compute resource for GCP Cloud Run executor is coming soon.
Azure Container Instance executor
If your Mage app is deployed on Microsoft Azure with Mage's Terraform scripts, you can choose to launch separate Azure container instances to execute blocks.
How to configure a pipeline to use the Azure Container Instance executor:
- Update the project's metadata.yaml:
  ```yaml
  azure_container_instance_config:
    cpu: 1
    memory: 2
  ```
- Update the `executor_type` of the block to `azure_container_instance` in the pipeline's metadata.yaml and optionally specify the `executor_config`. The block-level `executor_config` will override the global executor config.
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: azure_container_instance
    executor_config:
      cpu: 1
      memory: 2
    ...
  ```
PySpark executor
If the pipeline type is "pyspark", Mage uses PySpark executors for pipeline and
block executions. You can customize the compute resource of the PySpark executor by
updating the instance types in `emr_config` in the project's metadata.yaml file.
Example config:
```yaml
emr_config:
  ec2_key_name: "xxxxx"
  master_instance_type: "r5.2xlarge"
  slave_instance_type: "r5.2xlarge"
```