Compute resources
Follow the instructions in this doc to deploy the Mage tool to a production environment. When running Mage in production, you can customize the compute resources in the following ways:
1. Customize the compute resource of the Mage web service
The Mage web service is responsible for running the Mage web backend, the scheduler service, and local block executions. You can customize the CPU and memory of the Mage web service by updating the Terraform variables and then running `terraform apply`:
- AWS: Update the `ecs_task_cpu` and `ecs_task_memory` variables in the `mage-ai-terraform-templates/aws/variables.tf` file.
- GCP: Update the `container_cpu` and `container_memory` variables in the `mage-ai-terraform-templates/gcp/variables.tf` file.
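For example, here is a minimal sketch for the AWS template, assuming you override the variables in a `terraform.tfvars` file (the values are illustrative) instead of editing `variables.tf` directly:

```hcl
# terraform.tfvars -- picked up automatically by `terraform apply`.
# Illustrative sizing: 2 vCPUs (2048 CPU units) and 4 GiB of memory
# for the Mage web service task.
ecs_task_cpu    = 2048
ecs_task_memory = 4096
```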
2. Set executor type and customize the compute resource of the Mage executor
Mage provides multiple executors to execute blocks. Here are the available executor types:
- Block executors:
  - `local_python`
  - `ecs`
  - `gcp_cloud_run`
  - `azure_container_instance`
  - `k8s`
- Pipeline executors:
  - `local_python`
  - `ecs`
  - `k8s`
Set executor type for all blocks: Mage uses the `local_python` executor type by default. If you want to use another executor type as the default for all blocks, set the environment variable `DEFAULT_EXECUTOR_TYPE` to one of the executor types listed above.
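For example, a minimal sketch of setting this variable in a docker-compose service definition (the service name and image tag are assumptions for illustration):

```yaml
services:
  mage:
    image: mageai/mageai:latest
    environment:
      # Default executor type for every block: local_python, ecs,
      # gcp_cloud_run, azure_container_instance, or k8s.
      DEFAULT_EXECUTOR_TYPE: k8s
```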
Set executor type for all blocks in a pipeline: You can set the executor type for all the blocks in a pipeline by specifying the `executor_type` at the pipeline level.
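For example, a sketch of a pipeline-level setting in the pipeline's metadata.yaml (the pipeline name is illustrative):

```yaml
name: example_pipeline
executor_type: k8s  # applies to every block in this pipeline
blocks:
- ...
```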
Force local python executor: If you want to use the `local_python` executor for a block while `DEFAULT_EXECUTOR_TYPE` is set to another executor type, set the block's `executor_type` to `local_python_force`.
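For example, a sketch of pinning a single block back to the local executor in the pipeline's metadata.yaml (the block name reuses the doc's example):

```yaml
blocks:
- uuid: example_data_loader
  type: data_loader
  # Runs locally even when DEFAULT_EXECUTOR_TYPE is set to another executor type.
  executor_type: local_python_force
```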
Local python executor
Local python executors run within the same container as the Mage scheduler service. You can customize their compute resources in the same way described in the “Customize the compute resource of the Mage web service” section above.
Kubernetes executor
If your Mage app is running in a Kubernetes cluster, you can execute blocks in separate Kubernetes pods with the Kubernetes executor.
To configure a pipeline block to use the Kubernetes executor, update the `executor_type` of the block to `k8s` in the pipeline’s metadata.yaml:
```yaml
blocks:
- uuid: example_data_loader
  type: data_loader
  upstream_blocks: []
  downstream_blocks: []
  executor_type: k8s
  ...
```
By default, Mage uses `default` as the Kubernetes namespace. You can customize the namespace by setting the `KUBE_NAMESPACE` environment variable.
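For example, a sketch of overriding the namespace through an env entry on the Mage container (the namespace value is illustrative):

```yaml
env:
  - name: KUBE_NAMESPACE
    value: mage-executors
```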
There are a few ways to customize the Kubernetes executor config:
- Add the `executor_config` at the block level in the pipeline’s metadata.yaml file. Example config:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: k8s
    executor_config:
      namespace: default
      resource_limits:
        cpu: 1000m
        memory: 2048Mi
      resource_requests:
        cpu: 500m
        memory: 1024Mi
  ```
- Add the `k8s_executor_config` to the project’s metadata.yaml. This `k8s_executor_config` will apply to all the blocks that use the k8s executor in this project. Example config:
  ```yaml
  k8s_executor_config:
    job_name_prefix: data-prep
    namespace: default
    resource_limits:
      cpu: 1000m
      memory: 2048Mi
    resource_requests:
      cpu: 500m
      memory: 1024Mi
    service_account_name: default
  ```
- The Kubernetes job name is in this format: `mage-{job_name_prefix}-block-{block_run_id}`. The default `job_name_prefix` is `data-prep`. You can customize it in the k8s executor config, and you can interpolate the trigger name into the `job_name_prefix` field with the format `{trigger_name}`, e.g. `job_name_prefix: "{trigger_name}-data-prep"`.
- If you want to use GPU resources in your k8s executor, you can configure the GPU resources in the `k8s_executor_config` as below. Please make sure the GPU driver is installed and running on your nodes so the GPUs can be used.
  ```yaml
  k8s_executor_config:
    resource_limits:
      gpu-vendor.example/example-gpu: 1 # requesting 1 GPU
  ```
- To further customize the container config of the Kubernetes executor, you can specify the `container_config` or `job` in the k8s executor config. Here is an example:
  ```yaml
  k8s_executor_config:
    container_config:
      image: mageai/mageai:0.9.7
      env:
      - name: USER_CODE_PATH
        value: /home/src/k8s_project
    job:
      active_deadline_seconds: 120
      backoff_limit: 3
      ttl_seconds_after_finished: 86400
  ```
- You can configure the job template by using the `K8S_CONFIG_FILE` environment variable, which should point to the configuration file. Here is the format for the Kubernetes configuration template:
```yaml
# Kubernetes Configuration Template
metadata:
  annotations:
    application: "mage"
    composant: "executor"
  labels:
    application: "mage"
    type: "spark"
  namespace: "default"
pod:
  service_account_name: ""
  image_pull_secrets: "secret"
  volumes:
  - name: data-pvc
    persistent_volume_claim:
      claim_name: pvc-name
container:
  name: "mage-data"
  env:
  - name: "KUBE_NAMESPACE"
    value: "default"
  - name: "secret_key"
    value: "somesecret"
  image: "mageai/mageai:latest"
  image_pull_policy: "IfNotPresent"
  resources:
    limits:
      cpu: "1"
      memory: "1Gi"
    requests:
      cpu: "0.1"
      memory: "0.5Gi"
  volume_mounts:
  - mount_path: "/tmp/data"
    name: "data-pvc"
```
NB: When deploying Mage in a multi-container pod, you need to specify the environment variable `MAGE_CONTAINER_NAME`. If this variable is not set, Mage will default to using the first container in the pod. To specify the Mage container, you can use:
```yaml
env:
  - name: MAGE_CONTAINER_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
```
AWS ECS executor
You can choose to launch separate AWS ECS tasks to execute blocks by setting the block’s `executor_type` to `ecs` in the pipeline’s metadata.yaml file.
There are two ways to customize the ECS executor config:
- Specify the `ecs_config` in the project’s metadata.yaml file. Example config:
  ```yaml
  ecs_config:
    cpu: 1024
    memory: 2048
  ```
- Add the `executor_config` at the block level in the pipeline’s metadata.yaml file. Example config:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: ecs
    executor_config:
      cpu: 1024
      memory: 2048
  ```
To run the whole pipeline in one ECS executor, set the `executor_type` at the pipeline level and set `run_pipeline_in_one_process` to true. `executor_config` can also be set at the pipeline level. Here is an example pipeline metadata.yaml:
```yaml
blocks:
- ...
- ...
executor_type: ecs
run_pipeline_in_one_process: true
name: example_pipeline
...
```
Config fields
| Field name | Description | Example values |
| --- | --- | --- |
| `assign_public_ip` | Whether to assign a public IP to the ECS task. | true/false (default: true) |
| `cpu` | The CPU allocated to the ECS task. | 1024 |
| `enable_execute_command` | Whether to enable execute command for debugging. | true/false (default: false) |
| `launch_type` | The launch type of the ECS task. | FARGATE |
| `memory` | The memory allocated to the ECS task. | 2048 |
| `tags` | The tags of the ECS task. | ['tag1', 'tag2'] |
| `wait_timeout` | The maximum wait time for the ECS task, in seconds. The default is 600 (10 minutes); setting it to -1 disables waiting. | 1200 (default: 600) |
Example config
```yaml
ecs_config:
  cpu: 1024
  memory: 2048
  assign_public_ip: false
  enable_execute_command: true
  wait_timeout: 1200
```
Required IAM permissions for using the ECS executor:
```json
[
  "ec2:DescribeNetworkInterfaces",
  "ecs:DescribeTasks",
  "ecs:ListTasks",
  "ecs:RunTask"
]
```
GCP Cloud Run executor
If your Mage app is deployed on GCP Cloud Run, you can choose to launch separate GCP Cloud Run jobs to execute blocks.
How to configure a pipeline to use the GCP Cloud Run executor:
- Update the project’s metadata.yaml:
  ```yaml
  gcp_cloud_run_config:
    path_to_credentials_json_file: "/path/to/credentials_json_file"
    project_id: project_id
    timeout_seconds: 600
  ```
- Update the `executor_type` of the block to `gcp_cloud_run` in the pipeline’s metadata.yaml:
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: gcp_cloud_run
    ...
  ```
Customizing the compute resources of the GCP Cloud Run executor is coming soon.
Azure Container Instance executor
If your Mage app is deployed on Microsoft Azure with Mage’s Terraform scripts, you can choose to launch separate Azure container instances to execute blocks.
How to configure a pipeline to use the Azure Container Instance executor:
- Update the project’s metadata.yaml:
  ```yaml
  azure_container_instance_config:
    cpu: 1
    memory: 2
  ```
- Update the `executor_type` of the block to `azure_container_instance` in the pipeline’s metadata.yaml and optionally specify the `executor_config`. The block-level `executor_config` will override the global `executor_config`.
  ```yaml
  blocks:
  - uuid: example_data_loader
    type: data_loader
    upstream_blocks: []
    downstream_blocks: []
    executor_type: azure_container_instance
    executor_config:
      cpu: 1
      memory: 2
    ...
  ```
PySpark executor
If the pipeline type is “pyspark”, Mage uses PySpark executors for pipeline and block executions. You can customize the compute resources of the PySpark executor by updating the instance types of `emr_config` in the project’s metadata.yaml file.
Example config:
```yaml
emr_config:
  ec2_key_name: "xxxxx"
  master_instance_type: "r5.2xlarge"
  slave_instance_type: "r5.2xlarge"
```