PySpark executor

When the pipeline type is pyspark, pipeline and block executions run on PySpark executors. To customize the compute resources of the PySpark executor, update the instance types under emr_config in the project's metadata.yaml file.

Example:

emr_config:
  ec2_key_name: "xxxxx"
  master_instance_type: "r5.2xlarge"
  slave_instance_type: "r5.2xlarge"
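
If you change instance types often, the edit can be scripted. The following is a minimal sketch, assuming the contents of metadata.yaml have already been parsed into a Python dictionary (for example with a YAML library). The helper name `with_instance_types` and the replacement instance types are hypothetical and chosen only for illustration; they are not part of any library API.

```python
from copy import deepcopy


def with_instance_types(metadata, master_type, slave_type):
    """Return a copy of the project metadata with new EMR instance types.

    Hypothetical helper for illustration only. The input dict is not modified.
    """
    updated = deepcopy(metadata)
    # Create emr_config if the project metadata does not define it yet.
    emr_config = updated.setdefault("emr_config", {})
    emr_config["master_instance_type"] = master_type
    emr_config["slave_instance_type"] = slave_type
    return updated


# Mirrors the emr_config example above, as already-parsed YAML.
metadata = {
    "emr_config": {
        "ec2_key_name": "xxxxx",
        "master_instance_type": "r5.2xlarge",
        "slave_instance_type": "r5.2xlarge",
    }
}

# Example instance types; pick whatever fits your workload.
resized = with_instance_types(metadata, "r5.4xlarge", "r5.xlarge")
print(resized["emr_config"])
```

The helper returns a modified copy instead of mutating its input, so the original settings remain available for comparison before the result is written back to metadata.yaml.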

Spark compute resource manager

Manage your Spark compute resources and track Spark pipeline execution metrics.
