Amazon EC2

Mage can be run in an Amazon EC2 instance, either by

  • using SSH to port forward localhost requests to your EC2 instance
  • opening port 6789 on your EC2 instance for access

Prerequistes

Your EC2 instance must have python3 installed. All Python versions between 3.7.0 (inclusive) and 3.10.0 (exclusive) are supported.

When launching an EC2 instance, we suggest using the following AMI:

AMI IDAMI name
ami-0d70546e43a941d70ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20220609

Quick Connection via SSH

Install Python

Once the EC2 instance is in a running state, SSH into the instance. You can SSH using the UI in AWS or you can SSH from your terminal using a command like the following:

ssh -i [path_to_key_pair] [ec2_username]@[ec2_public_dns_name]

Note: the ec2_username is typically ubuntu if you used the AMI we recommended above.

An example SSH command could look like this:

ssh -i ~/.ssh/aws-ec2.pem ubuntu@ec2-55-186-46-136.us-west-2.compute.amazonaws.com

Once you SSH into the instance, run the following command to install Python and pip:

sudo apt update && \
  sudo apt install -y software-properties-common && \
  sudo add-apt-repository -y ppa:deadsnakes/ppa && \
  sudo apt install -y python3.7 \
  sudo apt install -y python3-pip

Run script to start Mage

We provide the scripts/run_ec2.sh script to launch Mage in an EC2 instance. Currently only connections via SSH are supported. To access this script clone this repository.

To run Mage in an EC2 instance, you need to provide

  • Path to the key pair used with the EC2 instance
  • Your EC2 instance username
  • Your EC2 public DNS name
./scripts/run_ec2.sh [path_to_key_pair] [ec2_user_name] [ec2_public_dns_name]

Note: the ec2_username is typically ubuntu if you used the AMI we recommended above.

You can optionally add --name custom_repo_name to the end of the command above to name your project something different than the default of default_repo.

An example command could look like this:

./scripts/run_ec2.sh ~/.ssh/aws-ec2.pem ubuntu@ec2-55-186-46-136.us-west-2.compute.amazonaws.com --name demo_project

This script will

  1. Connect to your EC2 instance
  2. Install Mage if not already installed on instance
  3. Create a default repository (default repository name is ‘default_repo’, use --name to specify your own custom repository name)
  4. Launch the Mage tool pointing at this repository

To access the Mage tool, open localhost:6789. All actions made here will be forwarded to your EC2 instance for execution.

To quit out of Mage, stop execution in your current terminal window (Ctrl+C). This will shutdown the Mage tool alongside closing the connection to your EC2 instance.


Manual connection via Open Port

You can also connect to the Mage app running on your EC2 instance by editing your security group to expose the port that Mage runs on.

  1. When creating your EC2 instance, edit your security group rules to allow a Custom TCP connection to port 6789 (the port that Mage runs on).

  2. Connect to your EC2 instance via SSH and install Mage if you haven’t already:

    ssh -i path/to/key/pair.pem ec2_username@ec2_public_dns_name
    pip install mage-ai
    
  3. Start Mage on your EC2 instance:

    mage init <your_repo_name>
    mage start <your_repo_name>
    
  4. Then from your browser, access <ec2-instance-public-ip>:6789 to access the Mage app, where ec2-instance-public-ip is the Public IPv4 address of your EC2 instance.


Manual connection via SSH

You can also manually start a connection to the Mage tool running on an EC2 instance. This process involves opening two separate SSH connections:

  1. Connect to the EC2 instance to run the Mage Tool. First, open an SSH connection with your EC2 Key Pair and your EC2 login information to connect the EC2 instance.

    ssh -i path/to/key/pair.pem ec2_username@ec2_public_dns_name
    

    If the Mage tool is not installed, install Mage using pip. It is recommended to use a virtual environment to store the Python packages to isolate your Python dependencies per project.

    python3 -m venv env
    source ./env/bin/activate
    pip3 install mage-ai
    

    Next, initalize your Mage repository using mage init. You can change your repository name as per your preference.

    mage init default_repo
    

    Then start the Mage app. This will start the notebook at localhost:6789, but you can’t directly access it yet as this notebook is started and is executed on the EC2 instance

    mage start default_repo
    
  2. Forward all requests sent to localhost:6789 on your computer back to the EC2 instance. This means that as you interact the user interface on your computer, the data will be forwarded to the EC2 instance for execution. Open another terminal window and run

    ssh -i path/to/key/pair.pem -N -L 6789:localhost:6789 ec2_username@ec2_public_dns_name
    

    which opens up a SSH connection to your EC2 instance which forwards requests to your localhost.

    • The -N option tells your SSH client to not send user commands to the EC2 server through this connection since this connection is only for port forwarding

Now you can access the Mage tool at localhost:6789 on your computer, and all executions will be processed on your EC2 instance!


Amazon ECS

You can launch Mage in an Amazon Elastic Container Service (ECS) cluster, allowing you to perform tasks like:

Running the Mage App in ECS

Follow the steps below to launch the Mage app on your ECS cluster:

  1. Create a new Docker image for launching the Mage App— you can start with our template. Upload this Docker image to a repository like Amazon ECR so ECS tasks can pull and create Docker containers using this image.

  2. Create a new ECS cluster. Both Fargate and EC2 based clusters work for this task.

  3. Create a new task definition based off the template below. This task, when run:

    1. Creates a new Mage repository named “default_repo”.
    2. Starts the Mage app, serving requests sent to “localhost:6789”
    {
      "containerDefinitions": [
        {
          "name": "mage-data-prep-start",
          "image": "[aws-account-id].dkr.ecr.[region].amazonaws.com/[your-image-repo-name]:[tag]",
          "portMappings": [
            {
              "containerPort": 6789,
              "hostPort": 6789,
              "protocol": "tcp"
            }
          ],
          "essential": true,
          "entryPoint": ["sh", "-c"],
          "command": ["mage init default_repo && mage start default_repo"],
          "interactive": true,
          "pseudoTerminal": true
        }
      ],
      "family": "mage-data-prep",
      "networkMode": "awsvpc",
      "requiresCompatibilities": ["FARGATE", "EC2"],
      "cpu": "256",
      "memory": "512",
      "executionRoleArn": "arn:aws:iam::[aws-account-id]:role/[ecs-task-execution-role-name]"
    }
    

    You must specify:

    • "image" - Docker image URI. If using ECR, you can use the template above and fill in the following information:

      ParameterDescription
      aws-account-idAWS Account ID
      regionRegion in which ECR is being used
      your-image-repo-nameName of the repository holding your repo
      tagtag for the version of the image to use
    • "executionRoleArn" - Name of the ECS Task Execution Role. Assigning this role enables this task to make AWS API calls. This is needed to pull the Docker image from an ECR repository. Fill in the following information:

      ParameterDescription
      aws-account-idAWS Account ID
      ecs-task-execution-role-nameName of task execution IAM role

      This parameter can be ignored if you are not using Amazon ECR

    In addition, you may want to edit the allocated CPU and Memory resources based on the type of data you plan to handle and intensity of the computations performed.

  4. Launch the task in your new cluster. Make sure the following conditions are met:

    • Task is launched in a VPC with a public subnet and has a Public IP assigned. Allows you to connect to the ECS task and use the code editor.
    • Task security group allows an inbound TCP connection to port 6789 from your IP (or the IP with which the app will be accessed). Enables you to connect to port 6789 where the Mage app is running.
    • The task has network routes to any service that you plan to connect to, such as data warehouses or Amazon ECR. Simplest way to add these network routes are to add outbound connection rules to the security group.
  5. Find the public IP of your task. Then to access the Mage app, go to your-public-ip:6789. If you followed the previous steps correctly, you should be able to access the Mage app running in your cluster.

Running Mage Pipeline in ECS

If you already have a Mage pipeline developed, you can deploy the pipeline as an ECS task that periodically executes.

Follow the steps below to setup running a Mage pipeline in ECS.

Prerequisite: You must have already created a Mage repository containing the pipeline you wish to deploy to ECS. Make sure this repository includes any data files and configuration setting files needed for the pipeline to run.

  1. Create a new Docker image for running your pipeline— you can start with our template. Make sure to add any other environment variables defined locally to the image to be able to access those secrets in the cluster. Upload this Docker image to a repository like Amazon ECR so ECS tasks can pull and deploy containers using the image.

  2. If not already created, create a new ECS cluster. This task can be run on both Fargate and EC2 based instances

  3. Create a new task definition based off the template below. This task calls mage run on your pipeline when started.

    {
      "containerDefinitions": [
        {
          "name": "mage-data-prep-deploy",
          "image": "[aws-account-id].dkr.ecr.[region].amazonaws.com/[your-image-repo-name]:[tag]",
          "essential": true,
          "entryPoint": ["sh", "-c"],
          "command": ["mage run default_repo [your-pipeline-name]"],
          "interactive": true,
          "pseudoTerminal": true
        }
      ],
      "family": "mage-data-prep",
      "networkMode": "awsvpc",
      "requiresCompatibilities": ["FARGATE"],
      "cpu": "1024",
      "memory": "2048",
      "executionRoleArn": "arn:aws:iam::[aws-account-id]:role/[ecs-task-execution-role-name]"
    }
    

    You must specify:

    • "image" - Docker image URI. If using ECR, you can use the template above and fill in the following information:

      ParameterDescription
      aws-account-idAWS Account ID
      regionRegion in which ECR is being used
      your-image-repo-nameName of the repository holding your repo
      tagtag for the version of the image to use
    • "executionRoleArn" - Name of the ECS Task Execution Role. Assigning this role enables this task to make AWS API calls. This is needed to pull the Docker image from an ECR repository. Fill in the following information:

      ParameterDescription
      aws-account-idAWS Account ID
      ecs-task-execution-role-nameName of task execution IAM role

      This parameter can be ignored if you are not using Amazon ECR

    • your-pipeline-name - name of the pipeline to run. This pipeline must be stored in the repository you added to your Docker image.

    In addition, you may want to edit the allocated CPU and Memory resources based on the type of data you plan to handle and intensity of the computations performed.

  4. Launch the task in your new cluster. Make sure your task has network routes to any service that you plan to connect to, such as data warehouses or Amazon ECR. Simplest way to add these network routes are to add outbound connection rules to the security group.

  5. Your pipeline is now running in the ECS task