Project Structure
Details about how the Mage file directory works and how typical projects are structured.
Overview
A Mage project lives inside its own folder. That folder is where Mage data is stored (mage_data) and where individual projects are housed. You can think of a project as an environment: it contains configurations, pipelines, and other project-specific data.
Project structure
Here is a sample project and a sample folder structure:
.
├── mage_data
└── my-first-project
    ├── charts
    ├── custom
    ├── data_exporters
    ├── data_loaders
    ├── dbt
    ├── extensions
    ├── pipelines
    │   └── demo
    │       ├── __init__.py
    │       └── metadata.yaml
    ├── scratchpads
    ├── transformers
    ├── utils
    ├── __init__.py
    ├── io_config.yaml
    ├── metadata.yaml
    └── requirements.txt
Let’s walk through each folder:
- Each block type has its own directory, for example data_loaders, transformers, and data_exporters.
- dbt assets are stored in the dbt directory.
- Each pipeline is represented by a YAML file in the pipelines folder under the Mage project directory: [project_dir]/pipelines/[pipeline_name]/metadata.yaml
- The utils folder is meant to hold custom utilities for your project, for example Python scripts.
- The extensions folder is used for Mage extensions that integrate other data tools, like Great Expectations.
- Data loader and exporter configurations are stored in the io_config.yaml file.
- The metadata.yaml file contains project-level metadata. There’s a metadata file in each pipeline as well.
Code can be shared across the entire project.
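As a rough illustration of the layout above, the following sketch discovers pipelines by looking for [project_dir]/pipelines/[pipeline_name]/metadata.yaml. The function name find_pipelines is hypothetical and is not part of Mage; it relies only on the folder convention described in this section.

```python
from pathlib import Path


def find_pipelines(project_dir):
    """Map each pipeline name to its metadata.yaml path.

    Hypothetical helper: it assumes only the documented layout
    [project_dir]/pipelines/[pipeline_name]/metadata.yaml.
    """
    pipelines_dir = Path(project_dir) / "pipelines"
    found = {}
    if not pipelines_dir.is_dir():
        return found
    # Each immediate subfolder that contains a metadata.yaml is one pipeline.
    for metadata_file in sorted(pipelines_dir.glob("*/metadata.yaml")):
        found[metadata_file.parent.name] = metadata_file
    return found
```

For the sample project above, find_pipelines("my-first-project") would return a single entry keyed by demo.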
Project organization
We know that you’re hard at work creating blocks and building pipelines, so we’ve made it easy to organize your project.
Blocks can be organized in subdirectories within their respective folders, with no additional configuration required. For example, you might have a data_loaders folder with subdirectories for api, database, and file data loaders.
You can do the same when creating blocks via the UI: simply name the block subfolder_name/block_name and the subfolder will be created automatically.
This might look like:
.
└── data_loaders
    ├── api
    │   ├── load_bamboo.py
    │   ├── load_recurly.py
    │   └── load_stripe.py
    ├── database
    │   ├── load_duckdb.py
    │   └── load_influxdb.py
    └── file
        ├── load_avro.py
        └── load_parquet.py
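To review how blocks are spread across subfolders, a small script can group block files by the subdirectory they live in. This is only a sketch: blocks_by_subfolder is a hypothetical helper, not a Mage API, and it assumes blocks are plain .py files under a block folder such as data_loaders.

```python
from collections import defaultdict
from pathlib import Path


def blocks_by_subfolder(block_dir):
    """Group block file names (without .py) by their subfolder.

    Hypothetical helper; files directly inside block_dir are grouped under ".".
    """
    block_dir = Path(block_dir)
    groups = defaultdict(list)
    for py_file in sorted(block_dir.rglob("*.py")):
        if py_file.name == "__init__.py":
            continue  # package marker, not a block
        subfolder = py_file.relative_to(block_dir).parent.as_posix()
        groups[subfolder].append(py_file.stem)
    return dict(groups)
```

Run against the data_loaders folder shown above, this would produce three groups: api, database, and file.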
Project data
Project-level data is stored in the mage_data directory, located at the same level as your project folder.
mage_data holds project-level cache data and stores the results of block runs, which are then returned to the user via the UI. Here are the locations for various components of mage_data:
- SQLite DB: mage_data/[project_folder]/mage-ai.db
  - The SQLite DB stores project-level metadata, such as pipeline and block run data. Here’s a sample tree of the tables in the tables schema:
    └── tables
        ├── backfill
        ├── block_run
        ├── event_matcher
        ├── oauth2_access_token
        ├── oauth2_application
        ├── permission
        ├── pipeline_run
        ├── pipeline_schedule
        ├── pipeline_schedule_event_matcher_association
        ├── role
        ├── secret
        ├── sqlite_master
        ├── user
        └── user_role
  - You can connect to the SQLite DB like any other database. The JDBC URL is jdbc:sqlite:PATH, where PATH is /home/src/mage_data/[project_folder]/mage-ai.db for Docker installs or ~/.mage_data/[project_folder]/mage-ai.db for pip installs.
- Block output: mage_data/[project_folder]/pipelines/[pipeline_uuid]/.variables/
- Pipeline and block logs: mage_data/[project_folder]/pipelines/[pipeline_uuid]/.logs/
- Cache: mage_data/[project_folder]/.cache
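Besides a JDBC client, the SQLite file can be inspected with Python’s built-in sqlite3 module. The sketch below lists the tables in a database file; list_tables is a hypothetical helper, and the file is opened read-only so that inspecting it cannot modify Mage’s metadata. The tables you see may differ from the sample tree above depending on your Mage version.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, opened read-only."""
    # as_uri() percent-encodes the path; mode=ro prevents accidental writes.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

For a pip install, you would pass the path given above, for example list_tables("~/.mage_data/[project_folder]/mage-ai.db") with your own project folder substituted.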
To clear project block output, cache data, or log data, consider performing routine cleanup on the corresponding locations in this folder. This can help reduce storage size, optimize cost, or otherwise maintain your Mage project.
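Before cleaning anything up, it can help to see which component takes the most space. The following sketch totals the bytes under the .variables, .logs, and .cache locations listed above. The names dir_size and storage_report are hypothetical, and the argument is assumed to be the mage_data/[project_folder] directory.

```python
from pathlib import Path


def dir_size(path):
    """Total size in bytes of all files under path (0 if it does not exist)."""
    path = Path(path)
    if not path.is_dir():
        return 0
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def storage_report(project_data_dir):
    """Summarize bytes used by block output, logs, and cache.

    Assumes the documented layout under mage_data/[project_folder].
    """
    root = Path(project_data_dir)
    report = {"variables": 0, "logs": 0, "cache": dir_size(root / ".cache")}
    pipelines_dir = root / "pipelines"
    if pipelines_dir.is_dir():
        for pipeline_dir in pipelines_dir.iterdir():
            report["variables"] += dir_size(pipeline_dir / ".variables")
            report["logs"] += dir_size(pipeline_dir / ".logs")
    return report
```

The report is read-only by design. Deleting data is left as a deliberate, separate step.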
Changing the name of your project
Docker
If you’re running Mage in Docker, change the environment variable USER_CODE_PATH to the absolute path where your project lives.
For example, if your original project name was demo_magic and the directory is located at /home/src/demo_magic, then the value of USER_CODE_PATH should be: /home/src/demo_magic.
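A quick sanity check for this setting is to read the variable and confirm that it is an absolute path. The sketch below is not Mage’s own logic: describe_user_code_path is a hypothetical helper, and treating the final path component as the project folder name is an assumption based on the example above.

```python
import os
from pathlib import PurePosixPath


def describe_user_code_path(environ=os.environ):
    """Inspect USER_CODE_PATH; returns None if it is unset or empty.

    Paths are parsed as POSIX paths because they refer to a location
    inside a Linux container.
    """
    raw = environ.get("USER_CODE_PATH")
    if not raw:
        return None
    path = PurePosixPath(raw)
    return {
        "path": str(path),
        # Assumption: the last path component is the project folder name.
        "folder_name": path.name,
        "is_absolute": path.is_absolute(),
    }
```

With USER_CODE_PATH set to /home/src/demo_magic, the folder name is demo_magic and is_absolute is true. A relative value such as demo_magic would be flagged as not absolute.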
Not using Docker
When you run mage start [project_name] on the command line to start Mage, the first command-line argument is used as your project’s name.