Schedules
Schedule pipelines to run periodically
You can schedule your pipeline to run on these intervals:
- Once: The pipeline will run only once
- Hourly: The pipeline will run at the beginning of every hour (e.g. 0100, 0200, 0300)
- Daily: The pipeline will run at 0000 UTC / GMT every day
- Weekly: The pipeline will run at 0000 UTC / GMT the first day (Monday) of every week
- Monthly: The pipeline will run at 0000 UTC / GMT on the first day of every month
- Always on: The pipeline will run again as soon as the previous pipeline run finishes
- Custom: The pipeline will run on a schedule defined by a cron expression
- Every minute, using the cron expression:
  * * * * *
- Streaming pipeline: coming soon

Learn more about how to schedule pipelines.
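To make the fixed intervals concrete, the sketch below computes the next hourly, daily, weekly, and monthly boundary after a given moment, using only the Python standard library. The function `next_run` and its interval names are hypothetical and written for illustration; this is not the scheduler's actual implementation, only the boundaries described in the list above.

```python
from datetime import datetime, timedelta, timezone


def next_run(now: datetime, interval: str) -> datetime:
    """Return the next interval boundary strictly after `now`, in UTC.

    Hypothetical helper: `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)

    if interval == "hourly":
        # Beginning of the next hour.
        return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    if interval == "daily":
        # 0000 UTC of the next day.
        return midnight + timedelta(days=1)
    if interval == "weekly":
        # weekday() is 0 for Monday, so this lands on the next Monday at 0000 UTC.
        return midnight + timedelta(days=7 - now.weekday())
    if interval == "monthly":
        # First day of the next month at 0000 UTC, rolling over the year in December.
        if now.month == 12:
            return midnight.replace(year=now.year + 1, month=1, day=1)
        return midnight.replace(month=now.month + 1, day=1)
    raise ValueError(f"Unsupported interval: {interval}")


if __name__ == "__main__":
    moment = datetime(2023, 7, 23, 7, 30, tzinfo=timezone.utc)  # a Sunday
    for name in ("hourly", "daily", "weekly", "monthly"):
        print(name, next_run(moment, name).isoformat())
```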
Landing times
If the trigger type is a schedule, you can choose the pipeline’s landing time (when the run should finish) instead of choosing when the pipeline will start.
This is useful if you want your data pipeline to finish at a specific time rather than start at a specific time.
For example, a pipeline is scheduled to land daily at 17:00 UTC. If this pipeline takes 10 hours on average to complete, then the pipeline will be triggered and start running at 07:00 UTC each day.
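The arithmetic behind a landing time can be sketched as the landing time minus the average runtime. The helper `estimated_start_time` below is hypothetical, and how the scheduler actually measures the average runtime is not described here, so treat this as an illustration only.

```python
from datetime import datetime, timedelta, timezone


def estimated_start_time(landing_time: datetime, average_runtime: timedelta) -> datetime:
    """Return when a run would need to start in order to finish by `landing_time`."""
    return landing_time - average_runtime


# A 17:00 UTC landing time with a 10-hour average runtime (the date is arbitrary).
landing = datetime(2023, 7, 23, 17, 0, tzinfo=timezone.utc)
start = estimated_start_time(landing, timedelta(hours=10))
print(start.isoformat())  # 2023-07-23T07:00:00+00:00
```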
Enable landing time
When editing a trigger with schedule type, there is a toggle called Enable landing time.
Turn that on to enable landing time for the trigger.
Configure landing time
Depending on the frequency, you can configure the following parts of the time by which the pipeline should complete:
- Day of the month
- Day of the week
- Hour of the day
- Minute of the hour
- Second of the minute
Extra runtime variables from pipeline run
Default runtime variables from pipeline run:
Key | Description | Example |
---|---|---|
env | The value is always prod in pipeline runs. The value is dev when running blocks in the notebook. | prod |
event | If the pipeline is triggered by an event, the event variable contains the event payload. | {"key1": "value1"} |
execution_date | The datetime at which the pipeline run is executed. | datetime.datetime(2023, 12, 25, 12, 25, 20) |
execution_partition | An automatically formatted partition of the pipeline run using the execution date. | 10/20231225T122520 |
pipeline_run_id | The id of the pipeline run. | 10 |
trigger_name | The trigger name of the pipeline run. | test run 1 |
If your pipeline run belongs to a trigger that is scheduled, then the following extra variables are available in your Python block’s keyword arguments (e.g. kwargs).
Key | Description | Example |
---|---|---|
interval_start_datetime | The datetime when the pipeline run is scheduled for. | datetime.datetime(2023, 7, 23, 7, 0, 0, 0) |
interval_end_datetime | The datetime when the next pipeline run is scheduled for. | datetime.datetime(2023, 7, 24, 7, 0, 0, 0) |
interval_seconds | The number of seconds between the current pipeline run and the next pipeline run. | 86400 |
interval_start_datetime_previous | The datetime when the previous pipeline run was scheduled for. | datetime.datetime(2023, 7, 22, 7, 0, 0, 0) |
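In a Python block, these values can be read from the keyword arguments. The sketch below uses a plain function named `describe_interval` (a hypothetical name) and omits any block decorators or imports a real block would have. It reads the variables with `kwargs.get` so that a run that is not tied to a scheduled trigger, where the keys are absent, is handled instead of raising a `KeyError`.

```python
from datetime import datetime


def describe_interval(**kwargs) -> dict:
    """Collect the schedule interval variables passed to a block via kwargs."""
    start = kwargs.get('interval_start_datetime')
    end = kwargs.get('interval_end_datetime')
    seconds = kwargs.get('interval_seconds')
    previous = kwargs.get('interval_start_datetime_previous')

    # The interval variables are only present for scheduled triggers.
    if start is None or end is None:
        return {}

    return {
        'window_start': start,
        'window_end': end,
        'window_seconds': seconds,
        'previous_window_start': previous,
    }


# Simulate the kwargs of a daily scheduled run, using the example values from the table.
info = describe_interval(
    interval_start_datetime=datetime(2023, 7, 23, 7, 0, 0, 0),
    interval_end_datetime=datetime(2023, 7, 24, 7, 0, 0, 0),
    interval_seconds=86400,
    interval_start_datetime_previous=datetime(2023, 7, 22, 7, 0, 0, 0),
)
print(info['window_start'], info['window_end'], info['window_seconds'])
```

A common use for the window is filtering source data to rows whose timestamp falls between `window_start` and `window_end`, so each scheduled run processes only its own interval.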
SQL block
If you’re using a SQL block, here is an example of how you can access these variables:
SELECT
'{{ interval_start_datetime }}' AS interval_start_datetime
, '{{ interval_end_datetime }}' AS interval_end_datetime
, '{{ interval_seconds }}' AS interval_seconds
, '{{ interval_start_datetime_previous }}' AS interval_start_datetime_previous