Backfills
Backfilling pipelines
Run a pipeline multiple times.
A backfill creates 1 or more pipeline runs for a pipeline. There are 2 types of backfills:
- Date and time window
- Custom code
Date and time window
Create 1 or more pipeline runs between 2 datetime values.
The datetime of each step in the window is used as the execution_date for the corresponding pipeline run.
For example, if the backfill has the following attributes:
Attribute | Value |
---|---|
Start datetime | 2023-01-01T03:00:00 |
End datetime | 2023-01-05T03:00:00 |
Interval type | day |
Interval units | 2 |
Then the following pipeline runs will be created:
id | execution_date | ds | hr |
---|---|---|---|
1 | 2023-01-01T03:00:00 | 2023-01-01 | 03 |
2 | 2023-01-03T03:00:00 | 2023-01-03 | 03 |
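
The way these rows follow from the attributes can be sketched in a few lines of Python. This is an illustration only, not the tool's own implementation: `window_runs` is a hypothetical helper, and it assumes the end datetime is exclusive, which is consistent with the two runs listed in the table above.

```python
from datetime import datetime, timedelta


def window_runs(start, end, interval_days):
    """Hypothetical helper: list the run variables for a date and time window."""
    runs = []
    current = start
    # Assumption: the end datetime is exclusive, matching the table above.
    while current < end:
        runs.append(dict(
            execution_date=current.isoformat(),
            ds=current.strftime('%Y-%m-%d'),  # date part
            hr=current.strftime('%H'),        # zero-padded hour
        ))
        current += timedelta(days=interval_days)
    return runs


runs = window_runs(
    datetime(2023, 1, 1, 3),
    datetime(2023, 1, 5, 3),
    interval_days=2,
)
for run in runs:
    print(run)
```

Each dictionary corresponds to one row of the table: the step's datetime becomes execution_date, and ds and hr are derived from it.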
Custom code
The output of the backfill code is used to generate the pipeline runs. Each item in the output produces one pipeline run.
For example, if the backfill code has the following content:
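    backfill_data = []
    # Build one dict per pipeline run; its keys become columns of the run.
    for index in range(3):
        backfill_data.append(dict(
            partition=index,
            power=5,
        ))

    # The final expression is the output of the backfill code.
    backfill_data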
Then the following pipeline runs will be created:
id | execution_date | ds | hr | partition | power |
---|---|---|---|---|---|
1 | 2023-01-01T00:00:00 | 2023-01-01 | 00 | 0 | 5 |
2 | 2023-01-01T00:00:00 | 2023-01-01 | 00 | 1 | 5 |
3 | 2023-01-01T00:00:00 | 2023-01-01 | 00 | 2 | 5 |