Generate and orchestrate the final output of a pipeline (aka data product). A data product can be referenced globally from any pipeline, and its output data can be used in any block.
A global data product is registered (in global_data_products.yaml) under a unique ID (UUID), and it references an existing pipeline.
This feature is important because multiple pipelines can depend on a single global data product
without each of them having to regenerate it.
In addition, the global data product doesn't have to run unless something needs its data and
that data is outdated. The data is outdated if the global data product hasn't run for a preset amount of time.
This preset time is configurable.
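The staleness rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not Mage's implementation: the function name `is_outdated` and its parameters are hypothetical, and the choice to treat a product that has never run as outdated is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_outdated(
    last_run_at: Optional[datetime],
    outdated_after: timedelta,
    now: datetime,
) -> bool:
    """Return True when the data product should be regenerated.

    Hypothetical helper: a product that has never run, or whose last run
    is at least `outdated_after` old, counts as outdated.
    """
    if last_run_at is None:
        return True
    return now - last_run_at >= outdated_after


now = datetime(2024, 1, 2, 12, 0, 0)
one_day = timedelta(days=1)

# Last run 6 hours ago: still fresh, so no new run is needed.
print(is_outdated(datetime(2024, 1, 2, 6, 0, 0), one_day, now))  # False

# Last run more than a day ago: outdated, so the next request triggers a run.
print(is_outdated(datetime(2023, 12, 31, 6, 0, 0), one_day, now))  # True
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check deterministic and easy to test.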
Example
Suppose you have a computationally expensive data pipeline called users_ltv that generates the lifetime value (LTV) of each user, and 2 downstream pipelines that require the data from the users_ltv pipeline. Using a global data product ensures that the users_ltv pipeline is run only once, as long as the users_ltv data product isn't outdated.
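The scenario above can be mimicked with a small in-memory cache. This is a toy sketch under stated assumptions, not how Mage stores or triggers global data products: the class `CachedDataProduct`, the function `compute_users_ltv`, and the sample LTV values are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional


class CachedDataProduct:
    """Toy stand-in for a global data product (hypothetical, not Mage's API)."""

    def __init__(self, compute: Callable[[], dict], outdated_after: timedelta):
        self.compute = compute
        self.outdated_after = outdated_after
        self.last_run_at: Optional[datetime] = None
        self.data: Optional[dict] = None
        self.run_count = 0

    def get(self, now: datetime) -> dict:
        # Recompute only if there is no data yet or the data is too old.
        stale = (
            self.last_run_at is None
            or now - self.last_run_at >= self.outdated_after
        )
        if stale:
            self.data = self.compute()
            self.last_run_at = now
            self.run_count += 1
        return self.data


def compute_users_ltv() -> dict:
    # Placeholder for the expensive users_ltv pipeline; the values are made up.
    return {"user_1": 120.0, "user_2": 45.5}


users_ltv = CachedDataProduct(compute_users_ltv, outdated_after=timedelta(days=1))

start = datetime(2024, 1, 1, 9, 0, 0)

# Two downstream pipelines request the data an hour apart.
ltv_for_pipeline_a = users_ltv.get(now=start)
ltv_for_pipeline_b = users_ltv.get(now=start + timedelta(hours=1))
print(users_ltv.run_count)  # 1

# A request made after the window has passed triggers a second run.
users_ltv.get(now=start + timedelta(days=2))
print(users_ltv.run_count)  # 2
```

Both downstream requests inside the one-day window reuse the same result, so the expensive computation runs once. Only the request made after the window expires causes a rerun.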
Pipeline
.
The only object type that is currently supported is Pipeline. The Block object type is coming soon.
Global data product
.
After you select an existing global data product, it's added to the current pipeline as
a block.