Chroma
Credentials
Open the file named io_config.yaml at the root of your Mage project and enter the required Chroma fields:
version: 0.1.1
default:
  CHROMA_COLLECTION: collection_name
  CHROMA_PATH: path of the Chroma persistent storage
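As a rough illustration of how these two values could be consumed, the sketch below pulls them out of a profile dictionary and opens a persistent Chroma collection. The helper names `chroma_settings` and `open_collection` are hypothetical, and mapping CHROMA_PATH onto `chromadb.PersistentClient(path=...)` is an assumption; Mage's own Chroma block may wire the values differently.

```python
from typing import Dict, Tuple

REQUIRED_KEYS = ("CHROMA_COLLECTION", "CHROMA_PATH")


def chroma_settings(profile: Dict[str, str]) -> Tuple[str, str]:
    """Return (collection_name, storage_path) from an io_config profile dict."""
    missing = [key for key in REQUIRED_KEYS if not profile.get(key)]
    if missing:
        raise KeyError(f"Missing Chroma settings: {', '.join(missing)}")
    return profile["CHROMA_COLLECTION"], profile["CHROMA_PATH"]


def open_collection(profile: Dict[str, str]):
    """Open (or create) the configured collection; requires chromadb."""
    import chromadb  # imported lazily so the settings check works without it

    name, path = chroma_settings(profile)
    # Assumption: CHROMA_PATH is the directory of a persistent Chroma store.
    client = chromadb.PersistentClient(path=path)
    return client.get_or_create_collection(name=name)
```

Calling `chroma_settings` before opening the client surfaces a missing key as a clear error instead of a failure deep inside the storage layer.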
Dependencies
The dependency libraries are not installed in the Docker image by default. You’ll need to add them to your project’s requirements.txt file manually and install them.
chromadb>=0.4.17
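After installing, a quick check along the following lines can confirm that the package is visible to the Python environment. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Mage.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("chromadb")
    if version is None:
        print("chromadb is not installed; add it to requirements.txt and reinstall.")
    else:
        print(f"chromadb {version} is installed.")
```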
Using Python block
- Create a new pipeline or open an existing pipeline.
- Add a data loader or data exporter block from a template. You can find the “Chroma” template under the “Databases” category.
- Chroma data loader arguments:
  - n_results: number of results to return.
  - collection: the collection to query. If omitted, the collection defined in io_config.yaml is used.
  - query_texts: the texts or documents to query with.
  - query_embeddings: the embeddings to query with.
  Provide either query_texts or query_embeddings.
- Chroma data exporter arguments:
  - df: the DataFrame that contains the data to export.
  - collection: the collection to write to. If omitted, the collection defined in io_config.yaml is used.
  - document_column: the column in df that contains the documents to store in Chroma.
  - id_column: the column in df that contains the ID of each document.
  - metadata_column: the column in df that contains the metadata of each document (each value should be a dictionary).
- Add your custom code to the loader or exporter, or add extra transformer blocks.
- Run the block.
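The loader and exporter arguments listed above can be sketched as two small helpers. This is an illustration, not the code the Chroma template generates: the names `build_query_kwargs` and `rows_to_payload` are hypothetical, rows are assumed to be a list of dictionaries, and "either query_texts or query_embeddings" is read here as exactly one of the two.

```python
from typing import Any, Dict, List, Optional


def build_query_kwargs(
    n_results: int,
    query_texts: Optional[List[str]] = None,
    query_embeddings: Optional[List[List[float]]] = None,
) -> Dict[str, Any]:
    """Validate loader arguments and return keyword arguments for a query."""
    # Assumption: exactly one of the two query inputs must be supplied.
    if (query_texts is None) == (query_embeddings is None):
        raise ValueError("Provide exactly one of query_texts or query_embeddings.")
    if n_results < 1:
        raise ValueError("n_results must be a positive integer.")
    kwargs: Dict[str, Any] = {"n_results": n_results}
    if query_texts is not None:
        kwargs["query_texts"] = query_texts
    else:
        kwargs["query_embeddings"] = query_embeddings
    return kwargs


def rows_to_payload(
    rows: List[Dict[str, Any]],
    id_column: str,
    document_column: str,
    metadata_column: Optional[str] = None,
) -> Dict[str, List[Any]]:
    """Split rows into the parallel lists an exporter would hand to Chroma."""
    payload: Dict[str, List[Any]] = {
        # IDs are converted to strings, since Chroma expects string IDs.
        "ids": [str(row[id_column]) for row in rows],
        "documents": [row[document_column] for row in rows],
    }
    if metadata_column is not None:
        metadatas = [row[metadata_column] for row in rows]
        if not all(isinstance(item, dict) for item in metadatas):
            raise TypeError("Every metadata value must be a dictionary.")
        payload["metadatas"] = metadatas
    return payload
```

With chromadb installed, the results could be passed on as `collection.query(**kwargs)` and `collection.add(**payload)`. With pandas, `df.to_dict(orient="records")` yields rows in the shape assumed here.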