Add credentials

Before you begin, you’ll need to create a service account key. Please read Google Cloud’s documentation to learn how to create one.

Once you’re finished, follow these steps:

  1. Create a new pipeline or open an existing pipeline.
  2. Expand the left side of your screen to view the file browser.
  3. Scroll down and click on a file named io_config.yaml.
  4. Enter the following keys and values under the key named default (you can have multiple profiles; add them under whichever profile is relevant to you).
  5. Note: you only need the keys under GOOGLE_SERVICE_ACC_KEY or the value for the key GOOGLE_SERVICE_ACC_KEY_FILEPATH; both are not required at the same time. If you use GOOGLE_SERVICE_ACC_KEY_FILEPATH, delete GOOGLE_SERVICE_ACC_KEY from io_config.yaml.
version: 0.1.1
default:
  GOOGLE_SERVICE_ACC_KEY:
    type: service_account
    project_id: project-id
    private_key_id: key-id
    private_key: "-----BEGIN PRIVATE KEY-----\nyour_private_key\n-----END PRIVATE KEY-----\n"
    client_email: your_service_account_email
    auth_uri: "https://accounts.google.com/o/oauth2/auth"
    token_uri: "https://accounts.google.com/o/oauth2/token"
    auth_provider_x509_cert_url: "https://www.googleapis.com/oauth2/v1/certs"
    client_x509_cert_url: "https://www.googleapis.com/robot/v1/metadata/x509/your_service_account_email"
  GOOGLE_SERVICE_ACC_KEY_FILEPATH: "/path/to/your/service/account/key.json"
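
To quickly confirm the profile loads and the credentials work, you can run a trivial query through Mage’s own BigQuery client in a scratch block. This is a minimal sketch assuming the default profile and an io_config.yaml at the root of your Mage project:

from os import path

from mage_ai.io.bigquery import BigQuery
from mage_ai.io.config import ConfigFileLoader
from mage_ai.settings.repo import get_repo_path

# Load the 'default' profile from io_config.yaml at the project root.
config_path = path.join(get_repo_path(), 'io_config.yaml')
loader = ConfigFileLoader(config_path, 'default')

# A trivial query; misconfigured credentials raise an authentication error here.
df = BigQuery.with_config(loader).load('SELECT 1')
print(df)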

Required permissions

bigquery.datasets.create
bigquery.datasets.get
bigquery.jobs.create
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.tables.create

If you’re running queries against an existing BigQuery dataset, make sure your service account also has the “BigQuery Data Editor” role on that dataset.
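
As a quick sanity check on these permissions, you can run a trivial query directly with the google-cloud-bigquery client. This is a sketch outside of Mage, and the key file path is a placeholder pointing at the same key referenced in io_config.yaml:

from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder path; use the same key file as GOOGLE_SERVICE_ACC_KEY_FILEPATH.
credentials = service_account.Credentials.from_service_account_file(
    '/path/to/your/service/account/key.json',
)
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

# Requires bigquery.jobs.create; a missing permission surfaces as a 403 error.
rows = client.query('SELECT 1').result()
print(list(rows))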

Using SQL blocks

  1. Create a new pipeline or open an existing pipeline.
  2. Add a data loader, transformer, or data exporter block.
  3. Select SQL.
  4. Under the Data provider dropdown, select BigQuery.
  5. Under the Profile dropdown, select default (or the profile you added your credentials under).
  6. Next to the Database label, enter the database name you want this block to save data to.
  7. Next to the Save to schema label, enter the schema name you want this block to save data to.
  8. Under the Write policy dropdown, select Replace or Append (see the SQL blocks guide for more information on write policies).
  9. Enter this test query: SELECT 1.
  10. Run the block.

Using Python blocks

  1. Create a new pipeline or open an existing pipeline.
  2. Add a data loader, transformer, or data exporter block (the code snippet below is for a data loader).
  3. Select Generic (no template).
  4. Enter this code snippet (note: change the config_profile from default if you have a different profile):
from mage_ai.settings.repo import get_repo_path
from mage_ai.io.bigquery import BigQuery
from mage_ai.io.config import ConfigFileLoader
from os import path
from pandas import DataFrame

if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_data_from_big_query(**kwargs) -> DataFrame:
    query = 'SELECT 1'
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    return BigQuery.with_config(ConfigFileLoader(config_path, config_profile)).load(query)
  5. Run the block.
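
Once SELECT 1 runs successfully, you can swap in a query against one of your own tables. The sketch below uses placeholder project, dataset, and table names and a hypothetical function name; only the query changes from the snippet above:

from os import path

from mage_ai.io.bigquery import BigQuery
from mage_ai.io.config import ConfigFileLoader
from mage_ai.settings.repo import get_repo_path
from pandas import DataFrame

if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_table_sample(**kwargs) -> DataFrame:
    # Placeholder identifiers; replace with your own project, dataset, and table.
    query = '''
        SELECT *
        FROM `your-project.your_dataset.your_table_name`
        LIMIT 100
    '''
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    return BigQuery.with_config(ConfigFileLoader(config_path, config_profile)).load(query)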

Export a dataframe

Here is an example code snippet to export a dataframe to BigQuery:

from mage_ai.settings.repo import get_repo_path
from mage_ai.io.bigquery import BigQuery
from mage_ai.io.config import ConfigFileLoader
from pandas import DataFrame
from os import path

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data_to_big_query(df: DataFrame, **kwargs) -> None:
    """
    Template for exporting data to a BigQuery warehouse.
    Specify your configuration settings in 'io_config.yaml'.

    Docs: https://docs.mage.ai/design/data-loading#bigquery
    """
    table_id = 'your-project.your_dataset.your_table_name'
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    BigQuery.with_config(ConfigFileLoader(config_path, config_profile)).export(
        df,
        table_id,
        if_exists='replace',  # Specify resolution policy if table name already exists
        overwrite_types=None, # Specify the column types to overwrite in a dictionary
    )

To overwrite a column type when running a Python export block, specify the column name and type in the overwrite_types dict in the data exporter config. You can find the supported types in this doc. Here is an example:

    BigQuery.with_config(ConfigFileLoader(config_path, config_profile)).export(
        df,
        table_id,
        if_exists='replace',  # Specify resolution policy if table name already exists
        overwrite_types={'col_name': 'STRING'}, # Specify the column types to overwrite
    )
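
The same call can append to an existing table instead of replacing it. This is a sketch assuming the same config_path, config_profile, df, and table_id as above; the if_exists value mirrors the write policies described in the SQL blocks section:

    BigQuery.with_config(ConfigFileLoader(config_path, config_profile)).export(
        df,
        table_id,
        if_exists='append',  # Add rows to the existing table instead of replacing it
    )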
