PostgreSQL
Configuration
To set up the PostgreSQL source, provide the following connection parameters:
Key | Description | Sample Value |
---|---|---|
database | Name of the database you want to extract data from. | demo |
host | Hostname of your PostgreSQL database. | db.bit.io |
password | Password for the PostgreSQL user. | abc123... |
port | Port number where the database is accessible (typically 5432 ). | 5432 |
schema | Schema containing the tables you want to extract. | public |
username | Username with permissions to read and write from the specified schema. | guest |
replication_slot | Logical replication slot name for Change Data Capture (CDC). | mage_slot |
publication_name | Logical replication publication name for CDC. | mage_pub |
Optional Configuration
Key | Description | Sample Value |
---|---|---|
batch_fetch_limit | Number of rows to fetch per batch (default is 50,000). Adjust if your instance has sufficient memory. | 50000 |
Why Use PostgreSQL as a Source in Mage?
- Real-time change data capture (CDC): Capture and sync only changed rows.
- Reliable large-scale data ingestion: Load millions of rows efficiently.
- Flexible schema support: Work with multiple schemas in your PostgreSQL instance.
- Secure and scalable: Support for SSL connections, logical replication slots, and publication management.
Change Data Capture (CDC)
Mage supports 2 types of change data capture with PostgreSQL:
- Batch query
- Log replication
Batch query
Mage will query PostgreSQL in batches using SELECT
, WHERE
, and ORDER BY
statements.
Log replication
Mage supports Change Data Capture (CDC) for PostgreSQL using logical replication, enabling real-time tracking of inserts, updates, and deletes without requiring full table reloads. By reading PostgreSQL’s Write-Ahead Logs (WAL), Mage captures changes efficiently and applies them to the destination system — inserting new records, updating existing ones, or deleting as needed — ensuring low-latency, incremental data synchronization.
Setup in PostgreSQL
-
Enable logical replication in Postgres config
- Local Postgres
-
Open the
postgresql.conf
file. Here is an example location on Mac OSX:/Users/Mage/Library/Application Support/Postgres/var-14/postgresql.conf
. -
Under the settings section, change the value of
wal_level
tological
. The line in yourpostgresql.conf
file should look like this: -
Restart the PostgreSQL service or database. You can do this via the PostgreSQL app or if you’re on Linux, run the following commands:
-
Run the following query in your PostgreSQL database:
SHOW wal_level
. The result should be:wal_level
logical
-
- AWS RDS/Aurora
- Follow this doc to set up PostgreSQL logical replication for an Aurora PostgreSQL DB cluster
- Logical replication for AWS RDS (PostgreSQL) can be set up in a similar way with the Aurora cluster.
- Azure Database for PostgreSQL
-
Azure Database for PostgreSQL supports logical replication for Flexible Server and Single Server deployments.
-
Follow this Azure guide to enable logical replication on Azure PostgreSQL Flexible Server.
-
Make sure the following parameters are set in the Azure portal:
wal_level
=logical
max_replication_slots
≥ 1max_wal_senders
≥ 1
-
Grant replication privileges to your user:
-
Azure also requires that you configure firewall rules to allow your Mage server or IP range to access the PostgreSQL server.
-
For more detailed instructions, see: Set up logical replication on Azure PostgreSQL.
-
- Local Postgres
-
Run the following command in PostgreSQL to create a replication slot:
The replication slot namemage_slot
quoted is used as an example in the guide. The actual replication slot name should be unique for each pipeline source.The result should looking something like this:
pg_create_logical_replication_slot
(mage_slot,0/51A80778)
-
Create a publication for all tables or for 1 specific table using the following commands:
mage_pub
is used in Mage’s codeor for 1 table:
Replacesome_schema
with the schema of the table andsome_table_name
with the name of the table you want to replicate.Run the following command to add or remove a table from publication.
-
Verify that the publication was created successfully by running the following command in PostgreSQL:
The result should looking something like this:
pubname
schemaname
tablename
mage_pub
public
users
-
Grant the database user permission to read the replication slot:
Create data integration pipeline in Mage
Follow this guide to create a data integration pipeline in Mage.
However, choose PostgreSQL as the source and choose LOG_BASED
as the
replication method.
Testing pipeline end-to-end
Once you’ve created the pipeline, add a few rows into your PostgreSQL table that you just created a logical replication for.
You can use the INSERT
command to add rows. For example:
Replace some_schema
with the schema of the table and some_table_name
with the name
of the table you want to replicate.
Change the VALUES
to match the columns in your table.
Verify replication logs being created
Run the following commands in PostgreSQL to check for new logs:
Run sync
After you added a few new rows, create a trigger to start running your pipeline and begin syncing data.
What is PostgreSQL?
PostgreSQL is an open-source, enterprise-grade relational database system. Known for its robustness, extensibility, and SQL compliance, PostgreSQL powers thousands of applications worldwide in both small businesses and large enterprises.
Common use cases include:
- Transactional applications
- Analytical workloads
- Hybrid OLTP + OLAP systems
- Data warehousing
- Business intelligence (BI) and reporting pipelines