Documentation Index
Fetch the complete documentation index at: https://docs.mage.ai/llms.txt
Use this file to discover all available pages before exploring further.
Configuration
Connect securely to an SFTP server and load files into your Mage pipelines by providing the following configuration parameters:| Key | Description | Sample Value |
|---|---|---|
host | Hostname or IP address of the SFTP server. | sftp.example.com |
port | Port number for the SFTP server. | 22 |
username | Username to authenticate with the SFTP server. | myuser |
password | Password for authentication (optional if using private key). | mypassword |
private_key_file | Path to the private key file for key-based authentication (optional). | /home/user/.ssh/id_rsa |
start_date | Earliest file date to synchronize (YYYY-MM-DD format). | 2021-01-28 |
tables | An array defining table configurations for loading files (required). | See table config below |
decryption_config | Configuration for decrypting GPG-encrypted files (optional). | See decryption config below |
Tables Configuration
Define how files are processed and structured into tables:Required Parameters
| Key | Description | Sample Value |
|---|---|---|
table_name | Name assigned to the table (stream). | customer_orders |
search_prefix | Directory path where the target files are located. | /Export/SubFolder |
search_pattern | Regex pattern to match filenames for ingestion. | MyExportData.*\\.zip.gpg$ |
delimiter | Delimiter used to separate fields (default is ,). | , |
Optional Parameters
| Key | Description | Sample Value |
|---|---|---|
key_properties | Array of unique keys for the table. Defaults to ['_sdc_source_file', '_sdc_source_lineno']. | ["_sdc_source_file"] |
encoding | Character encoding of the file (default: utf-8). | utf-8 |
sanitize_header | Clean up header names for SQL compatibility (default: False). | True or False |
skip_rows | Number of header or metadata rows to skip at the top of the file (default: 0). | 0 |
auto_generate_field_names(Pro only) | Automatically generate column names (col_0, col_1, etc.) if headers are missing (default: false). | true or false |
schema_discovery_sample_rows(Pro only) | Number of rows to sample to infer schema (default: 100). Not applicable for encrypted files. | 100 |
Decryption Configuration (Optional)
If your files are encrypted with GPG, specify the following:| Key | Description | Sample Value |
|---|---|---|
SSM_key_name | AWS SSM parameter key name storing the GPG private key passphrase. | SSM_PARAMETER_KEY_NAME |
gnupghome | Path to your .gnupg directory containing GPG keys. | /home/.gnupg |
passphrase | Passphrase for decrypting GPG files. | mysecretpassphrase |
Mage Pro Enhancements 🚀
With Mage Pro, the SFTP source is enhanced for improved performance, usability, and scalability.The following features are available in Mage Pro only:
- Auto-Generate Field Names:
Automatically generate column names (col_0,col_1, …,col_N) when headers are missing by settingauto_generate_field_names: true. - Schema Discovery Sample Rows:
Speed up schema inference by limiting how many rows are sampled withschema_discovery_sample_rows. - Improved Logging:
Enhanced, structured logging for better traceability and debugging. - TableSpec Schema Support:
Define detailed table specifications for schema generation and validation. - Selective Column Extraction:
Write only the selected columns to the schema to reduce size and improve performance. - Better Code Quality and Documentation:
Cleaner, more robust code and fully updated documentation for enterprise use.
What is SFTP?
SFTP (Secure File Transfer Protocol) is a secure method for transferring files over an encrypted SSH connection.It provides encrypted file access, transfer, and management across a reliable, secure channel.
Why Use SFTP with Mage?
- Secure and encrypted: Safely ingest sensitive files over SSH-secured channels.
- Seamless ETL workflows: Automate file ingestion into your data warehouse or lake.
- Support for encrypted files: Easily decrypt GPG-protected files during ingestion.
- High performance: Faster schema discovery, selective column loading, and large file handling.
- Enterprise-ready: Built for scale with advanced logging, security, and performance features.