Kafka
Basic config
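A minimal sketch of a basic consumer config; the key names (`connector_type`, `bootstrap_server`, `topic`, `consumer_group`) are assumptions based on the options discussed later on this page, so adjust them to your pipeline:

```yaml
connector_type: kafka
bootstrap_server: "localhost:9092"     # host:port of a reachable broker
topic: my_topic                        # hypothetical topic name
consumer_group: unique_consumer_group  # any unique group id
include_metadata: false                # see "Data format" below
```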
Offset configuration
Available in versions >= 0.9.61
The `offset` config allows you to reset the offset when starting the streaming pipeline. If using the `offset` config, the `partitions` config is also required.
The `offset` config accepts four values:
- `beginning`
- `end`
- `int`
- `timestamp`
Both `beginning` and `end` set the consumer to consume data from the beginning or the end of the queue, respectively. Neither requires an `offset_value`.
`int` sets the consumer to consume data from the given offset value inside the queue. This value corresponds to the numeric position inside each partition.
`timestamp` sets the consumer to consume data from the given timestamp inside the queue. This value corresponds to the timestamp of the message, in milliseconds since the Unix epoch.
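As a sketch, resetting the consumer position might look like the following; the partition list and offset values are illustrative, and pairing `offset_value` with `int` and `timestamp` is inferred from the descriptions above:

```yaml
# Start from the beginning of partition 0.
offset: beginning
partitions: [0]

# Or resume from a numeric position within each listed partition:
# offset: int
# offset_value: 1000

# Or resume from a point in time (milliseconds since the Unix epoch):
# offset: timestamp
# offset_value: 1672531200000
```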
SSL authentication
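A sketch of an SSL setup, assuming an `ssl_config` block with standard PEM file entries (the key names and file names are assumptions):

```yaml
security_protocol: "SSL"
ssl_config:
  cafile: "CARoot.pem"         # CA certificate
  certfile: "certificate.pem"  # client certificate
  keyfile: "key.pem"           # client private key
  password: password           # private key password, if any
  check_hostname: true
```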
SASL authentication
`SASL_PLAINTEXT` config:
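A sketch, assuming a `sasl_config` block carrying the mechanism and credentials (key names are assumptions):

```yaml
security_protocol: "SASL_PLAINTEXT"
sasl_config:
  mechanism: "PLAIN"
  username: username
  password: password
```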
`SASL_SSL` config:
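The same sketch over TLS; the `ssl_config` block mirrors the SSL section above:

```yaml
security_protocol: "SASL_SSL"
sasl_config:
  mechanism: "PLAIN"
  username: username
  password: password
ssl_config:
  cafile: "CARoot.pem"  # CA certificate, as in the SSL section
```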
Data format
By default, if `include_metadata` is false, the Kafka data loader returns the data from the message's value field, e.g.
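For instance, a message whose value is the JSON object below would be returned as just that payload (the shape is illustrative):

```json
{"key": "value"}
```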
Kafka supports structuring and partitioning your data.
If `include_metadata` is set to true, the Kafka data loader returns each message with `data` set to the `{key: value}` payload and `metadata` containing the topic, partition, offset, and time, e.g.
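An illustrative shape (all field values are made up):

```json
{
  "data": {"key": "value"},
  "metadata": {
    "topic": "my_topic",
    "partition": 0,
    "offset": 123,
    "time": 1672531200000
  }
}
```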
Deserialize message with protobuf schema
- Specify the `serialization_method` as `PROTOBUF`.
- Set the `schema_classpath` to the path to the Python schema class. Test whether you have access to the schema with the code in a scratchpad, as sketched after this list.
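A sketch of the serde config, assuming a `serde_config` block holding the two keys named above (the classpath is hypothetical):

```yaml
serde_config:
  serialization_method: PROTOBUF
  schema_classpath: "path.to.schema.SchemaClass"  # hypothetical classpath
```

And a minimal scratchpad check that the schema class is importable; the module and class names are hypothetical and should match your generated Protobuf code:

```python
# If this import fails, the schema_classpath above will fail too.
from path.to.schema import SchemaClass  # hypothetical generated Protobuf class

message = SchemaClass()  # instantiating proves the class is usable
print(type(message))     # should print the message class, not raise
```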
Pass raw message to transformer
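Presumably this is done by choosing a pass-through serialization method instead of a schema-based one; the method name below is an assumption:

```yaml
serde_config:
  serialization_method: RAW_VALUE  # assumption: pass the raw message value through
```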
Deserialize message with Avro schema in Confluent schema registry
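A sketch, assuming the serde config also accepts Confluent Schema Registry connection details (the key names and URL are assumptions):

```yaml
serde_config:
  serialization_method: AVRO
  schema_registry_url: "https://schema-registry.example.com"  # hypothetical URL
  schema_registry_username: username
  schema_registry_password: password
```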
API version
If you are using newer versions of the Kafka brokers, consider setting the corresponding Kafka `api_version`.
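For example (the version string is illustrative; match it to your broker):

```yaml
api_version: "2.8.1"  # hypothetical broker version
```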