Class KafkaIngester

java.lang.Object
io.deephaven.kafka.ingest.KafkaIngester

@ScriptApi
public class KafkaIngester
extends Object
An intraday ingester that replicates an Apache Kafka topic to a Deephaven Data Import Server.

The KafkaIngester is designed to work within a Live Query or Merge Server that has already created a DataImportServer. Each KafkaIngester is assigned a topic and a subset of its Kafka partitions, and each Kafka partition is mapped to a Deephaven internal partition. The column partition can be set through the constructor, or defaults to DBTimeUtils.currentDateNy().

Automatic partition assignment and rebalancing are not supported, as the Deephaven checkpoint records store the Kafka partition and offset information on disk. Each Kafka ingester instance must uniquely control its checkpoint record, which is incompatible with rebalancing.

If no Deephaven checkpoint record exists for the partition, then an attempt is made to restore the most recently committed offset from the Kafka broker for the given Kafka partition.
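To make the lifecycle concrete, the following sketch wires up an ingester using the simplest constructor below. The broker address, namespace, table, and topic names are hypothetical, and the logger, DataImportServer, and adapter factory are assumed to be supplied by the surrounding server context:

    import java.util.Properties;
    import java.util.function.Function;

    // Assumed to be provided by the surrounding Merge Server context:
    //   com.fishlib.io.logger.Logger log;
    //   DataImportServer dataImportServer;
    //   Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory;

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "kafka-broker:9092"); // hypothetical broker
    props.setProperty("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
    props.setProperty("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

    // Replicate all partitions of the "Orders" topic to MarketUS.Orders,
    // using the default column partition of DBTimeUtils.currentDateNy().
    KafkaIngester ingester = new KafkaIngester(
            log, dataImportServer, props, "MarketUS", "Orders", "Orders", adapterFactory);

Ingestion begins only once start() is called; see Method Details below.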


  • Constructor Details

    • KafkaIngester

      public KafkaIngester(com.fishlib.io.logger.Logger log, DataImportServer dataImportServer, Properties props, String namespace, String tableName, String topic, Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory)
      Creates a Kafka ingester for all partitions of a given topic and a Deephaven column partition of DBTimeUtils.currentDateNy().
      Parameters:
      log - a log for output
      dataImportServer - the DataImportServer instance to attach our stream processor to
      props - the properties used to create the KafkaConsumer
      namespace - the namespace of the table to replicate to
      tableName - the name of the table to replicate to
      topic - the topic to replicate
      adapterFactory - a function that, given the TableWriter for an internal partition, produces a suitable ConsumerRecordToTableWriterAdapter that transforms records consumed from the Kafka topic into Deephaven rows
    • KafkaIngester

      public KafkaIngester(com.fishlib.io.logger.Logger log, DataImportServer dataImportServer, Properties props, String namespace, String tableName, String topic, String columnPartition, Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory)
      Creates a Kafka ingester for all partitions of a given topic.
      Parameters:
      log - a log for output
      dataImportServer - the DataImportServer instance to attach our stream processor to
      props - the properties used to create the KafkaConsumer
      namespace - the namespace of the table to replicate to
      tableName - the name of the table to replicate to
      topic - the topic to replicate
      columnPartition - the column partition to replicate to
      adapterFactory - a function that, given the TableWriter for an internal partition, produces a suitable ConsumerRecordToTableWriterAdapter that transforms records consumed from the Kafka topic into Deephaven rows
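      For example (the partition value and table coordinates are hypothetical), a fixed column partition can be supplied in place of the current date:

        // Replicate into an explicit column partition instead of
        // DBTimeUtils.currentDateNy(); the date value is hypothetical.
        KafkaIngester ingester = new KafkaIngester(
                log, dataImportServer, props, "MarketUS", "Orders", "Orders",
                "2023-01-17", adapterFactory);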
    • KafkaIngester

      public KafkaIngester(com.fishlib.io.logger.Logger log, DataImportServer dataImportServer, Properties props, String namespace, String tableName, String topic, IntPredicate partitionFilter, Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory)
      Creates a Kafka ingester for a given topic with a Deephaven column partition of DBTimeUtils.currentDateNy().
      Parameters:
      log - a log for output
      dataImportServer - the DataImportServer instance to attach our stream processor to
      props - the properties used to create the KafkaConsumer
      namespace - the namespace of the table to replicate to
      tableName - the name of the table to replicate to
      topic - the topic to replicate
      partitionFilter - a predicate indicating which Kafka partitions should be replicated
      adapterFactory - a function that, given the TableWriter for an internal partition, produces a suitable ConsumerRecordToTableWriterAdapter that transforms records consumed from the Kafka topic into Deephaven rows
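      Because rebalancing is unsupported, a partition filter is the way to split a topic across multiple ingesters. As a sketch (topic and table names hypothetical), one ingester might claim the low-numbered partitions:

        import java.util.function.IntPredicate;

        // Replicate only Kafka partitions 0 through 3; a second ingester with
        // its own checkpoint records could be given the complementary filter.
        IntPredicate lowPartitions = partition -> partition < 4;

        KafkaIngester ingester = new KafkaIngester(
                log, dataImportServer, props, "MarketUS", "Orders", "Orders",
                lowPartitions, adapterFactory);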
    • KafkaIngester

      public KafkaIngester(com.fishlib.io.logger.Logger log, DataImportServer dataImportServer, Properties props, String namespace, String tableName, String topic, IntPredicate partitionFilter, String columnPartition, Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory)
      Creates a Kafka ingester for the given topic.
      Parameters:
      log - a log for output
      dataImportServer - the DataImportServer instance to attach our stream processor to
      props - the properties used to create the KafkaConsumer
      namespace - the namespace of the table to replicate to
      tableName - the name of the table to replicate to
      topic - the topic to replicate
      partitionFilter - a predicate indicating which Kafka partitions should be replicated
      columnPartition - the column partition to replicate to
      adapterFactory - a function that, given the TableWriter for an internal partition, produces a suitable ConsumerRecordToTableWriterAdapter that transforms records consumed from the Kafka topic into Deephaven rows
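      Combining the two preceding options (values again hypothetical):

        // An explicit partition filter together with a fixed column partition.
        KafkaIngester ingester = new KafkaIngester(
                log, dataImportServer, props, "MarketUS", "Orders", "Orders",
                partition -> partition < 4, "2023-01-17", adapterFactory);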
    • KafkaIngester

      public KafkaIngester(com.fishlib.io.logger.Logger log, DataImportServer dataImportServer, Properties props, String namespace, String tableName, String topic, IntPredicate partitionFilter, String columnPartition, Function<TableWriter, ConsumerRecordToTableWriterAdapter> adapterFactory, UnaryOperator<String> resumeFrom)
      Creates a Kafka ingester for the given topic.
      Parameters:
      log - a log for output
      dataImportServer - the DataImportServer instance to attach our stream processor to
      props - the properties used to create the KafkaConsumer
      namespace - the namespace of the table to replicate to
      tableName - the name of the table to replicate to
      topic - the topic to replicate
      partitionFilter - a predicate indicating which Kafka partitions should be replicated
      columnPartition - the column partition to replicate to
      adapterFactory - a function that, given the TableWriter for an internal partition, produces a suitable ConsumerRecordToTableWriterAdapter that transforms records consumed from the Kafka topic into Deephaven rows
      resumeFrom - a function that, given a column partition value, returns the prior column partition whose checkpoint record should be read for resuming when this ingester has no checkpoint record of its own
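      As a sketch, assuming the column partition values are ISO-8601 dates, a resumeFrom function might simply name the previous day, so that a new day's ingester resumes from the offsets committed in yesterday's checkpoint record:

        import java.time.LocalDate;
        import java.util.function.UnaryOperator;

        // Assuming date-valued column partitions such as "2023-01-17": when no
        // checkpoint record exists for today, resume from yesterday's record.
        UnaryOperator<String> resumeFrom =
                partition -> LocalDate.parse(partition).minusDays(1).toString();

        KafkaIngester ingester = new KafkaIngester(
                log, dataImportServer, props, "MarketUS", "Orders", "Orders",
                partition -> partition < 4, "2023-01-17", adapterFactory, resumeFrom);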
  • Method Details

    • start

      @ScriptApi public void start()
      Starts a consumer thread that replicates consumed Kafka messages to Deephaven.

      This method must not be called more than once on an ingester instance.
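      Continuing the construction sketch from the class description, starting ingestion is a single call:

        ingester.start(); // spawns the consumer thread; call exactly once per instance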