Class RowBatch

java.lang.Object
com.illumon.iris.db.tables.dataimport.RowBatch

public final class RowBatch
extends Object
Data structure to accumulate batches of binary row data in per-column arrays for ingestion.

The various *Column.read() methods may assume that they use the same data buffer, with the same contents, for an entire batch of rows. The batch is always flushed before changing buffers, compacting data, or refilling the buffer.

The design goal is to minimize virtual dispatch, allocation, and overhead while allowing for vectorizable transformations and bulk inserts.

  • Constructor Details

    • RowBatch

      public RowBatch(int capacity, @NotNull BinaryStoreV2HeaderInfo headerInfo)
      Construct a row batch for the supplied capacity and header info.
      Parameters:
      capacity - The capacity
      headerInfo - The header info
  • Method Details

    • isCompatible

      public boolean isCompatible(int capacity, @NotNull BinaryStoreV2HeaderInfo headerInfo)
      Test whether this row batch can support the supplied capacity and headerInfo.
      Parameters:
      capacity - The capacity
      headerInfo - The header info
      Returns:
      Whether this row batch can support the desired inputs
    • getColumn

      public <CT> CT getColumn(@NotNull String name, @NotNull Class<CT> clazz)
      Get a column for use by the row buffer reader or processor.
      Parameters:
      name - The column name
      clazz - The expected column class
      Returns:
      The column
    • getColumns

      public Collection<String> getColumns()
      Get a collection of all the column names. This is used for error reporting.
      Returns:
      All known column names
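
The typed lookup described by getColumn and the error-reporting role of getColumns can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the real RowBatch: TypedColumnLookupSketch, IntColumnSketch, LongColumnSketch, and addColumn are invented for illustration, and the exceptions thrown for an unknown name or a mismatched class are assumptions, since the documentation does not specify them.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the name/class lookup; not the real RowBatch.
class TypedColumnLookupSketch {

    // Hypothetical column types, each backed by a primitive per-column array.
    static final class IntColumnSketch {
        final int[] values;
        IntColumnSketch(int capacity) { values = new int[capacity]; }
    }

    static final class LongColumnSketch {
        final long[] values;
        LongColumnSketch(int capacity) { values = new long[capacity]; }
    }

    // Insertion-ordered so that error messages list columns predictably.
    private final Map<String, Object> columns = new LinkedHashMap<>();

    // Hypothetical registration helper; the real class builds columns from header info.
    void addColumn(String name, Object column) {
        columns.put(name, column);
    }

    // Look up a column by name and return it as the caller's expected class.
    <CT> CT getColumn(String name, Class<CT> clazz) {
        Object column = columns.get(name);
        if (column == null) {
            // Assumption: an unknown name is reported together with all known names.
            throw new IllegalArgumentException(
                    "Unknown column " + name + "; known columns: " + getColumns());
        }
        // Class.cast throws ClassCastException if the stored column is not a CT.
        return clazz.cast(column);
    }

    Collection<String> getColumns() {
        return columns.keySet();
    }

    public static void main(String[] args) {
        TypedColumnLookupSketch batch = new TypedColumnLookupSketch();
        batch.addColumn("Price", new IntColumnSketch(8));
        IntColumnSketch price = batch.getColumn("Price", IntColumnSketch.class);
        System.out.println("Price capacity: " + price.values.length);
    }
}
```

Passing the expected class lets the caller obtain a concretely typed column once, outside the per-row loop, which fits the stated goal of minimizing virtual dispatch.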
    • setDataBuffer

      public void setDataBuffer(ByteBuffer dataBuffer)
      Advise this row batch of the incoming data buffer to be used in subsequent calls to readRow(). This row batch should record this information, and may need to clear cached objects (e.g. duplicate buffers) linked to the previous buffer. This is only valid on an "empty" row batch (i.e. when size() is 0). Passing null allows the current data buffer to be forgotten, but a new one must be set before any subsequent read.
      Parameters:
      dataBuffer - The data buffer
    • size

      public int size()
      Get the size of the current row batch. This is the number of complete rows successfully read using readRow() since the last reset().
      Returns:
      The size of the current row batch
    • empty

      public boolean empty()
      Check whether this row batch is empty, i.e. it has not read any rows since the last reset().
      Returns:
      Whether this row batch is empty
    • full

      public boolean full()
      Check whether this row batch is full, i.e. it cannot read any more rows until it is reset.
      Returns:
      Whether this row batch is full
    • readRow

      public void readRow()
      Read a row from the last data buffer set, which should be positioned to begin reading the presence bitmap.
    • reset

      public void reset()
      Reset for the next batch, after consuming data from the RowBatch.Columns.
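
The setDataBuffer / readRow / full / reset cycle described above can be sketched as a fill-and-flush loop. This is a simplified illustration under stated assumptions, not the real implementation: IntRowBatchSketch is a hypothetical single-column batch in which each row is one 4-byte int with no presence bitmap, the flush step simply sums the accumulated values, and the IllegalStateException for changing buffers on a non-empty batch is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical fill-and-flush loop; not the real RowBatch or its row format.
class BatchLifecycleSketch {

    // Hypothetical single-column batch: each "row" is one int, no presence bitmap.
    static final class IntRowBatchSketch {
        private final int[] values;     // per-column array sized to the capacity
        private ByteBuffer dataBuffer;  // buffer that readRow() consumes from
        private int size;               // complete rows read since the last reset()

        IntRowBatchSketch(int capacity) {
            values = new int[capacity];
        }

        void setDataBuffer(ByteBuffer dataBuffer) {
            // Assumption: the "only valid when empty" rule is enforced by throwing.
            if (size != 0) {
                throw new IllegalStateException("Data buffer may only change while the batch is empty");
            }
            this.dataBuffer = dataBuffer;
        }

        int size() { return size; }
        boolean empty() { return size == 0; }
        boolean full() { return size == values.length; }

        void readRow() {
            values[size] = dataBuffer.getInt();
            size++; // count the row only after it has been read completely
        }

        void reset() { size = 0; }
    }

    // Consume the accumulated rows in bulk, then reset the batch for reuse.
    static long flush(IntRowBatchSketch batch) {
        long sum = 0;
        for (int i = 0; i < batch.size(); i++) {
            sum += batch.values[i];
        }
        batch.reset();
        return sum;
    }

    // Read every row in the buffer, flushing whenever the batch fills up.
    static long ingest(ByteBuffer data, int capacity) {
        IntRowBatchSketch batch = new IntRowBatchSketch(capacity);
        batch.setDataBuffer(data);
        long total = 0;
        while (data.remaining() >= Integer.BYTES) {
            batch.readRow();
            if (batch.full()) {
                total += flush(batch);
            }
        }
        if (!batch.empty()) {
            total += flush(batch); // flush the final partial batch
        }
        batch.setDataBuffer(null); // forget the buffer; legal because the batch is empty
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.allocate(5 * Integer.BYTES);
        for (int i = 1; i <= 5; i++) {
            data.putInt(i);
        }
        data.flip();
        System.out.println("Sum of ingested values: " + ingest(data, 2));
    }
}
```

Because the batch is flushed and reset before the buffer is released or replaced, the per-column read logic can rely on a single, unchanged buffer for every row in the batch.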
    • readPosition

      public int readPosition()
      Get the read position, which is the index of the first row that should be read in the next operation to consume data from this row batch.
      Returns:
      The read position
    • readLimit

      public int readLimit()
      Get the read limit, which is the index of the first row that should not be read in the next operation to consume data from this row batch.
      Returns:
      The read limit
    • readLength

      public int readLength()
      Get the number of rows available to read, which is readLimit() - readPosition().
      Returns:
      The read length
    • setReadBoundaries

      public void setReadBoundaries(int readPosition, int readLimit)
      Set the read position and read limit. The read position is the index of the first row that should be read, and the read limit is the index of the first row that should not be read.

      Callers must ensure that readPosition <= readLimit <= size().

      Parameters:
      readPosition - The read position
      readLimit - The read limit
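
The half-open read window defined by readPosition, readLimit, and readLength can be illustrated with a minimal sketch. ReadWindowSketch is a hypothetical stand-in, not the real RowBatch: the documentation only states the invariant readPosition <= readLimit <= size, so the initial window covering all rows, the non-negative check on readPosition, and the IllegalArgumentException on violation are assumptions.

```java
// Hypothetical model of the read window; not the real RowBatch.
class ReadWindowSketch {
    private final int size;    // rows currently held by the batch
    private int readPosition;  // index of the first row that should be read
    private int readLimit;     // index of the first row that should NOT be read

    ReadWindowSketch(int size) {
        this.size = size;
        // Assumption: the window initially covers every row in the batch.
        this.readPosition = 0;
        this.readLimit = size;
    }

    void setReadBoundaries(int readPosition, int readLimit) {
        // Documented invariant: readPosition <= readLimit <= size.
        // Assumption: a negative position is also rejected, and violations throw.
        if (readPosition < 0 || readPosition > readLimit || readLimit > size) {
            throw new IllegalArgumentException(
                    "Require 0 <= readPosition <= readLimit <= size, got readPosition="
                            + readPosition + ", readLimit=" + readLimit + ", size=" + size);
        }
        this.readPosition = readPosition;
        this.readLimit = readLimit;
    }

    int readPosition() { return readPosition; }
    int readLimit() { return readLimit; }
    int readLength() { return readLimit - readPosition; }

    public static void main(String[] args) {
        ReadWindowSketch window = new ReadWindowSketch(10);
        window.setReadBoundaries(3, 7);
        // Consumers iterate the half-open range [readPosition, readLimit).
        for (int row = window.readPosition(); row < window.readLimit(); row++) {
            System.out.println("consume row " + row);
        }
    }
}
```

Treating the limit as exclusive means an empty window is simply readPosition == readLimit, and readLength() needs no off-by-one adjustment.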
    • columnStateless

      public static boolean columnStateless(@NotNull SupportedType supportedType)
      Check whether this SupportedType is guaranteed to have a stateless RowBatch.Column that is thus safe for concurrent use.
      Parameters:
      supportedType - The SupportedType
      Returns:
      Whether the appropriate RowBatch.Column is guaranteed to be stateless and safe for concurrent use
    • getColumnClass

      public static Class<?> getColumnClass(@NotNull SupportedType supportedType)
      Get the class (or an upper bound) of RowBatch.Column implementations for a SupportedType.
      Parameters:
      supportedType - The SupportedType
      Returns:
      The appropriate class