Class RowBatch
java.lang.Object
com.illumon.iris.db.tables.dataimport.RowBatch
public final class RowBatch extends Object
Data structure to accumulate batches of binary row data in per-column arrays for ingestion.
Note that the various *Column.read() methods are allowed to assume that they will always use the same data buffer with the same contents for an entire batch of rows. We always flush before changing buffers, compacting data, or re-filling the buffer.
Note that the design goal here is to minimize virtual dispatch, allocation, and overhead while allowing for vectorizable transformations and bulk inserts.
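The per-column layout described above can be illustrated with a minimal sketch. MiniBatch and its two columns are hypothetical stand-ins, not the RowBatch implementation, and the row format (an int followed by a double, with no presence bitmap) is an assumption made to keep the example short.

```java
import java.nio.ByteBuffer;

/** Hypothetical miniature of a columnar row batch; not the RowBatch implementation. */
final class MiniBatch {
    private final int[] ids;        // one primitive array per column
    private final double[] prices;
    private ByteBuffer data;        // the single buffer used for a whole batch
    private int size;

    MiniBatch(int capacity) {
        ids = new int[capacity];
        prices = new double[capacity];
    }

    /** Accept a new buffer only when no unflushed rows remain. */
    void setDataBuffer(ByteBuffer buffer) {
        if (size != 0) {
            throw new IllegalStateException("flush before changing buffers");
        }
        data = buffer;
    }

    boolean full() { return size == ids.length; }

    int size() { return size; }

    /** Decode one row (assumed layout: int, then double) into the column arrays. */
    void readRow() {
        ids[size] = data.getInt();
        prices[size] = data.getDouble();
        size++;
    }

    /** Bulk pass over a primitive array: a plain loop with no per-row dispatch. */
    double sumPrices() {
        double sum = 0;
        for (int i = 0; i < size; i++) {
            sum += prices[i];
        }
        return sum;
    }

    void reset() { size = 0; }
}
```

Storing each column in its own primitive array means a consumer can process a whole batch with simple loops, with no object allocated per row and no virtual call per value.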
-
Nested Class Summary
Nested Classes
Modifier and Type, Class
    Description
class RowBatch.BlobColumn
    RowBatch.RawColumn subclass that represents raw byte data presented as ByteBuffer instances.
class RowBatch.BooleanColumn
class RowBatch.ByteColumn
    RowBatch.Column subclass that decodes and stores primitive bytes.
class RowBatch.CharColumn
    RowBatch.Column subclass that decodes and stores primitive chars.
class RowBatch.Column
class RowBatch.DoubleColumn
    RowBatch.Column subclass that decodes and stores primitive doubles.
class RowBatch.EnhancedStringColumn
class RowBatch.EnumColumn
    RowBatch.Column subclass that decodes and stores enumerated String columns as primitive int ordinals.
class RowBatch.FloatColumn
    RowBatch.Column subclass that decodes and stores primitive floats.
class RowBatch.IntColumn
    RowBatch.Column subclass that decodes and stores primitive ints.
class RowBatch.LongColumn
    RowBatch.Column subclass that decodes and stores primitive longs.
class RowBatch.ShortColumn
    RowBatch.Column subclass that decodes and stores primitive shorts.
static interface RowBatch.StringColumnAccessor
    Common interface for RowBatch.EnhancedStringColumn and RowBatch.EnumColumn to allow for runtime polymorphism based on binary store header info.
-
Constructor Summary
Constructors
Constructor
    Description
RowBatch(int capacity, BinaryStoreV2HeaderInfo headerInfo)
    Make a row batch for the supplied header info and capacity.
-
Method Summary
Modifier and Type, Method
    Description
static boolean columnStateless(SupportedType supportedType)
    Check whether this SupportedType is guaranteed to have a stateless RowBatch.Column that is thus safe for concurrent use.
boolean empty()
    Check if this row batch is empty and hasn't read any rows.
boolean full()
    Check if this row batch is full and can't read any more rows.
<CT> CT getColumn(String name, Class<CT> clazz)
    Get a column for use by the row buffer reader or processor.
static Class<?> getColumnClass(SupportedType supportedType)
    Get the class (or an upper bound) of RowBatch.Column implementations for a SupportedType.
Collection<String> getColumns()
    Get a collection of all the column names.
boolean isCompatible(int capacity, BinaryStoreV2HeaderInfo headerInfo)
    Test whether this row batch can support the supplied capacity and headerInfo.
int readLength()
    Get the length available to read, which is readLimit() - readPosition().
int readLimit()
    Get the read limit, which is the index of the first row that should not be read in the next operation to consume data from this row batch.
int readPosition()
    Get the read position, which is the index of the first row that should be read in the next operation to consume data from this row batch.
void readRow()
    Read a row from the last data buffer set, which should be positioned to begin reading the presence bitmap.
void reset()
    Reset for the next batch, after consuming data from the RowBatch.Columns.
void setDataBuffer(ByteBuffer dataBuffer)
    Advise this row batch of the incoming data buffer to be used in subsequent calls to readRow().
void setReadBoundaries(int readPosition, int readLimit)
    Set the read position and read limit.
int size()
    Get the size of the current row batch.
-
Constructor Details
-
RowBatch
Make a row batch for the supplied header info and capacity.
Parameters:
capacity - The capacity
headerInfo - The header info
-
-
Method Details
-
isCompatible
Test whether this row batch can support the supplied capacity and headerInfo.
Parameters:
capacity - The capacity
headerInfo - The header info
Returns:
Whether this row batch can support the desired inputs
-
getColumn
Get a column for use by the row buffer reader or processor.
Parameters:
name - The column name
clazz - The expected column class
Returns:
The column
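A lookup with the same shape as getColumn(String name, Class<CT> clazz) can be sketched with Class.cast. ColumnRegistrySketch, its nested column classes, and the exceptions it throws are hypothetical; this page does not state how RowBatch reports an unknown name or a mismatched class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical typed column lookup; not the RowBatch implementation. */
final class ColumnRegistrySketch {
    static class Column {}

    static final class IntColumn extends Column {
        final int[] values = new int[4];
    }

    static final class LongColumn extends Column {
        final long[] values = new long[4];
    }

    private final Map<String, Column> columns = new LinkedHashMap<>();

    void register(String name, Column column) {
        columns.put(name, column);
    }

    /** Return the named column as the requested type, so callers avoid unchecked casts. */
    <CT> CT getColumn(String name, Class<CT> clazz) {
        Column column = columns.get(name);
        if (column == null) {
            // Assumption: list the known names in the error, in the spirit of getColumns().
            throw new IllegalArgumentException(
                    "Unknown column " + name + "; known columns: " + columns.keySet());
        }
        // Class.cast throws ClassCastException if the stored column has a different type.
        return clazz.cast(column);
    }
}
```

Passing the expected class lets a reader or processor obtain a concrete column type once, up front, and then work with its primitive array directly.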
-
getColumns
Get a collection of all the column names. This is used for error reporting.
Returns:
all known column names
-
setDataBuffer
Advise this row batch of the incoming data buffer to be used in subsequent calls to readRow(). This row batch should record this information, and may need to clear cached objects (e.g. duplicate buffers) linked to the buffer. This is only valid on an "empty" row batch (i.e. when size is 0). Invoking with null allows the current data buffer to be forgotten, but a new one must be set before subsequent reads.
Parameters:
dataBuffer - The data buffer
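The buffer protocol above, combined with size(), full(), and reset(), suggests a fill-and-flush loop along the following lines. This is only a sketch: LongBatch is a hypothetical single-column stand-in that reuses the RowBatch method names, each row is assumed to be one 8-byte long, and "consuming" a batch is reduced to recording its size.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class BatchLoopSketch {
    /** Hypothetical stand-in that borrows the RowBatch method names. */
    static final class LongBatch {
        private final long[] values;
        private ByteBuffer data;
        private int size;

        LongBatch(int capacity) { values = new long[capacity]; }

        void setDataBuffer(ByteBuffer buffer) {
            if (size != 0) {
                throw new IllegalStateException("only valid on an empty batch");
            }
            data = buffer; // null forgets the current buffer
        }

        void readRow() { values[size++] = data.getLong(); }

        int size() { return size; }

        boolean empty() { return size == 0; }

        boolean full() { return size == values.length; }

        void reset() { size = 0; }
    }

    /** Read every complete row from the buffer, flushing whenever the batch fills. */
    static List<Integer> drain(ByteBuffer buffer, LongBatch batch) {
        List<Integer> flushedSizes = new ArrayList<>();
        batch.setDataBuffer(buffer);
        while (buffer.remaining() >= Long.BYTES) {
            batch.readRow();
            if (batch.full()) {
                flushedSizes.add(batch.size()); // a real consumer would process the columns here
                batch.reset();
            }
        }
        if (!batch.empty()) {
            flushedSizes.add(batch.size());     // flush the partial batch before the buffer changes
            batch.reset();
        }
        batch.setDataBuffer(null);              // a new buffer must be set before further reads
        return flushedSizes;
    }
}
```

With five rows in the buffer and a capacity of two, drain flushes batches of 2, 2, and 1 rows, and the batch is empty each time the buffer reference changes.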
-
size
public int size()
Get the size of the current row batch. This is the number of complete rows successfully read using readRow() since the last reset().
Returns:
The size of the current row batch
-
empty
public boolean empty()
Check if this row batch is empty and hasn't read any rows.
Returns:
Whether this row batch is empty
-
full
public boolean full()
Check if this row batch is full and can't read any more rows.
Returns:
Whether this row batch is full
-
readRow
public void readRow()
Read a row from the last data buffer set, which should be positioned to begin reading the presence bitmap.
-
reset
public void reset()
Reset for the next batch, after consuming data from the RowBatch.Columns.
-
readPosition
public int readPosition()
Get the read position, which is the index of the first row that should be read in the next operation to consume data from this row batch.
Returns:
The read position
-
readLimit
public int readLimit()
Get the read limit, which is the index of the first row that should not be read in the next operation to consume data from this row batch.
Returns:
The read limit
-
readLength
public int readLength()
Get the length available to read, which is readLimit() - readPosition().
Returns:
The read length
-
setReadBoundaries
public void setReadBoundaries(int readPosition, int readLimit)
Set the read position and read limit. The read position is the index of the first row that should be read, and the read limit is the index of the first row that should not be read.
It's required that readPosition <= readLimit <= size.
Parameters:
readPosition - The read position
readLimit - The read limit
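The read position and read limit form a half-open window [readPosition, readLimit) over the rows in the batch. The sketch below illustrates that window; ReadWindowSketch is hypothetical, and both the lower bound of 0 and the choice to throw on invalid boundaries are assumptions, since this page states only the required ordering.

```java
/** Hypothetical model of the read window; not the RowBatch implementation. */
final class ReadWindowSketch {
    private final int size;
    private int readPosition;
    private int readLimit;

    ReadWindowSketch(int size) {
        this.size = size;
        this.readLimit = size; // assumption: the initial window spans every row
    }

    void setReadBoundaries(int readPosition, int readLimit) {
        // Documented requirement: readPosition <= readLimit <= size.
        // The lower bound of 0 is an added assumption.
        if (readPosition < 0 || readPosition > readLimit || readLimit > size) {
            throw new IllegalArgumentException(
                    "need 0 <= readPosition <= readLimit <= size, got "
                            + readPosition + ", " + readLimit + ", " + size);
        }
        this.readPosition = readPosition;
        this.readLimit = readLimit;
    }

    int readPosition() { return readPosition; }

    int readLimit() { return readLimit; }

    int readLength() { return readLimit - readPosition; }

    /** Consume only the rows inside the window; readLimit itself is excluded. */
    long sum(int[] column) {
        long total = 0;
        for (int row = readPosition; row < readLimit; row++) {
            total += column[row];
        }
        return total;
    }
}
```

Because the limit is exclusive, readLength() is a plain subtraction, and an empty window (readPosition == readLimit) needs no special case.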
-
columnStateless
Check whether this SupportedType is guaranteed to have a stateless RowBatch.Column that is thus safe for concurrent use.
Parameters:
supportedType - The SupportedType
Returns:
Whether the appropriate RowBatch.Column is guaranteed to be stateless and safe for concurrent use
-
getColumnClass
Get the class (or an upper bound) of RowBatch.Column implementations for a SupportedType.
Parameters:
supportedType - The SupportedType
Returns:
The appropriate class
-