Class RowBatch
java.lang.Object
com.illumon.iris.db.tables.dataimport.RowBatch
Data structure to accumulate batches of binary row data in per-column arrays for ingestion.
Note that the various *Column.read() methods are allowed to assume that they will always use the same data buffer with the same contents for an entire batch of rows. We always flush before changing buffers, compacting data, or re-filling the buffer.
Note that the design goal here is to minimize virtual dispatch, allocation, and overhead while allowing for vectorizable transformations and bulk inserts.
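The accumulate-then-flush cycle described above can be sketched with a small stand-in class. Everything here is an assumption made for illustration: MiniRowBatch, its two fixed columns, the 12-byte row layout, and sumPrices() are hypothetical and are not part of the documented RowBatch API, which builds its columns from a BinaryStoreV2HeaderInfo and starts each row with a presence bitmap.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for illustration only; this is not the real RowBatch. */
final class MiniRowBatch {
    private final int[] ids;        // one primitive array per column
    private final double[] prices;
    private ByteBuffer dataBuffer;  // buffer that readRow() decodes from
    private int size;               // complete rows read since the last reset()

    MiniRowBatch(int capacity) {
        ids = new int[capacity];
        prices = new double[capacity];
    }

    /** Only legal while the batch is empty, mirroring the documented rule. */
    void setDataBuffer(ByteBuffer dataBuffer) {
        if (size != 0) {
            throw new IllegalStateException("flush and reset before changing buffers");
        }
        this.dataBuffer = dataBuffer;
    }

    /** Decode one (int, double) row from the current buffer position. */
    void readRow() {
        if (full()) {
            throw new IllegalStateException("batch is full");
        }
        int id = dataBuffer.getInt();
        double price = dataBuffer.getDouble();
        ids[size] = id;
        prices[size] = price;
        size++; // count the row only after every column has been decoded
    }

    int size() { return size; }
    boolean empty() { return size == 0; }
    boolean full() { return size == ids.length; }
    void reset() { size = 0; }

    /** Bulk consumer: a tight loop over a single primitive array. */
    double sumPrices() {
        double sum = 0;
        for (int i = 0; i < size; i++) {
            sum += prices[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(36); // three 12-byte rows
        buffer.putInt(1).putDouble(1.5).putInt(2).putDouble(2.5).putInt(3).putDouble(4.0);
        buffer.flip();

        MiniRowBatch batch = new MiniRowBatch(2);
        batch.setDataBuffer(buffer);
        double total = 0;
        while (buffer.hasRemaining()) {
            batch.readRow();
            if (batch.full() || !buffer.hasRemaining()) {
                total += batch.sumPrices(); // flush the batch in bulk
                batch.reset();
            }
        }
        System.out.println(total);
    }
}
```

Keeping each column in its own primitive array lets the consumer run a simple loop per column, which is one plausible reading of the "vectorizable transformations and bulk inserts" goal.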
-
Nested Class Summary
Nested Classes
final class
    RowBatch.RawColumn subclass that represents raw byte data presented as ByteBuffer instances.
final class
final class
    RowBatch.Column subclass that decodes and stores primitive bytes.
final class
    RowBatch.Column subclass that decodes and stores primitive chars.
class
final class
    RowBatch.Column subclass that decodes and stores primitive doubles.
final class
final class
    RowBatch.Column subclass that decodes and stores enumerated String columns as primitive int ordinals.
final class
    RowBatch.Column subclass that decodes and stores primitive floats.
final class
    RowBatch.Column subclass that decodes and stores primitive ints.
final class
    RowBatch.Column subclass that decodes and stores primitive longs.
final class
    RowBatch.Column subclass that decodes and stores primitive shorts.
static interface
    Common interface for RowBatch.EnhancedStringColumn and RowBatch.EnumColumn to allow for runtime polymorphism based on binary store header info.
-
Constructor Summary
Constructors
RowBatch(int capacity, BinaryStoreV2HeaderInfo headerInfo)
    Make a row batch for the supplied header info and capacity.
-
Method Summary
Modifier and Type, Method, Description
static boolean columnStateless(SupportedType supportedType)
    Check whether this SupportedType is guaranteed to have a stateless RowBatch.Column that is thus safe for concurrent use.
boolean empty()
    Check if this row batch is empty and hasn't read any rows.
boolean full()
    Check if this row batch is full and can't read any more rows.
<CT> CT getColumn
    Get a column for use by the row buffer reader or processor.
static Class<?> getColumnClass(SupportedType supportedType)
    Get the class (or an upper bound) of RowBatch.Column implementations for a SupportedType.
getColumns
    Get a collection of all the column names.
boolean isCompatible(int capacity, BinaryStoreV2HeaderInfo headerInfo)
    Test whether this row batch can support the supplied capacity and headerInfo.
int readLength()
    Get the length available to read, which is readLimit() - readPosition().
int readLimit()
    Get the read limit, which is the index of the first row that should not be read in the next operation to consume data from this row batch.
int readPosition()
    Get the read position, which is the index of the first row that should be read in the next operation to consume data from this row batch.
void readRow()
    Read a row from the last data buffer set, which should be positioned to begin reading the presence bitmap.
void reset()
    Reset for the next batch, after consuming data from the RowBatch.Columns.
void setDataBuffer(ByteBuffer dataBuffer)
    Advise this row batch of the incoming data buffer to be used in subsequent calls to readRow().
void setReadBoundaries(int readPosition, int readLimit)
    Set the read position and read limit.
int size()
    Get the size of the current row batch.
-
Constructor Details
-
RowBatch
Make a row batch for the supplied header info and capacity.
Parameters:
capacity - The capacity
headerInfo - The header info
-
-
Method Details
-
isCompatible
Test whether this row batch can support the supplied capacity and headerInfo.
Parameters:
capacity - The capacity
headerInfo - The header info
Returns:
Whether this row batch can support the desired inputs
-
getColumn
Get a column for use by the row buffer reader or processor.
Parameters:
name - The column name
clazz - The expected column class
Returns:
The column
-
getColumns
Get a collection of all the column names. This is used for error reporting.
Returns:
All known column names
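A lookup by name and expected class, with the column names reused for error reporting, might look like the sketch below. The SketchColumn types, the SketchColumnRegistry class, the bound on CT, and the exceptions thrown are all assumptions made for illustration; the source does not show the parameter types of getColumn, the return type of getColumns, or their failure behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical column types; not the real RowBatch.Column hierarchy. */
abstract class SketchColumn { }

final class SketchIntColumn extends SketchColumn {
    final int[] values = new int[4];
}

final class SketchDoubleColumn extends SketchColumn {
    final double[] values = new double[4];
}

/** Hypothetical registry that resolves a column by name and expected class. */
final class SketchColumnRegistry {
    private final Map<String, SketchColumn> columns = new LinkedHashMap<>();

    void add(String name, SketchColumn column) {
        columns.put(name, column);
    }

    /** Return the named column as the requested concrete type. */
    <CT extends SketchColumn> CT getColumn(String name, Class<CT> clazz) {
        SketchColumn column = columns.get(name);
        if (column == null) {
            // The known names make the error message actionable.
            throw new IllegalArgumentException(
                    "Unknown column " + name + "; known columns: " + getColumns());
        }
        // Class.cast throws ClassCastException if the stored type does not match.
        return clazz.cast(column);
    }

    /** All known column names, in insertion order. */
    Set<String> getColumns() {
        return columns.keySet();
    }

    public static void main(String[] args) {
        SketchColumnRegistry registry = new SketchColumnRegistry();
        registry.add("Qty", new SketchIntColumn());
        registry.add("Price", new SketchDoubleColumn());

        SketchIntColumn qty = registry.getColumn("Qty", SketchIntColumn.class);
        qty.values[0] = 7;
        System.out.println(registry.getColumns());
    }
}
```

Resolving a column once and then working with its concrete final class keeps per-row code free of casts; whether the real class is used this way is not stated in the source.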
-
setDataBuffer
Advise this row batch of the incoming data buffer to be used in subsequent calls to readRow(). This row batch should record this information, and may need to clear cached objects (e.g. duplicate buffers) linked to the buffer. This is only valid on an "empty" row buffer (i.e. when size is 0). Invoking with null allows the current data buffer to be forgotten, but a new one must be set before subsequent reads.
Parameters:
dataBuffer - The data buffer
-
size
public int size()
Get the size of the current row batch. This is the number of complete rows successfully read using readRow() since the last reset().
Returns:
The size of the current row batch
-
empty
public boolean empty()
Check if this row batch is empty and hasn't read any rows.
Returns:
Whether this row batch is empty
-
full
public boolean full()
Check if this row batch is full and can't read any more rows.
Returns:
Whether this row batch is full
-
readRow
public void readRow()
Read a row from the last data buffer set, which should be positioned to begin reading the presence bitmap.
-
reset
public void reset()
Reset for the next batch, after consuming data from the RowBatch.Columns.
-
readPosition
public int readPosition()
Get the read position, which is the index of the first row that should be read in the next operation to consume data from this row batch.
Returns:
The read position
-
readLimit
public int readLimit()
Get the read limit, which is the index of the first row that should not be read in the next operation to consume data from this row batch.
Returns:
The read limit
-
readLength
public int readLength()
Get the length available to read, which is readLimit() - readPosition().
Returns:
The read length
-
setReadBoundaries
public void setReadBoundaries(int readPosition, int readLimit)
Set the read position and read limit. The read position is the index of the first row that should be read, and the read limit is the index of the first row that should not be read.
It's required that readPosition <= readLimit <= size.
Parameters:
readPosition - The read position
readLimit - The read limit
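The half-open window defined by the read position and read limit can be sketched as follows. ReadWindow is a hypothetical class written for this illustration. The non-negative check on readPosition, the default window covering all rows, and the choice of IllegalArgumentException are assumptions; the source states only that readPosition <= readLimit <= size is required.

```java
/** Hypothetical model of the read-boundary bookkeeping; not the real RowBatch. */
final class ReadWindow {
    private final int size;   // rows currently held by the batch
    private int readPosition; // first row to read (inclusive)
    private int readLimit;    // first row not to read (exclusive)

    ReadWindow(int size) {
        this.size = size;
        this.readPosition = 0;
        this.readLimit = size; // assumed default: the whole batch is readable
    }

    void setReadBoundaries(int readPosition, int readLimit) {
        // Documented requirement: readPosition <= readLimit <= size.
        // The lower bound of 0 is an extra assumption of this sketch.
        if (readPosition < 0 || readPosition > readLimit || readLimit > size) {
            throw new IllegalArgumentException(
                    "need 0 <= readPosition <= readLimit <= size, got "
                            + readPosition + ", " + readLimit + ", " + size);
        }
        this.readPosition = readPosition;
        this.readLimit = readLimit;
    }

    int readPosition() { return readPosition; }
    int readLimit() { return readLimit; }
    int readLength() { return readLimit - readPosition; }

    /** Example consumer: sum only the rows inside the window. */
    static long sum(int[] values, ReadWindow window) {
        long total = 0;
        for (int row = window.readPosition(); row < window.readLimit(); row++) {
            total += values[row];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30, 40, 50};
        ReadWindow window = new ReadWindow(values.length);
        window.setReadBoundaries(1, 4); // rows 1, 2, and 3
        System.out.println(window.readLength() + " rows, sum " + sum(values, window));
    }
}
```

Because the limit is exclusive, an empty window is expressed as readPosition == readLimit, and readLength() is always the number of rows a consumer will visit.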
-
columnStateless
Check whether this SupportedType is guaranteed to have a stateless RowBatch.Column that is thus safe for concurrent use.
Parameters:
supportedType - The SupportedType
Returns:
Whether the appropriate RowBatch.Column is guaranteed to be stateless and safe for concurrent use
-
getColumnClass
Get the class (or an upper bound) of RowBatch.Column implementations for a SupportedType.
Parameters:
supportedType - The SupportedType
Returns:
The appropriate class
-