Class RspArray<T extends RspArray>
- Direct Known Subclasses:
RspBitmap
A set representation for long values using Regular Space Partitioning (RSP) of the long space in "blocks" of (2^16) elements.
Modeled heavily after roaringbitmap.RoaringArray (keeping API method names and semantics as much as possible), with modifications for:
- Full "unsigned long" 64 bit range (as opposed to 32 bit in RoaringArray).
- Spans of all bits set ("AllSet") that can be arbitrarily big (ie, not constrained to 2^16 = RB Container size).
The handling of unsigned values follows RB; ie, key values are compared/sorted as unsigned longs.
Definitions:
- A "block" is a particular interval [n*2^16, (n+1)*2^16 - 1] of the long domain.
- A "span" is a partition of the domain consisting of one or more consecutive blocks; a span is a subset of the domain represented by an interval [n*2^16, (n+m)*2^16 - 1], m >= 1.
- Full blocks are blocks whose domain are fully contained in the set, ie, the set contains every possible value in the block's interval (as a bitmap, it would be "all ones").
- Spans of full blocks are represented by a single "full blocks span" object (just a Long) which knows how many 2^16 ranges it has (it's "full blocks span len" ("flen") is the number of full blocks in the span).
- Individual blocks that are not completely full are stored in an RB Container; their "full blocks span len" is zero.
Our internal representation uses two parallel arrays:
- a
long[] spanInfosarray that contains the information for the offset to the values in the span, which we call the span's "key". For instance, a full block span that represents all the long values in [65536, 196607] has as its key the value 65536. - an
Object[] spansarray that contains the actual spans. At the most basic level, a span can be either a full block span or a container of values (but there is nuance in exactly how to represent them, see below).
We use several optimizations to reduce memory utilization for sparse sets. Details follow.
The long[] spanInfos and Object[] spans data members of this class are used, combined,
to represent the offset (key) and span values in the set, against that offset.
The two arrays are used, together, as parallel arrays and the information for a given conceptual
span is contained in both of them for the same corresponding index i.
There are two basic cases for a span: it is either a full blocks span, containing a >=1 number of full blocks, or it is a container, containing individual values in the particular 2^16 block corresponding to the span's key.
There are four ways total that these two cases can be represented between the long in the `spanInfos` array and the Object in the `spans` array. Given a span at position `i`:
- If the corresponding
Object spans[i]is of type Long, then thelong spanInfos[i]value is the key for the span (with its lower 16 bits as zero), and the Long value represents how many full blocks are present. Example, the set [ 0, 2^50 - 1 ] is represented as spanInfo==0 and span==Long(2^34). - As an optimization to conserve memory, if the Object spans[i] is the Object reference with value
FULL_BLOCK_SPAN_MARKER(a singleton and final marker Object defined statically in this file). then the upper 48 bits of thelong spanInfo[i]value represent the key for the span, and the lower 16 bits of thelong spanInfo[i]value represent the full block span length. Example, the set [ 65536, 196607 ] is represented byspanInfo==65538andspan==FULL_BLOCK_SPAN_MARKER(note196607 == 65536*3 - 1, so the set is 2 full blocks, and65538 == 65536 | 2. - If the corresponding
Object spans[i]is null, then thelong spanInfos[i]represents the single value present in the span (note in this case, its upper 16 bits still corresponds to its key). Example, the set { 65537 } is represented by spanInfo==65537 and span==null. - If the corresponding
Object spans[i]is of typeshort[]or of typeContainer, then it represents a container of multiple values in a single block (but not all of the possible values in the block, since in that case it would be a full block span as above). In this case the higher 48 bits of its corresponding spanInfo represent the key for the span. Depending on the actual type of span there are two subcases:- If
spans[i]is of typeContainer, then the values in the roaringbitmaps container object are part of the set, considered against its key offset. The key is represented in the higher 48 bits of its corresponding spaninfo. The lower 16 bits of spanInfo are zero in this case. Example, the set [ 100,000-100,010, 100,020-100,030 ] is represented byspaInfo==65536,span==RunContainer({34464-34474, 34484-34494}) - If
spans[i]is of typeshort[], then anArrayContainerwith theshort[]contents needs to be reconstructed. The lower 16 bits of the spanInfo value are used to represent the other data members of ArrayContainer. This case exists as an optimization to reduce memory utilization for sparse blocks. For details of this reconstruction please see the code for the definition of the SpanView class below.
- If
Notes:
- Our version of RB Container supports a "shared" boolean flag that is used to implement copy-on-write (COW) semantics and allow operation results to share containers in COW fashion.
- We extended the Container class hierarchy to include specializations for empty, single value, single range, and two values containers. These are immutable; empty is used only as a way to return empty results, and are never actual stored in the spans array. For details, please see the Container class definition and derived class hierarchy.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionprotected static classstatic interfacestatic interfaceprotected static final class -
Field Summary
FieldsModifier and TypeFieldDescriptionstatic final intstatic final intstatic final intstatic final booleanstatic final Objectprotected long[]Array of keys (in the long's higher 48 bits) and other span data (in the long's lower 16 bits) parallel to the spans array, mapping the long value in a given array position to the corresponding span in the same position.protected Object[]Array of Spans parallel to the spanInfos array, mapping the same index to the corresponding span for the spanInfo.protected static final ThreadLocal<com.illumon.iris.db.v2.utils.rsp.RspArray.WorkData> -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionvoidIntersects this RspArray with the argument, leaving the result on this RspArray.voidRemove every element from argument from this RspArray.voidappendContainer(long k, Container c) voidappendFullBlockSpan(long k, long slen) protected voidappendSharedContainer(RspArray other, long otherSpanInfo, Container container) protected voidappendSharedContainerMaybePacked(RspArray other, int otherIdx, long otherSpanInfo, Object otherContainer) voidappendSingletonSpan(long v) protected final voidapplyKeyOffset(int i, long offset) voidapplyKeyOffset(long offset) intbinarySearchKeys(int fromIndex, int toIndex, IndexUtilities.Comparator comp) protected voidcollectRemovedIndicesIfAny(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu) doublebooleancontainsRange(long start, long end) protected voidcopyKeyAndSpanMaybeSharing(long shiftAmount, RspArray src, int srcIdx, long[] dstSpanInfos, Object[] dstSpans, int dstIdx, boolean tryShare) protected voidcopyKeyAndSpanMaybeSharing(RspArray src, int srcIdx, long[] dstSpanInfos, Object[] dstSpans, int dstIdx) protected voidcopyKeyAndSpanStealingContainers(int srcIdx, long[] srcSpanInfos, Object[] srcSpans, int dstIdx, long[] dstSpanInfos, Object[] dstSpans) static longdistanceInBlocks(long blockKeyStart, long blockKeyEnd) blockKeyEnd is exclusive.static longdivBlockSize(long v) protected voidensureSizeCanGrowBy(int n) Make sure there is capacity for at least n more spanslongfind(long val) longlongfirstValueAtIndex(int i) booleanbooleanbooleanforEachLongRangeInSpanWithOffsetAndMaxCardinality(int i, long offset, long maxCardinality, LongRangeAbortableConsumer larc) longget(long pos) longlonggetAverageRunLengthEstimate(int startIdx, int endIdx) longstatic longgetFullBlockSpanLen(long spanInfo, Object span) protected final longgetKey(int i) voidgetKeysForPositions(PrimitiveIterator.OfLong inputPositions, LongConsumer outputKeys) getOrderedKeysByKeyRange(long startValueInclusive, long endValueInclusive) getOrderedKeysByPosition(long startPositionInclusive, long length) static longgetRangeBatchIterator(long initialSeek, long maxCount) protected final longgetSingletonSpanValue(int i) longgetSpanCardinalityAtIndex(int i) longgetSpanCardinalityAtIndex(int i, boolean optimizeContainers) longintgetSpanIndex(int fromIndex, int endIndexExclusive, long key) intgetSpanIndex(int fromIndex, long key) intgetSpanIndex(long key) protected final longgetSpanInfo(int i) protected org.apache.commons.lang3.mutable.MutableObject<SortedRanges>getWorkSortedRangesMutableObject(com.illumon.iris.db.v2.utils.rsp.RspArray.WorkData wd) static longhighBits(long val) voidinsertContainerAtIndex(int i, long key, Container c) Insert a container at position i with key k, pushing the existing elements to the right.voidinsertFullBlockSpanAtIndex(int i, long key, long flen) Insert a full block span at position i with key k, pushing the existing elements to the right.protected voidinsertSharedContainer(int i, RspArray other, long otherSpanInfo, Container otherContainer) voidinsertSingletonAtIndex(int i, long value) Insert a new singleton span at position i with key k, pushing the existing elements to the right.static booleanbooleanisEmpty()static booleanprotected static booleanlonglongintkeySearch(int startPos, int endPosExclusive, long key) intkeySearch(int startPos, long key) longstatic shortlowBits(long val) static intlowBitsAsInt(long val) protected abstract Tmake()protected abstract Tprotected voidmarkIndexAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int index) protected voidmarkIndexRangeAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int iFirst, int iLast) static intmodBlockSize(long v) static longnextKey(long key) voidorEqualsShiftedUnsafeNoWriteCheck(long shiftAmount, RspArray other) For every element in other, add (element + shiftAmount) to this RspArray.voidFor every element in other, add element to this RspArray.booleanintoverlapsRange(int iStart, long start, long end) booleanoverlapsRange(long first, long last) Returns true if any value in this RspArray is contained inside the range [first, last], false otherwise.static longpaste(long highBits, short lowBits) longlongrangesCountUpperBound(int startIdx, int endIdx) voidvoidremoveRangeUnsafeNoWriteCheck(long start, long end) voidremoveSpanAtIndex(int i) voidreplaceSpanAtIndex(int i, RspArray.ArraysBuf buf) Replace the span at index i with the keys and spans from buf,voidsampleMetrics(LongConsumer rspParallelArraysSizeUsed, LongConsumer rspParallelArraysSizeUnused, LongConsumer arrayContainersBytesAllocated, LongConsumer arrayContainersBytesUnused, LongConsumer arrayContainersCardinality, LongConsumer arrayContainersCount, LongConsumer bitmapContainersBytesAllocated, LongConsumer bitmapContainersBytesUnused, LongConsumer bitmapContainersCardinality, LongConsumer bitmapContainersCount, LongConsumer runContainersBytesAllocated, LongConsumer runContainersBytesUnused, LongConsumer runContainersCardinality, LongConsumer runContainersCount, LongConsumer runContainersRunsCount, LongConsumer singleRangeContainersCount, LongConsumer singleRangeContainerCardinality, LongConsumer singletonContainersCount, LongConsumer twoValuesContainerCount, LongConsumer rpsFullBlockSpansCount, LongConsumer rspFullBlockSpanLen) intsearchSpanIndex(int startPos, IndexUtilities.Comparator comp) protected voidsetContainerSpan(int i, long key, Container c) protected voidsetContainerSpan(Container oldContainer, int i, long key, Container newContainer) protected voidsetContainerSpanRaw(int i, long key, Container container) protected static voidsetContainerSpanRaw(long[] spanInfos, Object[] spans, int i, long key, Container container) protected voidsetFullBlockSpan(int i, long key, long flen) protected static voidsetFullBlockSpanRaw(int i, long[] spanInfos, Object[] spans, long key, long flen) protected voidsetFullBlockSpanRaw(int i, long key, long flen) voidsetLastFullBlockSpan(long k, long slen) intsetOrInsertFullBlockSpanAtIndex(int newSpanIdx, long newSpanKey, long newSpanFlen, org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu) protected voidsetSharedContainerMaybePackedRaw(int i, RspArray src, int srcIdx, long srcSpanInfo, Object srcContainer) protected voidsetSharedContainerRaw(int i, RspArray other, long key, Container container) protected final voidsetSingletonSpan(int i, long value) protected final voidsetSingletonSpanRaw(int i, long value) protected booleanintsize()protected static longspanInfoToKey(long spanInfo) protected static longspanInfoToSingletonSpanValue(long spanInfo) protected TsubrangeByKeyInternal(long start, long end) protected TsubrangeByPosInternal(long firstPos, long lastPos) booleantoString()protected final TreeIndexImplvoidtryCompact(int compactFactor) voidtryCompactUnsafe(int compactFactor) static booleanuGreater(long k1, long k2) static booleanuGreaterOrEqual(long k1, long k2) static booleanuLess(long k1, long k2) static booleanuLessOrEqual(long k1, long k2) static longuMax(long k1, long k2) static longuMin(long k1, long k2) static intunsignedShortToInt(short x) static longunsignedShortToLong(short x) Methods inherited from class com.illumon.iris.db.v2.utils.RefCountedCow
acquire, canWrite, cowRef, deepCopy, getWriteRef, notifyAfterRelease, notifyBeforeAcquire, refCount, release, self
-
Field Details
-
debug
public static final boolean debug -
BLOCK_SIZE
public static final int BLOCK_SIZE- See Also:
-
BLOCK_LAST
public static final int BLOCK_LAST- See Also:
-
BITS_PER_BLOCK
public static final int BITS_PER_BLOCK -
spanInfos
protected long[] spanInfosArray of keys (in the long's higher 48 bits) and other span data (in the long's lower 16 bits) parallel to the spans array, mapping the long value in a given array position to the corresponding span in the same position. Please see the documentation for this class for details of the different cases for the lower 16 bits, depending on the type of span. Values are kept in unsigned sorted order according to higher 16 bits to enable binary search of keys. -
spans
Array of Spans parallel to the spanInfos array, mapping the same index to the corresponding span for the spanInfo. Please see the documentation for this class for the different possible types allowed and their meanings. -
FULL_BLOCK_SPAN_MARKER
-
workDataPerThread
protected static final ThreadLocal<com.illumon.iris.db.v2.utils.rsp.RspArray.WorkData> workDataPerThread
-
-
Constructor Details
-
RspArray
-
RspArray
-
-
Method Details
-
make
-
make
-
highBits
public static long highBits(long val) -
lowBits
public static short lowBits(long val) -
lowBitsAsInt
public static int lowBitsAsInt(long val) -
divBlockSize
public static long divBlockSize(long v) -
modBlockSize
public static int modBlockSize(long v) -
getSpanInfo
protected final long getSpanInfo(int i) -
spanInfoToKey
protected static long spanInfoToKey(long spanInfo) -
getKey
protected final long getKey(int i) -
getSingletonSpanValue
protected final long getSingletonSpanValue(int i) -
spanInfoToSingletonSpanValue
protected static long spanInfoToSingletonSpanValue(long spanInfo) -
setSingletonSpanRaw
protected final void setSingletonSpanRaw(int i, long value) -
setSingletonSpan
protected final void setSingletonSpan(int i, long value) -
applyKeyOffset
protected final void applyKeyOffset(int i, long offset) -
setFullBlockSpanRaw
protected void setFullBlockSpanRaw(int i, long key, long flen) -
setFullBlockSpanRaw
protected static void setFullBlockSpanRaw(int i, long[] spanInfos, Object[] spans, long key, long flen) -
setFullBlockSpan
protected void setFullBlockSpan(int i, long key, long flen) -
getPackedInfoLowBits
-
setContainerSpanRaw
-
setContainerSpanRaw
-
copyKeyAndSpanStealingContainers
-
copyKeyAndSpanMaybeSharing
-
copyKeyAndSpanMaybeSharing
-
setContainerSpan
-
setContainerSpan
-
isSingletonSpan
-
size
public int size() -
isEmpty
public boolean isEmpty() -
firstValue
public long firstValue() -
firstValueAtIndex
public long firstValueAtIndex(int i) -
keyForFirstBlock
public long keyForFirstBlock() -
keyForLastBlock
public long keyForLastBlock() -
lastValue
public long lastValue() -
getRangeIterator
-
getRangeBatchIterator
-
getIterator
-
getReverseIterator
-
ensureSizeCanGrowBy
protected void ensureSizeCanGrowBy(int n) Make sure there is capacity for at least n more spans -
tryCompactUnsafe
public void tryCompactUnsafe(int compactFactor) - Parameters:
compactFactor- if k == 0, compact if count < capacity. k > 0, compact if (capacity - count > (capacity >> k).
-
tryCompact
public void tryCompact(int compactFactor) -
keySearch
public int keySearch(int startPos, long key) -
keySearch
public int keySearch(int startPos, int endPosExclusive, long key) -
isFullBlockSpan
-
isContainer
-
getFullBlockSpanLen
-
getSpanIndex
public int getSpanIndex(long key) - Returns:
- if the key is included in some existing span, returns the index of that span. if the key is not included in any existing span, returns -(p - 1) where p is the position a span for the key would be inserted. Note that, since a span's covered interval may include multiple blocks, a key contained by a span may be different than its first key (if the span includes more than one block).
-
getSpanIndex
public int getSpanIndex(int fromIndex, long key) -
getSpanIndex
public int getSpanIndex(int fromIndex, int endIndexExclusive, long key) -
searchSpanIndex
-
getSpanCardinalityAtIndexMaybeAcc
public long getSpanCardinalityAtIndexMaybeAcc(int i) -
getSpanCardinalityAtIndex
public long getSpanCardinalityAtIndex(int i) -
getSpanCardinalityAtIndex
public long getSpanCardinalityAtIndex(int i, boolean optimizeContainers) -
getCardinality
public long getCardinality() -
nextKey
public static long nextKey(long key) -
distanceInBlocks
public static long distanceInBlocks(long blockKeyStart, long blockKeyEnd) blockKeyEnd is exclusive. Assumption on entry: blockKeyStart <= blockKeyEnd- Parameters:
blockKeyStart- inclusive start block key (only high 48 bits set).blockKeyEnd- exclusive end block key (only high 48 bits set).- Returns:
- distance in blocks between blockKeyStart and blockKeyEnd
-
setOrInsertFullBlockSpanAtIndex
public int setOrInsertFullBlockSpanAtIndex(int newSpanIdx, long newSpanKey, long newSpanFlen, org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu) - Parameters:
newSpanIdx- an index, as returned by getSpanAtIndex(k). Note this can be negative, in which case this is an insertion (existing elements pushed to the right as necessary).newSpanKey- the key.newSpanFlen- the number of 2^16 intervals.- Returns:
- the (positive) index where the span was actually inserted.
-
setLastFullBlockSpan
public void setLastFullBlockSpan(long k, long slen) -
appendSingletonSpan
public void appendSingletonSpan(long v) -
appendContainer
-
appendFullBlockSpan
public void appendFullBlockSpan(long k, long slen) -
insertFullBlockSpanAtIndex
public void insertFullBlockSpanAtIndex(int i, long key, long flen) Insert a full block span at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.- Parameters:
i- position in which to insertkey- key for the span to be insertedflen- the full block len for the span inserted.
-
insertSingletonAtIndex
public void insertSingletonAtIndex(int i, long value) Insert a new singleton span at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.- Parameters:
i- position in which to insertvalue- the singleton value for the span to be inserted
-
insertContainerAtIndex
Insert a container at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.- Parameters:
i- position in which to insertkey- key for the span to be insertedc- the container to be inserted
-
removeSpanAtIndex
public void removeSpanAtIndex(int i) -
replaceSpanAtIndex
Replace the span at index i with the keys and spans from buf, -
binarySearchKeys
-
unsignedShortToLong
public static long unsignedShortToLong(short x) -
unsignedShortToInt
public static int unsignedShortToInt(short x) -
paste
public static long paste(long highBits, short lowBits) -
get
public long get(long pos) -
getKeysForPositions
-
find
public long find(long val) -
subsetOf
-
containsRange
public boolean containsRange(long start, long end) -
overlaps
-
overlapsRange
public boolean overlapsRange(long first, long last) Returns true if any value in this RspArray is contained inside the range [first, last], false otherwise.- Parameters:
first- First value in the range to checklast- Last value in the range to check- Returns:
-
overlapsRange
public int overlapsRange(int iStart, long start, long end) -
orEqualsUnsafeNoWriteCheck
For every element in other, add element to this RspArray. The argument won't be modified (with the possible exclusion of sharing some of its containers Copy On Write).- Parameters:
other- the RspArray to add to this.
-
orEqualsShiftedUnsafeNoWriteCheck
For every element in other, add (element + shiftAmount) to this RspArray. Note shiftAmount is assumed to be a multiple of BLOCK_SIZE. The argument won't be modified (with the possible exclusion of sharing some of its containers Copy On Write).- Parameters:
shiftAmount- the amount to add to each key in other before insertionother- the base keys to add in the (key + shiftAmount) formula for insertion.
-
markIndexAsRemoved
protected void markIndexAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int index) -
markIndexRangeAsRemoved
protected void markIndexRangeAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int iFirst, int iLast) -
collectRemovedIndicesIfAny
protected void collectRemovedIndicesIfAny(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu) -
getWorkSortedRangesMutableObject
protected org.apache.commons.lang3.mutable.MutableObject<SortedRanges> getWorkSortedRangesMutableObject(com.illumon.iris.db.v2.utils.rsp.RspArray.WorkData wd) -
andNotEqualsUnsafeNoWriteCheck
Remove every element from argument from this RspArray. Argument won't be modified.- Parameters:
other- the elements to remove.
-
andEqualsUnsafeNoWriteCheck
Intersects this RspArray with the argument, leaving the result on this RspArray. The argument won't be modified.- Parameters:
other- an RspArray.
-
applyKeyOffset
public void applyKeyOffset(long offset) -
forEachLong
-
forEachLongRangeInSpanWithOffsetAndMaxCardinality
public boolean forEachLongRangeInSpanWithOffsetAndMaxCardinality(int i, long offset, long maxCardinality, LongRangeAbortableConsumer larc) -
forEachLongRange
-
subrangeByKeyInternal
-
subrangeByPosInternal
-
removeRangeUnsafeNoWriteCheck
public void removeRangeUnsafeNoWriteCheck(long start, long end) -
removeRangesUnsafeNoWriteCheck
-
uMax
public static long uMax(long k1, long k2) -
uMin
public static long uMin(long k1, long k2) -
uLess
public static boolean uLess(long k1, long k2) -
uLessOrEqual
public static boolean uLessOrEqual(long k1, long k2) -
uGreater
public static boolean uGreater(long k1, long k2) -
uGreaterOrEqual
public static boolean uGreaterOrEqual(long k1, long k2) -
toString
-
getOrderedKeysByPosition
-
getOrderedKeysByKeyRange
-
asOrderedKeys
-
getOrderedKeysIterator
-
getAverageRunLengthEstimate
public long getAverageRunLengthEstimate() -
getAverageRunLengthEstimate
public long getAverageRunLengthEstimate(int startIdx, int endIdx) -
rangesCountUpperBound
public long rangesCountUpperBound() -
rangesCountUpperBound
public long rangesCountUpperBound(int startIdx, int endIdx) -
valuesToString
-
containerOverhead
public double containerOverhead() -
sampleMetrics
public void sampleMetrics(LongConsumer rspParallelArraysSizeUsed, LongConsumer rspParallelArraysSizeUnused, LongConsumer arrayContainersBytesAllocated, LongConsumer arrayContainersBytesUnused, LongConsumer arrayContainersCardinality, LongConsumer arrayContainersCount, LongConsumer bitmapContainersBytesAllocated, LongConsumer bitmapContainersBytesUnused, LongConsumer bitmapContainersCardinality, LongConsumer bitmapContainersCount, LongConsumer runContainersBytesAllocated, LongConsumer runContainersBytesUnused, LongConsumer runContainersCardinality, LongConsumer runContainersCount, LongConsumer runContainersRunsCount, LongConsumer singleRangeContainersCount, LongConsumer singleRangeContainerCardinality, LongConsumer singletonContainersCount, LongConsumer twoValuesContainerCount, LongConsumer rpsFullBlockSpansCount, LongConsumer rspFullBlockSpanLen) -
tryCompact