When should I use whereIn
or naturalJoin
?
A good rule of thumb is to use whereIn
when your list of values (your right table) stays relatively stable. The whereIn
call maintains a minimal amount of state. The set of unique values in the right table is maintained in memory. Only the Index of rows that currently match the whereIn
filter is maintained for the left table. When the valid set changes, the columns that you are using to filter the left table must be scanned. When the right table changes infrequently, this is a useful trade-off. The left table is seldom scanned, and less memory and computation is used on each update.
naturalJoin
, on the other hand, must maintain a state entry for all values in both the left and right tables. This includes all keys that currently pass the filter, but also all keys that do not pass the filter. For each key, an Index data structure is maintained for the left and right side. When a value enters the right table, all of the matching left hand side rows are available and can be augmented with the corresponding right hand side row. When a value leaves the right table, all of the matching left hand side rows are available and the augmentation can be removed. This means that processing each individual left hand side update is more expensive, but the left hand table need not be scanned when the right hand side table changes.
Traders or firms often have a fairly stable set of symbols that they have a position in. Using a whereIn
clause to filter other tables to those positions is generally efficient. The query need not maintain excess state for the rare case in which the valid symbols change. However, when the positions do change, full table scans are required.
On the other hand, the set of open order IDs changes frequently. In this case, filtering a table by open order ID using whereIn
would require frequent scans. A naturalJoin
would require more in-memory state, but would be able to compute the results more incrementally.
Last Updated: 16 February 2021 18:06 -04:00 UTC Deephaven v.1.20200928 (See other versions)
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved