Advanced Table Methods
TableTools
Class com.illumon.iris.db.tables.utils.TableTools contains many methods for printing, loading, saving and modifying tables. All methods of TableTools are statically imported into each IrisDbGroovySession by default. Some of those functions follow:
Viewing data
| 
 | Prints the first few rows of a table to standard output. | 
| 
 | Prints the first few rows of a table to standard output, and also shows the details of the index used to retrieve the data. | 
| 
 | Returns the first few rows of a table as a pipe-delimited string. | 
| 
 | Returns an entire table formatted as HTML. Note that this should only be used on small tables, as the whole string is built in memory. | 
CSV files
| 
 | Loads a CSV file as an in-memory DB table. Column types are inferred from the data. | 
| 
 | Writes a DB table out as a CSV file. | 
Binary log files
| 
 | Loads a binary log file as an in-memory DB table. Requires access to the listener class that matches the file to be loaded. | 
Manually creating tables and columns
| 
 | Creates an empty table with size rows. | 
| 
 | Creates an empty table with size rows, and column names and types from  | 
| 
 | Creates a new table with the specified columns. | 
| 
 | Creates a new data column. | 
Manually creating tables and columns
Users may want to create columns for a new table out of arrays. The syntax is as follows:
from deephaven import *  # deephaven.TableTools module imported as as ttools
myTable = ttools.newTable(
    ttools.col("StringColumn", "Str1", "Str2", "Str3", "Str4"),
    ttools.intCol("IntegerColumn", 1, 2, 4, 8),
    ttools.doubleCol("Decimals", 1.0, 2.0, 4.0, 8.0),
    ttools.charCol("Characters", ["A", "B", "C", "D"]))
myStrings = ["Str1", "Str2", "Str3", "Str4"] as String[]
myInts = [1, 2, 4, 8] as int[]
myDoubles = [1.0, 2.0, 4.0, 8.0] as double[]
myCharacters = ['A', 'B', 'C', 'D'] as char[]
myTable = newTable(
        col("StringColumn", myStrings),
        intCol("IntegerColumn", myInts),
        doubleCol("Decimals", myDoubles),
        charCol("Characters" , myCharacters)
)
Each line of the query specifies the data type for the column, its name, and the data for that column in the form of an array. In the Groovy script, each array is first assigned to a variable, which is then included in the query that generates the new table.
More
| 
 | Combines the tables into an aggregate table. The resulting table comprises the rows of each of the Input Tables, in the given order of the inputs. Null inputs are skipped. | 
| 
 | Computes the difference of two tables for use in verification. | 
TableManagementTools
Class com.illumon.iris.db.tables.utils.TableManagementTools contains many methods for managing tables. The TableManagementTools class is not imported by default, and would need to be manually imported into the IrisDbGroovySession to access its methods. Some of those functions follow:
Simple binary files
| 
 | Loads a DB binary table. | 
| 
 | Saves a table to disk in DB binary format. | 
| 
 | Deletes a DB binary format table from disk. | 
Directory
| 
 | Gets all of the directories for an on-disk database. | 
Add column(s)
| 
 | Adds a new column to a table definition. | 
Tables Derived from Functions
There are two factories that allow you to create a table from the result of a function:
- The HashSetBackedTableFactorycreates a table with one row for each SmartKey in a hash set (order is not preserved).
- The FunctionGeneratedTableFactoryallows you to assemble a table in your function, and the contents of the output table are replaced with the new values.
HashSetBackedTableFactory
An abstract table that represents a HashSet of SmartKeys. Since we are representing a set, we are not defining an order to our output. Whatever order the table happens to end up in, is fine. The table will refresh by regenerating the full HashSet (using the setGenerator function passed in); and then comparing that to the existing HashSet.
/**
 * Create a ticking table based on a setGenerator.
 *
 * @param setGenerator a function that returns a HashSet of SmartKeys; each SmartKey is a row in the output.
 * @param refreshIntervalMs how often to refresh the table, if <= 0 the table does not tick.
 * @param colNames the column names for the output table, must match the number of elements in each SmartKey.
 * @return a table representing the Set returned by the setGenerator
 */
 com.illumon.iris.db.v2.utils.HashSetBackedTableFactory.create(Function.Nullary<HashSet<SmartKey>> setGenerator, int refreshIntervalMs, String... colNames)
FunctionGeneratedTableFactory
This factory allows you to create an arbitrary table using operations like pulling data out of another table and manipulating it using Java code. The table will refresh by regenerating the full values (using the tableGenerator function passed in). The resultant table's values are copied into the result table and appropriate listener notifications are fired.
All of the rows in the output table are modified on every tick, even if no actual changes occurred. The output table also has a contiguous index.
The generator function must produce a table, and the table definition must not change between invocations.
If you are transforming a table, you should generally use the regular table operations as opposed to this factory because table operations are capable of performing some operations incrementally. However, for small tables, this might prove to require less development effort.
There are two creation methods, and each has a different refresh policy. One is time-based and the other is based on a set of ticking tables:
Time-based:
public static Table create(Function.Nullary<Table> tableGenerator, int refreshIntervalMs);
Ticking tables-based:
public static Table create(Function.Nullary<Table> tableGenerator, DynamicTable ...sourceTables);
The query engine does not know the details of your function inputs. If you are dependent on a ticking table in your tableGenerator function, you can add it to the sourceTables list so the function will be recomputed on each tick. Note: If you have an expensive computation, keep in mind it will be recomputed -- from scratch -- each time any of the tables listed in sourceTables tick, even if your result would not change.
An appropriate use of these methods would be to reformat data that you've used other table operations to create, without the need to cast things like transposition into a table operation.
Array Access
The following methods provide array access to columns in a table, and create a row index to locate specific values.
A column name followed by an underscore represents a column as an array of all values within the column. For example, Sym_ creates an array of all the column's Symbols:
t.update("SA=Sym_")
However, you may want to narrow your results and ask for the element in a certain row. In that case, you should create a row index for your table:
t.update("Index=i")// a positional row index, as an Integer
t.update("Index=ii") // a positional row index, as a Long
The variables i and ii both represent row numbers. However, Integers are limited to values up to 231-1, while Longs can represent values up to 263-1. In other words, to avoid precision problems you should use the ii variable; unless you specifically need an Integer to pass to another function.  Using the i variable in a table with more than 2 billion rows will result in an error.
The row index allows users to pinpoint certain values within an array. Within a query, square bracket operators get an element from the array based on the row index. For example, Sym_ creates an array, and Sym_[ii-1] uses the index to go one row back, or get the Symbol value in the previous row.  Sym_[ii+3] creates an array, and then gets the Symbol value three rows ahead of the current row:
t.update("X=Sym_[ii-1]")
t.update("Y=Sym_[ii+3]")
Note: these methods are unreliable within a ticking table. When new rows are inserted within the table (as a result of multiple internal partitions), previously created row indexes do not automatically update, which will cause inconsistent results.
A third variable, k, provides a unique reference for a particular row in the table:
t.update("Index=k")// the internal index within the table
This method is not used for array access, and should only be used in limited circumstances such as debugging or advanced query operations. To use this method properly, you must know the index key of a row. The internal index is not stable between operations and values are not necessarily contiguous. As with the i and ii row indexes, the k index is unreliable for live or intraday data due to query engine operations.
Let's take a look at an example query that combines these methods:
t=t.update("Difference = (Sym=Sym_[ii-1]) ? Price-Price_[ii-1] : null")
This asks for the difference in price in Symbol values using a ternary operator, which returns results based upon a true or false comparison. We want to compare the same Symbol, so the first clause in the formula checks if the value of the Sym in the current row equals the value in the previous row. Then, we want to determine the current price minus the price in the previous row. If the Symbol is not the same in that row (a false comparison), we get a "null" value. If the comparison is true, the formula will return the difference in price. 
Last Updated: 25 February 2020 08:26 -05:00 UTC Deephaven v.1.20200121 (See other versions)
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved