Managing Data in User Tables
Overview
When using Deephaven, you may want to store the results of a query for later use. Deephaven supports saving data both within the database and in external formats. Deephaven allows users to store their own tables in the database. This section of the documentation discusses managing data in user tables.
See also: Table Storage
Unlike system tables, user tables do not require a schema file. User tables can be stored in any namespace that does not contain system tables. Namespaces do not need to be configured in advance; they are created automatically when a query creates a user table. Once data is stored in a user table, all users with the appropriate permissions can access it.
There are two types of user tables:
- Directly manipulated user tables, stored on a locally accessible filesystem. These may be non-partitioned or partitioned (see below).
- Centrally managed user tables, written by the Remote Table Appender, are always partitioned.
User tables can be added, replaced, or deleted at any time.
All user table definitions exist in /db/Users. However, each section of this documentation (Non-Partitioned Direct User Tables, Partitioned Direct User Tables, and Centrally Managed User Tables represent distinct APIs and these methods should not be mixed.
Partitions
Partitions are distinct segments of data that can be managed independently.
- A non-partitioned, or splayed table, is not divided into individual sections: the table exists as a whole.
- With a partitioned table, one column is designated as the "partitioning column." All rows with the same value in this column are placed in the same partition. Each partition constitutes an independent section of the table. Individual partitions can be added, replaced, or removed without altering data from other partitions. The most common partitioning column is Date. Partitioning tables by date simplifies the management of tables containing data over long time periods.
- To add a partitioned user table to the database, you must first add the table definition. The table definition can be taken from another table, such as the table whose data will be added to the new user table.
Non-Partitioned Direct User Tables
Several methods are used to manage content within directly manipulated, non-partitioned user tables.
Adding a Table
The addTable method is used to create a new, non-partitioned direct user table in a namespace, and simultaneously writes the table and its definition. Arguments to the method include the namespace, the new table name and the variable from which the data is copied. The following example copies data from queryResults and then uses that data to create a new non-partitioned user table called NewTableName in the ExampleUserNamespace:
db.addTable("ExampleUserNamespace", "NewTableName", queryResults)
Note: The operation will fail if the definition already exists in an existing table named NewTableName.
Appending to a Table
The appendTable method is used to append data onto an existing non-partitioned direct user table. The following example copies data from queryResults and then appends the content to an existing user table called ExampleTable in the ExampleUserNamespace:
db.appendTable("ExampleUserNamespace", "ExampleTable", queryResults)
Replacing a Table
The replaceTable method is used to replace (not append or modify) data in an existing non-partitioned direct user table. Arguments to the method include the namespace, the name of the table containing the data to be replaced, and the variable from which the new data is copied. The following example copies data from queryResults and then replaces the content in an existing user table called ExampleTable in the ExampleUserNamespace:
db.replaceTable("ExampleUserNamespace", "ExampleTable", queryResults)
Note: If no table exists with the namespace and table name provided, replaceTable()behaves identically to addTable().
Removing a Table
The removeTable method removes an existing non-partitioned user table and its definition from the specified namespace. The following example removes an existing user table called ExampleTable from the ExampleUserNamespace:
db.removeTable("ExampleUserNamespace", "ExampleTable")
Other Methods
deleteTables will find all table locations and delete them. This may be necessary if any data is corrupt. 
deleteMatchingTables will delete all tables with names matching the regular expression. This example regex matches tables beginning with "Test":
 db.deleteMatchingTables("ExampleUserNamespace" , "Test.*")
deleteNamespace will delete the entire namespace and all tables and data therein, and should be used with caution. Do not use if there are centrally managed user tables in the namespace.
Partitioned Direct User Tables
To add a directly manipulated, partitioned user table to the database, you must designate a partition and create the table definition. The table definition can be taken from another table, such as the table whose data will be added to the new user table.
The examples below demonstrate creating and modifying a partitioned table based on data stored as the queryResult variable. The table will be partitioned by date.
Adding a Table Definition for a Partitioned Direct User Table
The addPartitionedTableDefinition method is used create a partitioned table definition in the database. The arguments provided to the method are:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- partitioningColumnName: The column by which the data will be partitioned.
- tableDefinition: A table definition, which specifies the columns to be saved. This can be taken from an existing table with the- getDefinition()method.
The code below demonstrates the creation of a table called MyQueryResults in a namespace called ExampleUserNamespace. The table will be partitioned by the Date column, and will use the table definition (thus, the columnset) of the table stored in the queryResults variable:
addPartitionedTableDefinition("ExampleUserNamespace", "MyQueryResults", "Date", queryResults.getDefinition())
Using addTablePartitionAndDefinition() is identical to calling addPartitionedTableDefinitionIfAbsent() immediately before calling addTablePartition().
Storing Data in a Partitioned Direct User Table
To store data from queryResults to the May 1, 2017 partition of the MyQueryResults table, use the addTablePartition method:
db.addTablePartition("ExampleUserNamespace", "MyQueryResults", "2017-05-01", queryResults)
The operation will fail if data already exists in the 2017-05-01 partition of the MyQueryResults table. It will also fail if the table definition MyQueryResults table has not been added yet.
It is possible to automatically add the table definition when storing the data. This can be accomplished with the addTablePartitionAndDefinition method:
db.addColumnPartitionAndDefinition ("ExampleUserNamespace", "MyQueryResults", "2017-05-01", queryResults)
Replacing Data in a Partitioned Direct User Table
To replace the existing data stored under the May 1, 2017 partition of the MyQueryResults, use the replaceTablePartition method:
db.replaceTablePartition("ExampleUserNamespace", "MyQueryResults", "2017-05-01" queryResults)
If no data exists at the partition name provided, replaceTablePartition() behaves identically to addTablePartition().
The replaceTableDefinition may also be used. 
Removing Data from a Partitioned Direct User Table
To remove  the May 5, 2017 partition of the MyQueryResults table, use the removeTablePartition method:
db.removeTablePartition("ExampleUserNamespace", "MyQueryResults", "2017-05-01")
The following can be used to remove the entire table:
db.removeTable("ExampleUserNamespace" , "MyQueryResults").
This removes both the table and its definition.
Centrally Managed User Tables
These tables are always partitioned and are written by the Remote Table Appender. Although the methods for specifying definitions are the same as with directly managed user tables, there is no other overlap.
Adding a Table Definition for a Centrally Managed User Table
To create a centrally managed user table, the definition must exist first using the addPartitionedTableDefinition method. The table data is appended to a system-defined internal partition, and the specified column partition value. The arguments provided to the method are:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- partitioningColumnName: The column by which the data will be partitioned (e.g., "2019-01-11" for a Date partitioning column).
- tableDefinition: A table definition, which specifies the columns to be saved. This can be taken from an existing table with the- getDefinition()method.
The code below demonstrates the creation of a table called ExampleQueryResults in a namespace called ExampleUserNamespace. The table will be partitioned by the Date column, and will use the table definition (thus, the columnset) of the table stored in the queryResults2 variable:
addPartitionedTableDefinition("ExampleUserNamespace", "ExampleQueryResults", "Date", queryResults2.getDefinition())
Storing Data in a Centrally Managed User Table
To add rows to a centrally managed user table from a table object (i.e., a query table or a memory table),  use the appendCentral method. 
The method takes the following arguments:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- 
                                            columnPartitionValue: The column partition value (e.g., "2019-01-11" for a Date partitioning column).
- newData: The table containing the new data to be appended.
- flush(boolean):- trueto flush the table to disk;- falseto wait for flushing (if the implementation supports waiting).
The code below will append rows onto to the partitioned user table, ExampleQueryResults, created in the previous example: 
db.appendCentral("ExampleUserNamespace","ExampleQueryResults", "2018-08-27", queryResults3, true)
In this example, rows from a different table, queryResults3, from the specified date will be added to ExampleQueryResults. The final argument, true, flushes data to the disk immediately. The appendCentral method is useful for persisting data out to disk that is generated in a user script. 
The iris.defaultColumnPartitionNamingFunction property determines the default partitioning column for all user tables. 
To use the default partitioning column, the appendCentral method takes the following arguments:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- newData: The table containing the new data to be appended.
- flush(boolean):- trueto flush the table to disk;- falseto wait for flushing.
Replacing Data in a Centrally Managed User Table
Unlike directly manipulated user tables, there are no distinct methods to replace data. You can delete the data (see below) and append new data to the table. There is no database call to delete a single centrally managed partition. You can create a Remote Table Appender (RTA) and use it directly.
Removing Data from a Centrally Managed User Table
To remove the all of the data from the ExampleQueryResults table, use the closeAndDeleteCentral method:
db.closeAndDeleteCentral("ExampleUserNamespace", "ExampleQueryResults")
To delete the table definition itself, use the removeTableDefiniton method:
db.removeTableDefinition("ExampleUserNamespace", "ExampleQueryResults")
Other Methods
The logPartitionTableSnapshot method snapshots a table at a regular interval and writes the snapshot to a partitioned table. The arguments to the method follow:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- partitioningColumnName: The column by which the data will be partitioned.
- columnPartitionValue: The column partition value (e.g., "2019-02-11" for a Date partitioning column).
- table: The table to snapshot and log (does not contain the partition column).
- logInterval: The interval, in milliseconds, to log snapshots at.
- replayer: The data replayer.
- append(boolean):- trueto append to an existing partition;- falseto replace the existing partition.
The logPartitionIncremental method writes all changes in an input table to a linked partitioned table.  The arguments to the method follow:
- namespace: The namespace under which the table is stored.
- tableName: The name used to save the table.
- partitioningColumnName: The column by which the data will be partitioned.
- columnPartitionValue: The column partition value (e.g., "2019-02-11" for a Date partitioning column).
- table: The table to log (does not contain the partition column).
- append(boolean):- trueto append to an existing partition;- falseto replace the existing partition.
Last Updated: 23 September 2019 12:17 -04:00 UTC Deephaven v.1.20181212 (See other versions)
Deephaven Documentation Copyright 2016-2019 Deephaven Data Labs, LLC All Rights Reserved