Preemptive Tables
Overview
Deephaven provides the ability to share tables between queries. Table sharing is typically accomplished via viewport tables or Preemptive Tables.
Viewport tables provide a consistent view of a portion of a table. The Deephaven GUI uses viewport tables when displaying tables in GUI panels. Only the portion of the table being "viewed" is delivered to the GUI. This allows users to manipulate large tables — tens of millions of rows — directly within the Deephaven user interface, with minimal use of CPU, memory and network bandwidth on their computers.
With Preemptive Tables, the query processor automatically pushes a consistent snapshot of all data from a table on the server to subscribed clients at regular intervals. Preemptive Tables are typically used when:
- One query performs expensive data analysis and shares the results with other queries.
For example, an existing query may already use extensive computing power and time to analyze a large dataset, which subsequently creates a much smaller table containing the results of that analysis. By sharing this small result table, other queries and other users can use the result without needing to rerun the original (and expensive) query multiple times for each additional query and/or user.
- Users do not have permissions to access raw data but do have permissions to access results derived from the raw data.
For example, a trading firm may decide that competing trading groups are not allowed to see each other's positions, but all trading groups can see the aggregate positions for the firm. A query could generate a Preemptive Table containing the aggregate positions for the firm and share it with all trading groups.
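The aggregation idea in the trading example can be illustrated with a minimal plain-Python sketch. It does not use the Deephaven API; the group names, symbols, quantities, and the firm_aggregate helper are all hypothetical and exist only to show how a shared result can omit the per-group detail.

```python
from collections import defaultdict

# Hypothetical per-group positions: (trading group, symbol, quantity).
# In the scenario above, each group may see only its own rows.
positions = [
    ("GroupA", "AAPL", 100),
    ("GroupA", "MSFT", -50),
    ("GroupB", "AAPL", -30),
    ("GroupB", "MSFT", 75),
]


def firm_aggregate(rows):
    """Sum quantities per symbol, dropping the trading-group column."""
    totals = defaultdict(int)
    for _group, symbol, quantity in rows:
        totals[symbol] += quantity
    return dict(totals)


# The aggregate carries no group-level information and is safe to share.
firm_positions = firm_aggregate(positions)
print(firm_positions)
```

In a real deployment, the aggregate would be computed by the publishing query and only that aggregate table would be published as a Preemptive Table.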
During a Preemptive Table refresh, the entire table is sent over the network to subscribed clients. Therefore, care should be taken to ensure that (1) the table size will not overwhelm clients during the initial connection, and (2) the table's update frequency will not cause network congestion. Preemptive Tables solve some important problems, but they should be employed judiciously.
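A rough back-of-the-envelope estimate can help when weighing table size against update frequency. The sketch below is plain Python, not part of the Deephaven API. It assumes the worst case, in which the full table is pushed to every client on every refresh interval; the estimate_push_load helper and the bytes_per_row figure are hypothetical.

```python
def estimate_push_load(rows, bytes_per_row, refresh_interval_ms, clients):
    """Return (bytes per snapshot, worst-case bytes per second for all clients).

    Assumes every refresh sends the whole table to every subscribed client.
    """
    snapshot_bytes = rows * bytes_per_row
    bytes_per_second = snapshot_bytes * clients * 1000 / refresh_interval_ms
    return snapshot_bytes, bytes_per_second


# Example: 50,000 rows of about 100 bytes each, pushed every 5 seconds
# to 20 subscribed clients (all figures are illustrative assumptions).
snapshot_bytes, bytes_per_second = estimate_push_load(
    rows=50_000, bytes_per_row=100, refresh_interval_ms=5000, clients=20
)
print(f"snapshot size: {snapshot_bytes / 1e6:.1f} MB per client")
print(f"sustained load: {bytes_per_second / 1e6:.1f} MB/s across all clients")
```

If the estimate looks uncomfortable, the options are to shrink the published table, lengthen the refresh interval, or reduce the number of subscribers.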
Publishing a Preemptive Table
Creating and publishing a Preemptive Table for sharing is very easy. Consider the example persistent query below, which is named ExamplePreemptivePublisher and is managed by a user named Mike. The query computes a result table named result and then creates and publishes a preemptive version of the result table named resultPre. In this case, the preemptive resultPre table refreshes every five seconds (5000 ms).
Python:
from deephaven import *
# perform some calculations to produce a result table
result = db.timeTable("00:00:03")
# create and publish a preemptive version of the result table
# which refreshes every 5 seconds
resultPre = result.preemptiveUpdatesTable(5000)
Groovy:
// perform some calculations to produce a result table
result = db.timeTable("00:00:03")
// create and publish a preemptive version of the result table
// which refreshes every 5 seconds
resultPre = result.preemptiveUpdatesTable(5000)
To create a Preemptive Table, the preemptiveUpdatesTable method is called on the source table. The only argument is the refresh interval, specified in milliseconds. This value determines how often the query will push data changes to connected clients. The refresh interval should always be greater than or equal to 1,000 milliseconds.
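Because the interval is given in milliseconds and has a recommended lower bound, a small guard can catch unit mistakes before a table is published. The following is a minimal sketch in plain Python; the checked_refresh_interval_ms helper and the MIN_REFRESH_INTERVAL_MS constant are hypothetical names, not part of the Deephaven API.

```python
# Lower bound recommended in the text above (hypothetical constant name).
MIN_REFRESH_INTERVAL_MS = 1000


def checked_refresh_interval_ms(seconds):
    """Convert seconds to milliseconds and enforce the recommended minimum."""
    interval_ms = int(round(seconds * 1000))
    if interval_ms < MIN_REFRESH_INTERVAL_MS:
        raise ValueError(
            f"refresh interval {interval_ms} ms is below the recommended "
            f"minimum of {MIN_REFRESH_INTERVAL_MS} ms"
        )
    return interval_ms


# A five-second interval, as in the publishing example.
print(checked_refresh_interval_ms(5))
```

Inside a Deephaven query, the returned value could then be passed to the method described above, for example result.preemptiveUpdatesTable(checked_refresh_interval_ms(5)).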
Subscribing to a Preemptive Table
To subscribe to a Preemptive Table, (1) a query must exist that publishes the Preemptive Table, and (2) you must have sufficient permissions to access the Preemptive Table.
The example below assumes the query from the prior section is being run as a persistent query owned by Mike and named ExamplePreemptivePublisher. (The query owner and query name can be found in the Query Config window.) In the example, the query first connects to Mike's ExamplePreemptivePublisher query. Once the connection is established, the resultPre table from the query is retrieved as t, which can now be used like any other table.
Python:
from deephaven import *
# connect to Mike's ExamplePreemptivePublisher query
client = PersistentQueryTableHelper.getClientForPersistentQuery(3*60*1000, owner="Mike", name="ExamplePreemptivePublisher")
# get the "resultPre" table from Mike's ExamplePreemptivePublisher query
t = PersistentQueryTableHelper.getPreemptiveTableFromPersistentQuery("resultPre", helper=client)
# use the table to perform calculations
t2 = t.where("i>10")
# connect to the ExamplePreemptivePublisher query by using its serial number
client = PersistentQueryTableHelper.getClientForPersistentQuery(3*60*1000, configSerial=1502904477776000000)
Groovy:
import com.illumon.iris.controller.utils.PersistentQueryTableHelper
// connect to Mike's ExamplePreemptivePublisher query
client = PersistentQueryTableHelper.getClientForPersistentQuery(log, "Mike", "ExamplePreemptivePublisher", 3*60*1000)
// get the "resultPre" table from Mike's ExamplePreemptivePublisher query
t = PersistentQueryTableHelper.getPreemptiveTableFromPersistentQuery(client, "resultPre")
// use the table to perform calculations
t2 = t.where("i>10")
// connect to the ExamplePreemptivePublisher query by using its serial number
client = PersistentQueryTableHelper.getClientForPersistentQuery(log, 1502904477776000000, 3*60*1000)
In the example, the timeout argument is 3*60*1000 milliseconds, so the connection will time out and the query will fail to run if a connection to ExamplePreemptivePublisher cannot be established within three minutes.
In addition to connecting to a query using the query owner and query name, it is also possible to connect to a query by using the query's unique serial number, which is also displayed in the Query Config window. As shown in the following example client script, the query's serial number replaces the query owner's name and the query name:
// connect to the ExamplePreemptivePublisher query by using its serial number
client = PersistentQueryTableHelper.getClientForPersistentQuery(log, 1502904477776000000, 3*60*1000)
Note: These examples can be run in your own instance of Deephaven. However, you will need to change the respective user name, query name and/or the query's serial number as needed based on how you configured the persistent queries.
Last Updated: 25 February 2020 08:26 -05:00 UTC Deephaven v.1.20200121
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved