More To Know
Table Tricks
In addition to the code snippets included below, all of the Groovy and Python code for the recipes below can be downloaded as premade Deephaven Notebooks. To learn more about using Deephaven Notebooks, please refer to the Notebook section of the User Manual.
Download the Query Cookbook Notebooks
This section demonstrates a variety of methods related to creating, manipulating and gathering information about tables.
Step 1: Pull trades from a certain date and remove some columns and rows.
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("Date", "ExchangeTimestamp", "Sym", "Last")
.head(10)
Step 2: Create a custom row.
The following query creates an empty table of size 1 (i.e., a table with one empty row) and populates it with the same columns and column value types as those found in the trades
table.
time = convertDateTime("2017-08-25T04:05:00 NY")
custom = emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")
Step 3: Add the custom row to the bottom of the trades table.
tradesNewRow = merge(trades, custom)
Step 4: Show the invisible index column.
Deephaven tables have an invisible column, i
, which holds the position for each row. Let's make it visible.
indexes = tradesNewRow.updateView("Index=i")
Step 5: Utilize a column as an array of values by placing an underscore after column name.
The following query will create a new column showing the previous price for every trade:
prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")
Last_
allows you to access column values as if it's an array. The brackets allow access to a column value of your choosing. i - 1
is the previous index; remember that i
represents current row position. Note that the first trade has no previous price because there are no previous trades before that time.
Step 6: Using the new column, calculate the difference between current and previous price.
difference = prevPrice.updateView("Difference=Last-PreviousPrice")
Step 7: Show the table's metadata.
When integrating custom code into Deephaven queries, it may be useful to look at the table's metadata, which contains information about the table and the data it holds.
meta = difference.getMeta()
Step 8: Finally, access a value from a single cell in the table.
For this example, the value in the 6th row (index value 5) from the Difference
column will be shown in the Log
panel.
cellValue = difference.getColumn("Difference").get(5)
println cellValue
Complete Code for the Table Tricks Recipe
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("Date", "ExchangeTimestamp", "Sym", "Last")
.head(10)
time = convertDateTime("2017-08-25T04:05:00 NY")
custom = emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")
tradesNewRow = merge(trades, custom)
indexes = tradesNewRow.updateView("Index=i")
prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")
difference = prevPrice.updateView("Difference=Last-PreviousPrice")
meta = difference.getMeta()
cellValue = difference.getColumn("Difference").get(5)
println cellValue
# Initial import
from deephaven import *
print("Provides:\n"
"\tdeephaven.Calendars as cals\n"
"\tdeephaven.ComboAggregateFactory as caf\n"
"\tdeephaven.DBTimeUtils as dbtu\n"
"\tdeephaven.Plot as plt\n"
"\tdeephaven.Plot.figure_wrapper as figw\n"
"\tdeephaven.QueryScope as qs\n"
"\tdeephaven.TableManagementTools as tmt\n"
"\tdeephaven.TableTools as ttools\n"
"See print(sorted(dir())) for the full namespace contents.")
# Table Tricks
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
.view("Date", "ExchangeTimestamp", "Sym", "Last")\
.head(10)
time = dbtu.convertDateTime("2017-08-25T04:05:00 NY")
custom = ttools.emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")
tradesNewRow = ttools.merge(trades, custom)
indexes = tradesNewRow.updateView("Index=i")
prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")
difference = prevPrice.updateView("Difference=Last-PreviousPrice")
meta = difference.getMeta()
cellValue = difference.getColumn("Difference").get(5)
print (cellValue)
Bad Math
Be careful of invalid math operations like dividing by zero.
Step 1: Pull trades from a certain date and remove some columns.
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("ExchangeTimestamp", "Sym", "Last", "Size")
Step 2: Calculate the dollars involved in each trade.
dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")
Step 3: Calculate the total dollars involved, total shares traded, and the last price for each symbol.
summary = dollarsInvolved.by(AggCombo(AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), AggLast("Last")), "Sym")
In some cases, some rows may have a zero value for the size. While this dataset does not, we can add such a row to demonstrate challenges when working with zero values.
Step 4: Create a table with one row, with the same column names, to represent the SPX index, including a zero value for size. Then merge the new table to previous table.
spx = emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = merge(summary, spx)
Step 5: Calculate the volume weighted average price (VWAP).
vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")
As shown, an empty cell was added to the SPX
row. This is because there was division by a zero value. It is important to note that even though cells may appear empty, they always contain either a null or NaN value. You always want to avoid NaN values. The empty cell inthe vwap
table contains a NaN value.
Step 6: To demonstrate, perform a sum aggregation on VWAP
.
sum = vwap.view("VWAP").sumBy()
Eleven cells have been aggregated, but because the empty cell contains a NaN value, the other ten values were discarded. This would not have happened had the cell contained a null value. Null values will not impact aggregation calculations. However, because NaN values are the result of invalid math operations, they will poison your calculations.
Step 7: Calculate VWAP again, except this time take care when dividing by zero.
This time, we are using a ternary operator to ensure a zero value is not used as the denominator in the calculation. In essence, the if/then logic of the ternary operator in this query means: "if TotalSize
is not zero, then VWAP
will be TotalDollars
divided by TotalSize
; otherwise, VWAP
will be null".
vwapIfThen = withSpx.updateView("VWAP =(TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")
An empty cell remains present in the table. However, we used the ternary operator to ensure the value was null and not NaN.
Step 8: Rerun the sum aggregation on VWAP.
sumIfThen = vwapIfThen.view("VWAP").sumBy()
This time the aggregation succeeded as intended. Note: You can also determine if your empty cells contain null values or not by using isNull
. See below:
Step 9: Check for null values.
checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")
Complete Code Blocks for the Bad Math Recipe
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("Date", "ExchangeTimestamp", "Sym", "Last", "Size")
dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")
summary = dollarsInvolved.by(AggCombo(AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), AggLast("Last")), "Sym")
spx = emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = merge(summary, spx)
vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")
sum = vwap.view("VWAP").sumBy()
vwapIfThen = withSpx.updateView("VWAP =(TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")
sumIfThen = vwapIfThen.view("VWAP").sumBy()
checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")
# Initial import
from deephaven import *
# Bad Math
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
.view("Date", "ExchangeTimestamp", "Sym", "Last", "Size")
dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")
summary = dollarsInvolved.by(caf.AggCombo(caf.AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), caf.AggLast("Last")), "Sym")
spx = ttools.emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = ttools.merge(summary, spx)
vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")
sum = vwap.view("VWAP").sumBy()
vwapIfThen = withSpx.updateView("VWAP = (TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")
sumIfThen = vwapIfThen.view("VWAP").sumBy()
checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")
Custom Functionality
Deephaven enables users to write and implement custom functions for use with queries. In this recipe, a function is created to calculate rolling sums for the number of shares traded at each row.
Step 1: Pull trades from a certain date and remove some columns and rows.
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("Sym", "Size")
.headBy(100, "Sym")
Step 2: Before we write our function, group together each symbol's values.
symGroups = trades.by("Sym")
Ultimately, we're going to add an array to this table containing a rolling sum for each trade.
Step 3: Create the function.
Our function will be called rollingSum
and it will need a number of rows
and an array of values
.
rollingSum = { rows, values ->
// Create a new array that will store rolling sum calculations
calculations = new int[values.size()]
// Create a running sum
sum = 0
//Iterate through each value in the array
for (int i = 0; i < values.size(); ++i)
{
// Add the current value to the running sum
sum += values.get(i)
// Subtract the outdated value
if (i >= rows) sum -= values.get(i - rows)
// Store the rolling sum upon each iteration
calculations[i] = sum
}
// Return the array of rolling sums
return calculations
}
Step 4: Use the new function to add a new column.
Note: int[]
makes sure that the resulting array is full of integers.
sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")
Step 5: Distribute the arrays into their own rows.
ungroup = sums.ungroup()
As you can see, every row for every symbol has its own up-to-date rolling sum.
Complete Code Blocks for the Custom Functionality Recipe
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("Sym", "Size")
.headBy(100, "Sym")
symGroups = trades.by("Sym")
rollingSum = { rows, values ->
// Create a new array that will store rolling sum calculations
calculations = new int[values.size()]
// Create a running sum
sum = 0
//Iterate through each value in the array
for (int i = 0; i < values.size(); ++i)
{
// Add the current value to the running sum
sum += values.get(i)
// Subtract the outdated value
if (i >= rows) sum -= values.get(i - rows)
// Store the rolling sum upon each iteration
calculations[i] = sum
}
// Return the array of rolling sums
return calculations
}
sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")
ungroup = sums.ungroup()
# Initial import
from deephaven import *
# Custom Functionality
trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
.view("Sym", "Size")\
.headBy(100, "Sym")
symGroups = trades.by("Sym")
def rollingSum(rows, values):
calculations = jpy.array('int', values.size()) # create an array of integers
sum = 0 # our running sum
for i in range(values.size()):
sum += values.get(i)
if (i >= rows): sum -= values.get(i - rows) # subtract when needed
calculations[i] = sum # store a rolling sum for every value
return calculations # return an array of rolling sum calculations
sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")
ungroup = sums.ungroup()
Last Updated: 28 February 2020 12:20 -05:00 UTC Deephaven v.1.20200121 (See other versions)
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved