More To Know

Table Tricks

In addition to the code snippets included below, all of the Groovy and Python code for the recipes below can be downloaded as premade Deephaven Notebooks. To learn more about using Deephaven Notebooks, please refer to the Notebook section of the User Manual.

Download the Query Cookbook Notebooks

This section demonstrates a variety of methods related to creating, manipulating and gathering information about tables.

Step 1: Pull trades from a certain date and remove some columns and rows.

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
     .view("Date", "ExchangeTimestamp", "Sym", "Last")
     .head(10)

Step 2:  Create a custom row.

The following query creates an empty table of size 1 (i.e., a table with one empty row) and populates it with the same columns and column value types as those found in the  trades table.

time = convertDateTime("2017-08-25T04:05:00 NY")
custom = emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")

Step 3:  Add the custom row to the bottom of the trades table.

tradesNewRow = merge(trades, custom)

Step 4: Show the invisible index column.

Deephaven tables have an invisible column, i, which holds the position for each row. Let's make it visible.

indexes = tradesNewRow.updateView("Index=i")

Step 5: Utilize a column as an array of values by placing an underscore after column name.

The following query will create a new column showing the previous price for every trade:

prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")

Last_ allows you to access column values as if it's an array. The brackets allow access to a column value of your choosing. i - 1 is the previous index; remember that i represents current row position. Note that the first trade has no previous price because there are no previous trades before that time.

Step 6: Using the new column, calculate the difference between current and previous price.

difference = prevPrice.updateView("Difference=Last-PreviousPrice")

Step 7:  Show the table's metadata.

When integrating custom code into Deephaven queries, it may be useful to look at the table's metadata, which contains information about the table and the data it holds.

meta = difference.getMeta()

Step 8:  Finally, access a value from a single cell in the table.

For this example, the value in the 6th row (index value 5) from the Difference column will be shown in the Log panel.

cellValue = difference.getColumn("Difference").get(5)
println cellValue

Complete Code for the Table Tricks Recipe

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
     .view("Date", "ExchangeTimestamp", "Sym", "Last")
     .head(10)

time = convertDateTime("2017-08-25T04:05:00 NY")
custom = emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")

tradesNewRow = merge(trades, custom)

indexes = tradesNewRow.updateView("Index=i")

prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")

difference = prevPrice.updateView("Difference=Last-PreviousPrice")

meta = difference.getMeta()

cellValue = difference.getColumn("Difference").get(5)
println cellValue
# Initial import

from deephaven import *

print("Provides:\n"
      "\tdeephaven.Calendars as cals\n"
      "\tdeephaven.ComboAggregateFactory as caf\n"
      "\tdeephaven.DBTimeUtils as dbtu\n"
      "\tdeephaven.Plot as plt\n"
      "\tdeephaven.Plot.figure_wrapper as figw\n"
      "\tdeephaven.QueryScope as qs\n"
      "\tdeephaven.TableManagementTools as tmt\n"
      "\tdeephaven.TableTools as ttools\n"
      "See print(sorted(dir())) for the full namespace contents.")


# Table Tricks

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
    .view("Date", "ExchangeTimestamp", "Sym", "Last")\
    .head(10)

time = dbtu.convertDateTime("2017-08-25T04:05:00 NY")
custom = ttools.emptyTable(1).updateView("Date=`2017-08-25`", "ExchangeTimestamp=time", "Sym=`AAPL`", "Last=160.0")

tradesNewRow = ttools.merge(trades, custom)

indexes = tradesNewRow.updateView("Index=i")

prevPrice = indexes.updateView("PreviousPrice=Last_[i-1]")

difference = prevPrice.updateView("Difference=Last-PreviousPrice")

meta = difference.getMeta()

cellValue = difference.getColumn("Difference").get(5)
print (cellValue)

Bad Math

Be careful of invalid math operations like dividing by zero.

Step 1: Pull trades from a certain date and remove some columns.

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
.view("ExchangeTimestamp", "Sym", "Last", "Size")

Step 2:  Calculate the dollars involved in each trade.

dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")

Step 3:  Calculate the total dollars involved, total shares traded, and the last price for each symbol.

summary = dollarsInvolved.by(AggCombo(AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), AggLast("Last")), "Sym")

In some cases, some rows may have a zero value for the size. While this dataset does not, we can add such a row to demonstrate challenges when working with zero values.

Step 4: Create a table with one row, with the same column names, to represent the SPX index, including a zero value for size. Then merge the new table to previous table.

spx = emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = merge(summary, spx)

Step 5:  Calculate the volume weighted average price (VWAP).

vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")

As shown, an empty cell was added to the SPX row. This is because there was division by a zero value. It is important to note that even though cells may appear empty, they always contain either a null or NaN value. You always want to avoid NaN values. The empty cell inthe vwap table contains a NaN value.

Step 6:  To demonstrate, perform a sum aggregation on VWAP.

sum = vwap.view("VWAP").sumBy()

Eleven cells have been aggregated, but because the empty cell contains a NaN value, the other ten values were discarded. This would not have happened had the cell contained a null value. Null values will not impact aggregation calculations. However, because NaN values are the result of invalid math operations, they will poison your calculations.

Step 7:  Calculate VWAP again, except this time take care when dividing by zero.

This time, we are using a ternary operator to ensure a zero value is not used as the denominator in the calculation. In essence, the if/then logic of the ternary operator in this query means: "if TotalSize is not zero, then VWAP will be TotalDollars divided by TotalSize; otherwise, VWAP will be null".

vwapIfThen = withSpx.updateView("VWAP =(TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")

An empty cell remains present in the table. However, we used the ternary operator to ensure the value was null and not NaN.

Step 8: Rerun the sum aggregation on VWAP.

sumIfThen = vwapIfThen.view("VWAP").sumBy()

This time the aggregation succeeded as intended. Note: You can also determine if your empty cells contain null values or not by using isNull. See below:

Step 9: Check for null values.

checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")

Complete Code Blocks for the Bad Math Recipe

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
     .view("Date", "ExchangeTimestamp", "Sym", "Last", "Size")

dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")

summary = dollarsInvolved.by(AggCombo(AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), AggLast("Last")), "Sym")

spx = emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = merge(summary, spx)

vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")

sum = vwap.view("VWAP").sumBy()

vwapIfThen = withSpx.updateView("VWAP =(TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")

sumIfThen = vwapIfThen.view("VWAP").sumBy()

checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")
# Initial import

from deephaven import *

# Bad Math

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
    .view("Date", "ExchangeTimestamp", "Sym", "Last", "Size")

dollarsInvolved = trades.updateView("DollarsInvolved=Last*Size")

summary = dollarsInvolved.by(caf.AggCombo(caf.AggSum("TotalSize=Size", "TotalDollars=DollarsInvolved"), caf.AggLast("Last")), "Sym")

spx = ttools.emptyTable(1).updateView("Sym=`SPX`", "TotalSize=(long)0", "TotalDollars=0.0", "Last=2465.84")
withSpx = ttools.merge(summary, spx)

vwap = withSpx.updateView("VWAP=TotalDollars/TotalSize")

sum = vwap.view("VWAP").sumBy()

vwapIfThen = withSpx.updateView("VWAP = (TotalSize != 0) ? (TotalDollars/TotalSize) : NULL_DOUBLE")

sumIfThen = vwapIfThen.view("VWAP").sumBy()

checkNull = vwapIfThen.updateView("NullOrNot=isNull(VWAP)")

Custom Functionality

Deephaven enables users to write and implement custom functions for use with queries. In this recipe, a function is created to calculate rolling sums for the number of shares traded at each row.

Step 1:  Pull trades from a certain date and remove some columns and rows.

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
     .view("Sym", "Size")
     .headBy(100, "Sym")

Step 2:  Before we write our function, group together each symbol's values.

symGroups = trades.by("Sym")

Ultimately, we're going to add an array to this table containing a rolling sum for each trade.

Step 3:  Create the function.

Our function will be called rollingSum and it will need a number of rows and an array of values.

rollingSum = { rows, values ->
     // Create a new array that will store rolling sum calculations
     calculations = new int[values.size()]

     // Create a running sum
     sum = 0

     //Iterate through each value in the array
     for (int i = 0; i < values.size(); ++i)
     {
          // Add the current value to the running sum
          sum += values.get(i)

          // Subtract the outdated value
          if (i >= rows) sum -= values.get(i - rows)

          // Store the rolling sum upon each iteration
          calculations[i] = sum
     }

     // Return the array of rolling sums
     return calculations
}

Step 4:  Use the new function to add a new column.

Note:  int[] makes sure that the resulting array is full of integers.

sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")

Step 5:  Distribute the arrays into their own rows.

ungroup = sums.ungroup()

As you can see, every row for every symbol has its own up-to-date rolling sum.

Complete Code Blocks for the Custom Functionality Recipe

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")
     .view("Sym", "Size")
     .headBy(100, "Sym")
symGroups = trades.by("Sym")

rollingSum = { rows, values ->
     // Create a new array that will store rolling sum calculations
     calculations = new int[values.size()]

     // Create a running sum
     sum = 0

     //Iterate through each value in the array
     for (int i = 0; i < values.size(); ++i)
     {
          // Add the current value to the running sum
          sum += values.get(i)

          // Subtract the outdated value
          if (i >= rows) sum -= values.get(i - rows)

          // Store the rolling sum upon each iteration
          calculations[i] = sum
     }

     // Return the array of rolling sums
     return calculations
}

sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")

ungroup = sums.ungroup()
# Initial import

from deephaven import *

# Custom Functionality

trades = db.t("LearnDeephaven", "StockTrades").where("Date=`2017-08-25`")\
    .view("Sym", "Size")\
    .headBy(100, "Sym")

symGroups = trades.by("Sym")


def rollingSum(rows, values):
    calculations = jpy.array('int', values.size())  # create an array of integers
    sum = 0  # our running sum
    for i in range(values.size()):
        sum += values.get(i)
        if (i >= rows): sum -= values.get(i - rows)  # subtract when needed
        calculations[i] = sum  # store a rolling sum for every value

    return calculations  # return an array of rolling sum calculations

sums = symGroups.updateView("RollingSum=(int[]) rollingSum.call(20, Size)")
ungroup = sums.ungroup()


Last Updated: 28 February 2020 12:20 -05:00 UTC    Deephaven v.1.20200121  (See other versions)

Deephaven Documentation     Copyright 2016-2020  Deephaven Data Labs, LLC     All Rights Reserved