deephaven

Main Deephaven python module.

For convenient usage in the python console, the main sub-packages of deephaven have been imported here with aliases:

  • Calendars imported as cals

  • ComboAggregateFactory imported as caf

  • DBTimeUtils imported as dbtu

  • MovingAverages imported as mavg

  • npy imported as npy

  • Plot imported as plt

  • QueryScope imported as qs

  • TableManagementTools imported as tmt

  • TableTools imported as ttools (`tt` is frequently used for time table)

Additionally, the following methods have been imported into the main deephaven namespace:

  • from Plot import figure_wrapper as figw

  • from java_to_python import tableToDataFrame, columnToNumpyArray, convertJavaArray

  • from python_to_java import dataFrameToTable, createTableFromData

  • from conversion_utils import convertToJavaArray, convertToJavaList, convertToJavaArrayList,

    convertToJavaHashSet, convertToJavaHashMap

  • from ExportTools import JdbcLogger, TableLoggerBase

  • from ImportTools import CsvImport, DownsampleImport, JdbcHelpers, JdbcImport, JsonImport,

    MergeData, XmlImport

  • from InputTableTools import InputTable, TableInputHandler, LiveInputTableEditor

  • from TableManipulation import ColumnRenderersBuilder, DistinctFormatter,

    DownsampledWhereFilter, LayoutHintBuilder, PersistentQueryTableHelper, PivotWidgetBuilder, SmartKey, SortPair, TotalsTableBuilder, WindowCheck

For ease of namespace population in a python console, consider:

>>> from deephaven import *  # this will import the submodules into the main namespace
>>> print(dir())  # this will display the contents of the main namespace
>>> help(plt)  # will display the help entry (doc strings) for the illumon.iris.plot module
>>> help(columnToNumpyArray)  # will display the help entry for the columnToNumpyArray method
class DeephavenDb

DeephavenDb session

db()

Gets the Deephaven database object

execute(groovy)

Executes Deephaven groovy code from a snippet/string

Parameters

groovy – groovy code

executeFile(file)

Executes Deephaven groovy code from a file

Parameters

file – the file

get(variable)

Gets a variable from the groovy session

Parameters

variable – variable name

Returns

the value

getDf(variable)

Gets a Table as a pandas.DataFrame from the console session

Parameters

variable – the variable (table) name

Returns

pandas.DataFrame instance representing Table specified by variable

pushDf(name, df)

Pushes a pandas.DataFrame to the Deephaven groovy session as a Table

Parameters
  • name – the destination variable name in the groovy session

  • df – pandas.DataFrame instance

reconnect()

Disconnects/shuts down the current session, and establishes a new session

Warning

The current Deephaven state will be lost

class PersistentQueryControllerClient(log=None, logLevel='INFO')

A client for the persistent query controller

classmethod getControllerClient(log=None, logLevel='INFO')

Creates a connection to the controller client.

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

Parameters
  • log – None or the desired Java logger object

  • logLevel – if log is None, the desired log level for the java logger constructed for this client

Returns

persistent query controller client

getPersistentQueryConfiguration(log=None, logLevel='INFO', configSerial=None, owner=None, name=None)

Gets the configuration for a persistent query.

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

  • configSerial takes precedence over owner & name, but one valid choice must be provided.

Parameters
  • log – None or the desired Java logger object

  • logLevel – if log is None, the desired log level for the java logger constructed for this client

  • configSerial – the serial number of the persistent query

  • owner – the owner of the persistent query

  • name – the name of the persistent query

Returns

the PersistentQueryConfiguration

publishTemporaryQueries(*args, **kwargs)

Publishes one or many configurations as temporary persistent queries

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

Parameters
  • args – the configurations collection

  • kwargs – can only contain the keyword arguments ‘log’ or ‘logLevel’. log – if present, the desired Java logger object. logLevel – the desired log level for the constructed java logger; if log is None or not present, the default is ‘INFO’.

PythonFunction(func, classString)

Constructs a Java Function<PyObject, Object> implementation from the given python function func. The proper Java object interpretation for the return of func must be provided.

Parameters
  • func – Python callable or class instance with apply method (single argument)

  • classString – the fully qualified class path of the return for func. This is expected to be one of java.lang.String, double, float, long, int, short, byte, or boolean; any other value will result in java.lang.Object and likely be unusable.

Returns

com.illumon.integrations.python.PythonFunction instance, primarily intended for use in PivotWidgetBuilder usage
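Either shape of func is acceptable: a plain single-argument callable, or a class instance exposing a single-argument apply method. A minimal sketch of both (pure Python; the names and bodies below are illustrative, and the wrapping into a Java Function is done by PythonFunction itself):

```python
# Form 1: a plain single-argument callable.
def as_function(value):
    # Interpret the incoming value and return something matching
    # classString (here, a double).
    return float(value) * 2.0

# Form 2: a class instance with a single-argument `apply` method.
class AsApply:
    def apply(self, value):
        return float(value) * 2.0

# PythonFunction(as_function, 'double') or PythonFunction(AsApply(), 'double')
# would wrap these; both shapes behave identically.
print(as_function(21))      # 42.0
print(AsApply().apply(21))  # 42.0
```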

PythonListenerAdapter(dynamicTable, implementation, description=None, retain=True, replayInitialImage=False)

Constructs the InstrumentedListenerAdapter, implemented in Python, and plugs it into the table’s listenForUpdates method.

Parameters
  • dynamicTable – table to which to listen - NOTE: it will be cast to DynamicTable.

  • implementation – the body of the implementation for the InstrumentedListenerAdapter.onUpdate method, and must either be a class with onUpdate method or a callable.

  • description – A description for the UpdatePerformanceTracker to append to its entry description.

  • retain – Whether a hard reference to this listener should be maintained to prevent it from being collected.

  • replayInitialImage – False to only process new rows, ignoring any previously existing rows in the Table; True to process updates for all initial rows in the table PLUS all new row changes.

PythonShiftAwareListenerAdapter(dynamicTable, implementation, description=None, retain=True)

Constructs the InstrumentedShiftAwareListenerAdapter, implemented in Python, and plugs it into the table’s listenForUpdates method.

Parameters
  • dynamicTable – table to which to listen - NOTE: it will be cast to DynamicTable.

  • implementation – the body of the implementation for the InstrumentedShiftAwareListenerAdapter.onUpdate method, and must either be a class with onUpdate method or a callable.

  • description – A description for the UpdatePerformanceTracker to append to its entry description.

  • retain – Whether a hard reference to this listener should be maintained to prevent it from being collected.

class TableListenerHandle(t, listener)

A handle for a table listener.

deregister()

Deregister the listener from the table and stop listening for updates.

register()

Register the listener with the table and listen for updates.

columnToNumpyArray(table, columnName, convertNulls=0, forPandas=False)

Produce a copy of the specified column as a numpy.ndarray.

Parameters
  • table – the Table object

  • columnName – the name of the desired column

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values. Can be specified by string value (i.e. 'ERROR'), enum member (i.e. NULL_CONVERSION.PASS), or integer value (i.e. 2)

  • forPandas – boolean for whether the output will be fed into a pandas.Series (i.e. must be 1-dimensional)

Returns

numpy.ndarray object which reproduces the given column of table (as faithfully as possible)

Note that the entire column is going to be cloned into memory, so the total number of entries in the column should be considered before blindly doing this. For large tables (millions of entries or more), consider measures such as down-selecting rows using the Deephaven query language before converting.

Warning

The table will be frozen prior to conversion. A table which updates mid-conversion would lead to errors or other undesirable behavior.

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Type mapping will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray
    • if forPandas=False and all entries are of compatible shape, then a rectangular numpy.ndarray is returned, with dtype in keeping with the above

    • if forPandas=True or the entries are not of compatible shape, then a one-dimensional numpy.ndarray of dtype numpy.object is returned, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.
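The NULL_CONVERSION.CONVERT widening described above can be mimicked in plain numpy, with no Deephaven required. NULL_INT below stands in for the Java-side NULL constant (assumed here to be Integer.MIN_VALUE); the actual constant lives in the Java layer:

```python
import numpy as np

# Stand-in for Deephaven's Java-side NULL_INT constant (an assumption here).
NULL_INT = -2147483648

# An int column containing a null: under NULL_CONVERSION.CONVERT the result
# is widened to a floating-point dtype and nulls become NaN.
raw = np.array([1, NULL_INT, 3], dtype=np.int32)
converted = raw.astype(np.float64)
converted[raw == NULL_INT] = np.nan
print(converted.dtype)         # float64
print(np.isnan(converted[1]))  # True

# A Boolean column containing a null: the result has object dtype and
# nulls become None.
bools = np.array([True, None, False], dtype=object)
print(bools[1] is None)  # True
```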

convertJavaArray(javaArray, convertNulls='ERROR', forPandas=False)

Converts a java array to its closest numpy.ndarray alternative.

Parameters
  • javaArray – input java array or dbarray object

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values. Can be specified by string value (i.e. 'ERROR'), enum member (i.e. NULL_CONVERSION.PASS), or integer value (i.e. 2)

  • forPandas – boolean indicating whether output will be fed into a pandas.Series, which requires that the underlying data is one-dimensional

Returns

numpy.ndarray representing as faithful a copy of the java array as possible

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Type mapping will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray
    • if forPandas=False and all entries are of compatible shape, then a rectangular numpy.ndarray is returned, with dtype in keeping with the above

    • if forPandas=True or the entries are not of compatible shape, then a one-dimensional numpy.ndarray of dtype numpy.object is returned, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.

convertToJavaArray(input, boxed=False)

Convert the input list/tuple or numpy array to a java array. The main utility of the method is likely providing an object required for a specific java function signature. This is a unifying helper method built on top of makeJavaArray(), ultimately defined via the proper jpy.array call.

A string will be treated as a character array (i.e. java char array). The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first.

numpy.ndarray Type Conversion:

  • basic numpy primitive dtypes are converted to their java analogs; NaN values in floating point arrays are converted to their respective Deephaven NULL constant values.

  • dtypes datetime64[*] are converted to DBDateTime

  • dtype of one of the basic string types (unicode*, str*, bytes*):
    • if all elements are one character long: converted to char array

    • otherwise, String array

  • ndarrays of dtype numpy.object:

    • ndarrays which are empty, or whose elements are all null, are converted to java type Object.

    • Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - converted to Boolean array with null values preserved

    • str - converted to String array with null values as empty string

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - converted to java array. All elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to naively convert to String array

      • otherwise, raise exception

  • ndarrays of any other dtype (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception

Parameters
  • input – string, tuple/list or numpy.ndarray - java type will be inferred

  • boxed – whether to ensure that the constructed array is of boxed type

Returns

a java array instance
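Since list/tuple input is first converted to a numpy.ndarray, the dtype numpy infers drives the java element type. A quick look at what numpy infers for common inputs (the java side is not shown; the java-type comments reflect the mapping described above):

```python
import numpy as np

# numpy's inferred dtype drives the java element type chosen:
print(np.asarray([1, 2, 3]).dtype)    # platform integer dtype (e.g. int64) -> java long
print(np.asarray([1.0, 2.5]).dtype)   # float64 -> java double
print(np.asarray(['x', 'yz']).dtype)  # <U2: multi-character strings -> java String array
print(np.asarray(['a', 'b']).dtype)   # <U1: all one character long -> java char array

# object-dtype input: the first non-null element determines the java type.
mixed = np.asarray([True, None], dtype=object)
print(mixed.dtype)  # object
```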

convertToJavaArrayList(input)

Convert the input list/tuple or numpy array to a java ArrayList. The main utility of the method is likely providing an object required for a specific java function signature.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.ArrayList

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()

convertToJavaHashMap(input1, input2=None)

Create a java hashmap from provided input of the form (input1=keys, input2=values) or (input1=dict).

Parameters
  • input1 – dict or tuple/list/numpy.ndarray, assumed to be keys

  • input2 – ignored if input1 is a dict, otherwise tuple/list/numpy.ndarray, assumed to be values

Returns

best java.util.HashMap

Note

The user has most control over type in providing keys and values as numpy.ndarray. Otherwise, the keys and values will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()
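The two calling conventions, a single dict versus parallel key/value sequences, can be pictured in plain Python. The helper below is illustrative only (pair_inputs is not a real Deephaven function); the real method produces a java.util.HashMap:

```python
def pair_inputs(input1, input2=None):
    # Mirrors the documented convention: a dict passes through (input2
    # ignored); otherwise input1 supplies keys and input2 values.
    if isinstance(input1, dict):
        return dict(input1)
    if input2 is None:
        raise ValueError("input2 (values) required when input1 is not a dict")
    return dict(zip(input1, input2))

print(pair_inputs({'a': 1}))            # {'a': 1}
print(pair_inputs(['a', 'b'], [1, 2]))  # {'a': 1, 'b': 2}
```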

convertToJavaHashSet(input)

Convert the input list/tuple or numpy array to a java HashSet. The main utility of the method is likely quick inclusion checks inside a Deephaven query.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.HashSet

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()

convertToJavaList(input)

Convert the input list/tuple or numpy array to a (fixed size) java List. The main utility of the method is likely providing an object required for a specific java function signature.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.List

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in makeJavaArray()

createTableFromData(data, columns=None, convertUnknownToString=False)

Create a deephaven table object from a collection of column data

Parameters
  • data – a dict of the form {column_name: column_data} or list of the form [column0_data, column1_data, …]

  • columns – a list of column names to use

  • convertUnknownToString – option for whether to attempt to convert unknown elements to a column of string type

Returns

the deephaven table

If data is a dict and columns is given, then only data corresponding to the names in columns is used. If data is a list of column data and columns is not given, then column names will be provided as col_0, col_1, …

Type Conversion:

  • Columns which are an instance of tuple or list are first naively converted to numpy.ndarray

  • Columns of basic primitive type are converted to their java analogs; NaN values in floating point columns are converted to their respective Deephaven NULL constant values.

  • Columns of underlying type datetime64[*] are converted to DBDateTime

  • Columns of one of the basic string type (unicode*, str*, bytes*) are converted to String

  • Columns of type numpy.object - arrays which are empty, or whose elements are all null, are converted to java type Object. Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - the array is converted to Boolean with null values preserved

    • str - the array is converted to a column of String type

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - all elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

      • if one-dimensional, then column of appropriate DbArray type

      • otherwise, column of java array type

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to convert to column of string type

      • otherwise, raise exception

  • Columns of any other type (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception
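The two accepted data shapes, a dict of named columns or a list of columns with names supplied separately, can be sketched as follows. Only the input preparation is shown and the column names are hypothetical; the createTableFromData call itself needs a running Deephaven session:

```python
import numpy as np

# Dict form: column names come from the keys (filtered by `columns` if given).
data_dict = {
    'Sym': ['AAPL', 'MSFT'],
    'Price': np.array([189.5, 410.25]),  # float64 -> java double column
}

# List form: names come from `columns`; if `columns` is omitted, the
# defaults col_0, col_1, ... are used.
data_list = [['AAPL', 'MSFT'], [189.5, 410.25]]
default_names = ['col_{}'.format(i) for i in range(len(data_list))]
print(default_names)  # ['col_0', 'col_1']

# In a live session, either of these would build the same table:
#   createTableFromData(data_dict)
#   createTableFromData(data_list, columns=['Sym', 'Price'])
```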

dataFrameToTable(dataframe, convertUnknownToString=False)

Converts the provided pandas.DataFrame object to a deephaven table object.

Parameters
  • dataframe – pandas.DataFrame object

  • convertUnknownToString – option for whether to attempt to convert unknown elements to a column of string type

Returns

Table object, which represents dataframe as faithfully as possible

Type Conversion:

  • Columns of basic primitive type are converted to their java analogs; NaN values in floating point columns are converted to their respective Deephaven NULL constant values.

  • Columns of underlying type datetime64[*] are converted to DBDateTime

  • Columns of one of the basic string type (unicode*, str*, bytes*) are converted to String

  • Columns of type numpy.object - arrays which are empty, or whose elements are all null, are converted to java type Object. Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - the array is converted to Boolean with null values preserved

    • str - the array is converted to a column of String type

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - all elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

      • if one-dimensional, then column of appropriate DbArray type

      • otherwise, column of java array type

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to convert to column of string type

      • otherwise, raise exception

  • Columns of any other type (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception

doLocked(f, lock_type='exclusive')

Executes a function while holding the LiveTableMonitor (LTM) lock. Holding the LTM lock ensures that the contents of a table will not change during a computation, but holding the lock also prevents table updates from happening. The lock should be held for as little time as possible.

Parameters
  • f – function to execute while holding the LTM lock. f must be callable or have an apply attribute which is callable.

  • lock_type – LTM lock type. Valid values are “exclusive” and “shared”. “exclusive” allows only a single reader or writer to hold the lock. “shared” allows multiple readers or a single writer to hold the lock.

listen(t, listener, description=None, retain=True, ltype='auto', start_listening=True, replay_initial=False, lock_type='exclusive')

Listen to table changes.

Parameters
  • t – dynamic table to listen to.

  • listener – listener to process changes. This can either be a function or an object with an “onUpdate” method.

  • description – description for the UpdatePerformanceTracker to append to the listener’s entry description.

  • retain – whether a hard reference to this listener should be maintained to prevent it from being collected.

  • ltype – listener type. Valid values are None, “auto”, “legacy”, and “shift_aware”. None and “auto” (default) use inspection to automatically determine the type of input listener. “legacy” is for a legacy PythonListenerAdapter, which takes three arguments (added, removed, modified). “shift_aware” is for a PythonShiftAwareListenerAdapter, which takes one argument (update).

  • start_listening – True to create the listener and register the listener with the table. The listener will see updates. False to create the listener, but do not register the listener with the table. The listener will not see updates.

  • replay_initial – True to replay the initial table contents to the listener. False to only listen to new table changes.

  • lock_type – LTM lock type. Used when replay_initial=True. See doLocked() for valid values.

Returns

table listener handle.
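The two listener shapes correspond to the two adapter types: a legacy listener takes three arguments (added, removed, modified), while a shift-aware listener takes a single update argument. Plain Python sketches of both (the bodies and return values are hypothetical; listen() performs the inspection and wiring):

```python
# Legacy shape: a callable taking three index-like arguments.
def legacy_listener(added, removed, modified):
    # A real listener would react to the changed row sets here.
    return ('legacy', added, removed, modified)

# Shift-aware shape: an object with an onUpdate method taking one
# update argument (a plain one-argument callable also works).
class ShiftAwareListener:
    def onUpdate(self, update):
        return ('shift_aware', update)

# In a live session, either would be registered like:
#   listen(t, legacy_listener, ltype='legacy')
#   listen(t, ShiftAwareListener(), ltype='shift_aware')
# With ltype='auto' (the default), the argument count is inspected to choose.
print(legacy_listener(1, 2, 3))
print(ShiftAwareListener().onUpdate('u'))
```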

tableToDataFrame(table, convertNulls=0, categoricals=None)

Produces a copy of a table object as a pandas.DataFrame.

Parameters
  • table – the Table object

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values.

  • categoricals – None, a column name, or a list of column names to convert to a ‘categorical’ data series

Returns

pandas.DataFrame object which reproduces table as faithfully as possible

Note that the entire table is going to be cloned into memory, so the total number of entries in the table should be considered before blindly doing this. For large tables (millions of entries or more), consider measures such as dropping unnecessary columns and/or down-selecting rows using the Deephaven query language before converting.

Warning

The table will be frozen prior to conversion. A table which updates mid-conversion would lead to errors or other undesirable behavior.

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Conversion for different data types will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray – since the output feeds a pandas.DataFrame (whose columns must be one-dimensional), each such column is returned as a one-dimensional numpy.ndarray with dtype numpy.object, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.
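The fixed-width padding and automatic trimming described above are easy to see directly in numpy, without any Deephaven involvement:

```python
import numpy as np

# Every entry is stored at the width of the longest string; shorter
# entries are padded with null characters, which numpy trims on fetch.
arr = np.array(['a', 'abc'])
print(arr.dtype)     # <U3: three 32-bit characters per entry
print(arr.itemsize)  # 12 bytes per entry (3 characters x 4 bytes each)
print(arr[0])        # 'a' (trailing null characters trimmed automatically)
```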