deephaven

Main Deephaven python module.

For convenient usage in the python console, the main sub-packages of deephaven have been imported here with aliases:

  • Calendars imported as cals

  • ComboAggregateFactory imported as caf

  • DBTimeUtils imported as dbtu

  • MovingAverages imported as mavg

  • npy imported as npy

  • Plot imported as plt

  • QueryScope imported as qs

  • TableManagementTools imported as tmt

  • TableTools imported as ttools (`tt` is frequently used for time table)

Additionally, the following methods have been imported into the main deephaven namespace:

  • from Plot import figure_wrapper as figw

  • from java_to_python import tableToDataFrame, columnToNumpyArray, convertJavaArray

  • from python_to_java import dataFrameToTable, createTableFromData

  • from conversion_utils import convertToJavaArray, convertToJavaList, convertToJavaArrayList,

    convertToJavaHashSet, convertToJavaHashMap

  • from ExportTools import JdbcLogger, TableLoggerBase

  • from ImportTools import CsvImport, DownsampleImport, JdbcHelpers, JdbcImport, JsonImport,

    MergeData, XmlImport

  • from InputTableTools import InputTable, TableInputHandler, LiveInputTableEditor

  • from TableManipulation import ColumnRenderersBuilder, DistinctFormatter,

    DownsampledWhereFilter, LayoutHintBuilder, PersistentQueryTableHelper, PivotWidgetBuilder, SmartKey, SortPair, TotalsTableBuilder, WindowCheck

For ease of namespace population in a python console, consider:

>>> from deephaven import *  # this will import the submodules into the main namespace
>>> print(dir())  # this will display the contents of the main namespace
>>> help(plt)  # will display the help entry (doc strings) for the illumon.iris.plot module
>>> help(columnToNumpyArray)  # will display the help entry for the columnToNumpyArray method
class DeephavenDb

DeephavenDb session

db()

Gets the Deephaven database object

execute(groovy)

Executes Deephaven groovy code from a snippet/string

Parameters

groovy – groovy code

executeFile(file)

Executes Deephaven groovy code from a file

Parameters

file – the file

get(variable)

Gets a variable from the groovy session

Parameters

variable – variable name

Returns

the value

getDf(variable)

Gets a Table as a pandas.DataFrame from the console session

Parameters

variable – the variable (table) name

Returns

pandas.DataFrame instance representing Table specified by variable

pushDf(name, df)

Pushes a pandas.DataFrame to the Deephaven groovy session as a Table

Parameters
  • name – the destination variable name in the groovy session

  • df – pandas.DataFrame instance

reconnect()

Disconnects/shuts down the current session, and establishes a new session

Warning

The current Deephaven state will be lost

class PersistentQueryControllerClient(log=None, logLevel='INFO')

A client for the persistent query controller

classmethod getControllerClient(log=None, logLevel='INFO')

Creates a connection to the controller client.

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

Parameters
  • log – None or the desired Java logger object

  • logLevel – if log is None, the desired log level for the java logger constructed for this client

Returns

persistent query controller client

getPersistentQueryConfiguration(log=None, logLevel='INFO', configSerial=None, owner=None, name=None)

Gets the configuration for a persistent query.

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

  • configSerial takes precedence over owner & name, but one valid choice must be provided.

Parameters
  • log – None or the desired Java logger object

  • logLevel – if log is None, the desired log level for the java logger constructed for this client

  • configSerial – the serial number of the persistent query

  • owner – the owner of the persistent query

  • name – the name of the persistent query

Returns

the PersistentQueryConfiguration

publishTemporaryQueries(*args, **kwargs)

Publishes one or many configurations as temporary persistent queries

Note:

  • log takes precedence over logLevel; if neither is provided, the default is logLevel='INFO'.

Parameters
  • args – the configurations collection

  • kwargs – can only contain the keyword arguments ‘log’ or ‘logLevel’. log – if present, the desired Java logger object. logLevel – the desired log level for the constructed java logger; if log is None or not present, the default is ‘INFO’.

PythonFunction(func, classString)

Constructs a Java Function<PyObject, Object> implementation from the given python function func. The proper Java object interpretation for the return of func must be provided.

Parameters
  • func – Python callable or class instance with apply method (single argument)

  • classString – the fully qualified class path of the return for func. This is expected to be one of java.lang.String, double, float, long, int, short, byte, or boolean; any other value will result in java.lang.Object and likely be unusable.

Returns

com.illumon.integrations.python.PythonFunction instance, primarily intended for use in PivotWidgetBuilder usage
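Either shape of func is acceptable: a plain single-argument callable, or a class instance exposing a single-argument apply method. A minimal sketch of both (pure Python; the names and bodies below are illustrative, and the wrapping into a Java Function is done by PythonFunction itself):

```python
# Form 1: a plain single-argument callable.
def as_function(value):
    # Interpret the incoming value and return something matching
    # classString (here, a double).
    return float(value) * 2.0

# Form 2: a class instance with a single-argument `apply` method.
class AsApply:
    def apply(self, value):
        return float(value) * 2.0

# PythonFunction(as_function, 'double') or PythonFunction(AsApply(), 'double')
# would wrap these; both shapes behave identically.
print(as_function(21))      # 42.0
print(AsApply().apply(21))  # 42.0
```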

PythonListenerAdapter(dynamicTable, implementation, description=None, retain=True, replayInitialImage=False)

Constructs the InstrumentedListenerAdapter, implemented in Python, and plugs it into the table’s listenForUpdates method.

Parameters
  • dynamicTable – table to which to listen - NOTE: it will be cast to DynamicTable.

  • implementation – the body of the implementation for the InstrumentedListenerAdapter.onUpdate method, and must either be a class with onUpdate method or a callable.

  • description – A description for the UpdatePerformanceTracker to append to its entry description.

  • retain – Whether a hard reference to this listener should be maintained to prevent it from being collected.

  • replayInitialImage – False to only process new rows, ignoring any previously existing rows in the Table; True to process updates for all initial rows in the table PLUS all new row changes.

PythonShiftAwareListenerAdapter(dynamicTable, implementation, description=None, retain=True)

Constructs the InstrumentedShiftAwareListenerAdapter, implemented in Python, and plugs it into the table’s listenForUpdates method.

Parameters
  • dynamicTable – table to which to listen - NOTE: it will be cast to DynamicTable.

  • implementation – the body of the implementation for the InstrumentedShiftAwareListenerAdapter.onUpdate method, and must either be a class with onUpdate method or a callable.

  • description – A description for the UpdatePerformanceTracker to append to its entry description.

  • retain – Whether a hard reference to this listener should be maintained to prevent it from being collected.

class TableListenerHandle(t, listener)

A handle for a table listener.

deregister()

Deregister the listener from the table and stop listening for updates.

register()

Register the listener with the table and listen for updates.

columnToNumpyArray(table, columnName, convertNulls=0, forPandas=False)

Produce a copy of the specified column as a numpy.ndarray.

Parameters
  • table – the Table object

  • columnName – the name of the desired column

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values. Can be specified by string value (i.e. 'ERROR'), enum member (i.e. NULL_CONVERSION.PASS), or integer value (i.e. 2)

  • forPandas – boolean for whether the output will be fed into a pandas.Series (i.e. must be 1-dimensional)

Returns

numpy.ndarray object which reproduces the given column of table (as faithfully as possible)

Note that the entire column is going to be cloned into memory, so the total number of entries in the column should be considered before blindly doing this. For large tables (millions of entries or more), consider measures such as down-selecting rows using the Deephaven query language before converting.

Warning

The table will be frozen prior to conversion. A table which updates mid-conversion would lead to errors or other undesirable behavior.

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Type mapping will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray
    • if forPandas=False and all entries are of compatible shape, then a rectangular numpy.ndarray is returned, with dtype in keeping with the above

    • if forPandas=True or the entries are not of compatible shape, then a one-dimensional numpy.ndarray of dtype numpy.object is returned, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.
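The NULL_CONVERSION.CONVERT widening described above can be mimicked in plain numpy, with no Deephaven required. NULL_INT below stands in for the Java-side NULL constant (assumed here to be Integer.MIN_VALUE); the actual constant lives in the Java layer:

```python
import numpy as np

# Stand-in for Deephaven's Java-side NULL_INT constant (an assumption here).
NULL_INT = -2147483648

# An int column containing a null: under NULL_CONVERSION.CONVERT the result
# is widened to a floating-point dtype and nulls become NaN.
raw = np.array([1, NULL_INT, 3], dtype=np.int32)
converted = raw.astype(np.float64)
converted[raw == NULL_INT] = np.nan
print(converted.dtype)         # float64
print(np.isnan(converted[1]))  # True

# A Boolean column containing a null: the result has object dtype and
# nulls become None.
bools = np.array([True, None, False], dtype=object)
print(bools[1] is None)  # True
```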

convertJavaArray(javaArray, convertNulls='ERROR', forPandas=False)

Converts a java array to its closest numpy.ndarray alternative.

Parameters
  • javaArray – input java array or dbarray object

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values. Can be specified by string value (i.e. 'ERROR'), enum member (i.e. NULL_CONVERSION.PASS), or integer value (i.e. 2)

  • forPandas – boolean indicating whether output will be fed into a pandas.Series, which requires that the underlying data is one-dimensional

Returns

numpy.ndarray representing as faithful a copy of the java array as possible

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Type mapping will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray
    • if forPandas=False and all entries are of compatible shape, then a rectangular numpy.ndarray is returned, with dtype in keeping with the above

    • if forPandas=True or the entries are not of compatible shape, then a one-dimensional numpy.ndarray of dtype numpy.object is returned, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.

convertToJavaArray(input, boxed=False)

Convert the input list/tuple or numpy array to a java array. The main utility of the method is likely providing an object required for a specific java function signature. This is a unifying helper method built on top of makeJavaArray(), ultimately defined via the proper jpy.array call.

A string will be treated as a character array (i.e. java char array). The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first.

numpy.ndarray Type Conversion:

  • basic numpy primitive dtypes are converted to their java analogs; NaN values in floating point arrays are converted to their respective Deephaven NULL constant values.

  • dtypes datetime64[*] are converted to DBDateTime

  • dtype of one of the basic string types (unicode*, str*, bytes*):
    • if all elements are one character long: converted to char array

    • otherwise, String array

  • ndarrays of dtype numpy.object:

    • ndarrays which are empty, or whose elements are all null, are converted to java type Object.

    • Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - converted to Boolean array with null values preserved

    • str - converted to String array with null values as empty string

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - converted to java array. All elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to naively convert to String array

      • otherwise, raise exception

  • ndarrays of any other dtype (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception

Parameters
  • input – string, tuple/list or numpy.ndarray - java type will be inferred

  • boxed – whether to ensure that the constructed array is of boxed type

Returns

a java array instance
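Since list/tuple input is first converted to a numpy.ndarray, the dtype numpy infers drives the java element type. A quick look at what numpy infers for common inputs (the java side is not shown; the java-type comments reflect the mapping described above):

```python
import numpy as np

# numpy's inferred dtype drives the java element type chosen:
print(np.asarray([1, 2, 3]).dtype)    # platform integer dtype (e.g. int64) -> java long
print(np.asarray([1.0, 2.5]).dtype)   # float64 -> java double
print(np.asarray(['x', 'yz']).dtype)  # <U2: multi-character strings -> java String array
print(np.asarray(['a', 'b']).dtype)   # <U1: all one character long -> java char array

# object-dtype input: the first non-null element determines the java type.
mixed = np.asarray([True, None], dtype=object)
print(mixed.dtype)  # object
```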

convertToJavaArrayList(input)

Convert the input list/tuple or numpy array to a java ArrayList. The main utility of the method is likely providing an object required for a specific java function signature.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.ArrayList

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()

convertToJavaHashMap(input1, input2=None)

Create a java hashmap from provided input of the form (input1=keys, input2=values) or (input1=dict).

Parameters
  • input1 – dict or tuple/list/numpy.ndarray, assumed to be keys

  • input2 – ignored if input1 is a dict, otherwise tuple/list/numpy.ndarray, assumed to be values

Returns

best java.util.HashMap

Note

The user has most control over type in providing keys and values as numpy.ndarray. Otherwise, the keys and values will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()
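The two calling conventions, a single dict versus parallel key/value sequences, can be pictured in plain Python. The helper below is illustrative only (pair_inputs is not a real Deephaven function); the real method produces a java.util.HashMap:

```python
def pair_inputs(input1, input2=None):
    # Mirrors the documented convention: a dict passes through (input2
    # ignored); otherwise input1 supplies keys and input2 values.
    if isinstance(input1, dict):
        return dict(input1)
    if input2 is None:
        raise ValueError("input2 (values) required when input1 is not a dict")
    return dict(zip(input1, input2))

print(pair_inputs({'a': 1}))            # {'a': 1}
print(pair_inputs(['a', 'b'], [1, 2]))  # {'a': 1, 'b': 2}
```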

convertToJavaHashSet(input)

Convert the input list/tuple or numpy array to a java HashSet. The main utility of the method is likely quick inclusion checks inside a Deephaven query.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.HashSet

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in convertToJavaArray()

convertToJavaList(input)

Convert the input list/tuple or numpy array to a (fixed size) java List. The main utility of the method is likely providing an object required for a specific java function signature.

Parameters

input – tuple/list or numpy.ndarray - java type will be inferred

Returns

best java.util.List

Note

The user has most control over type in providing a numpy.ndarray, otherwise the contents will be converted to a numpy.ndarray first. Type mapping will be determined as in makeJavaArray()

createTableFromData(data, columns=None, convertUnknownToString=False)

Create a deephaven table object from a collection of column data

Parameters
  • data – a dict of the form {column_name: column_data} or list of the form [column0_data, column1_data, …]

  • columns – a list of column names to use

  • convertUnknownToString – option for whether to attempt to convert unknown elements to a column of string type

Returns

the deephaven table

If data is a dict and columns is given, then only data corresponding to the names in columns is used. If data is a list of column data and columns is not given, then column names will be provided as col_0, col_1, …

Type Conversion:

  • Columns which are an instance of tuple or list are first naively converted to numpy.ndarray

  • Columns of basic primitive type are converted to their java analogs; NaN values in floating point columns are converted to their respective Deephaven NULL constant values.

  • Columns of underlying type datetime64[*] are converted to DBDateTime

  • Columns of one of the basic string type (unicode*, str*, bytes*) are converted to String

  • Columns of type numpy.object - arrays which are empty, or whose elements are all null, are converted to java type Object. Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - the array is converted to Boolean with null values preserved

    • str - the array is converted to a column of String type

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - all elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

      • if one-dimensional, then column of appropriate DbArray type

      • otherwise, column of java array type

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to convert to column of string type

      • otherwise, raise exception

  • Columns of any other type (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception
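The two accepted data shapes, a dict of named columns or a list of columns with names supplied separately, can be sketched as follows. Only the input preparation is shown and the column names are hypothetical; the createTableFromData call itself needs a running Deephaven session:

```python
import numpy as np

# Dict form: column names come from the keys (filtered by `columns` if given).
data_dict = {
    'Sym': ['AAPL', 'MSFT'],
    'Price': np.array([189.5, 410.25]),  # float64 -> java double column
}

# List form: names come from `columns`; if `columns` is omitted, the
# defaults col_0, col_1, ... are used.
data_list = [['AAPL', 'MSFT'], [189.5, 410.25]]
default_names = ['col_{}'.format(i) for i in range(len(data_list))]
print(default_names)  # ['col_0', 'col_1']

# In a live session, either of these would build the same table:
#   createTableFromData(data_dict)
#   createTableFromData(data_list, columns=['Sym', 'Price'])
```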

dataFrameToTable(dataframe, convertUnknownToString=False)

Converts the provided pandas.DataFrame object to a deephaven table object.

Parameters
  • dataframe – pandas.DataFrame object

  • convertUnknownToString – option for whether to attempt to convert unknown elements to a column of string type

Returns

Table object, which represents dataframe as faithfully as possible

Type Conversion:

  • Columns of basic primitive type are converted to their java analogs; NaN values in floating point columns are converted to their respective Deephaven NULL constant values.

  • Columns of underlying type datetime64[*] are converted to DBDateTime

  • Columns of one of the basic string type (unicode*, str*, bytes*) are converted to String

  • Columns of type numpy.object - arrays which are empty, or whose elements are all null, are converted to java type Object. Otherwise, the first non-null value is used to determine the type for the column.

    If the first non-null element is an instance of:

    • bool - the array is converted to Boolean with null values preserved

    • str - the array is converted to a column of String type

    • datetime.date or datetime.datetime - the array is converted to DBDateTime

    • numpy.ndarray - all elements must be null or ndarrays of the same type and compatible shape, or an exception will be raised.

      • if one-dimensional, then column of appropriate DbArray type

      • otherwise, column of java array type

    • dict - unsupported

    • other iterable type - naive conversion to numpy.ndarray is attempted, then as above.

    • any other type:

      • convertUnknownToString=True - attempt to convert to column of string type

      • otherwise, raise exception

  • Columns of any other type (namely complex*, uint*, void*, or custom dtypes):

    1. convertUnknownToString=True - attempt to convert to column of string type

    2. otherwise, raise exception

doLocked(f, lock_type='exclusive')

Executes a function while holding the LiveTableMonitor (LTM) lock. Holding the LTM lock ensures that the contents of a table will not change during a computation, but holding the lock also prevents table updates from happening. The lock should be held for as little time as possible.

Parameters
  • f – function to execute while holding the LTM lock. f must be callable or have an apply attribute which is callable.

  • lock_type – LTM lock type. Valid values are “exclusive” and “shared”. “exclusive” allows only a single reader or writer to hold the lock. “shared” allows multiple readers or a single writer to hold the lock.

listen(t, listener, description=None, retain=True, ltype='auto', start_listening=True, replay_initial=False, lock_type='exclusive')

Listen to table changes.

Parameters
  • t – dynamic table to listen to.

  • listener – listener to process changes. This can either be a function or an object with an “onUpdate” method.

  • description – description for the UpdatePerformanceTracker to append to the listener’s entry description.

  • retain – whether a hard reference to this listener should be maintained to prevent it from being collected.

  • ltype – listener type. Valid values are None, “auto”, “legacy”, and “shift_aware”. None and “auto” (default) use inspection to automatically determine the type of input listener. “legacy” is for a legacy PythonListenerAdapter, which takes three arguments (added, removed, modified). “shift_aware” is for a PythonShiftAwareListenerAdapter, which takes one argument (update).

  • start_listening – True to create the listener and register the listener with the table. The listener will see updates. False to create the listener, but do not register the listener with the table. The listener will not see updates.

  • replay_initial – True to replay the initial table contents to the listener. False to only listen to new table changes.

  • lock_type – LTM lock type. Used when replay_initial=True. See doLocked() for valid values.

Returns

table listener handle.
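The two listener shapes correspond to the two adapter types: a legacy listener takes three arguments (added, removed, modified), while a shift-aware listener takes a single update argument. Plain Python sketches of both (the bodies and return values are hypothetical; listen() performs the inspection and wiring):

```python
# Legacy shape: a callable taking three index-like arguments.
def legacy_listener(added, removed, modified):
    # A real listener would react to the changed row sets here.
    return ('legacy', added, removed, modified)

# Shift-aware shape: an object with an onUpdate method taking one
# update argument (a plain one-argument callable also works).
class ShiftAwareListener:
    def onUpdate(self, update):
        return ('shift_aware', update)

# In a live session, either would be registered like:
#   listen(t, legacy_listener, ltype='legacy')
#   listen(t, ShiftAwareListener(), ltype='shift_aware')
# With ltype='auto' (the default), the argument count is inspected to choose.
print(legacy_listener(1, 2, 3))
print(ShiftAwareListener().onUpdate('u'))
```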

tableToDataFrame(table, convertNulls=0, categoricals=None)

Produces a copy of a table object as a pandas.DataFrame.

Parameters
  • table – the Table object

  • convertNulls – member of NULL_CONVERSION enum, specifying how to treat null values.

  • categoricals – None, a column name, or a list of column names to convert to a ‘categorical’ data series

Returns

pandas.DataFrame object which reproduces table as faithfully as possible

Note that the entire table is going to be cloned into memory, so the total number of entries in the table should be considered before blindly doing this. For large tables (millions of entries or more), consider measures such as dropping unnecessary columns and/or down-selecting rows using the Deephaven query language before converting.

Warning

The table will be frozen prior to conversion. A table which updates mid-conversion would lead to errors or other undesirable behavior.

The value for convertNulls only applies to java integer type (byte, short, int, long) or java.lang.Boolean array types:

  • NULL_CONVERSION.ERROR (=0) [default] inspect for the presence of null values, and raise an exception if one is encountered.

  • NULL_CONVERSION.PASS (=1) do not inspect for the presence of null values, and pass value straight through without interpretation (Boolean null -> False). This is intended for conversion which is as fast as possible. No warning is generated if null value(s) present, since no inspection is performed.

  • NULL_CONVERSION.CONVERT (=2) inspect for the presence of null values, and take steps to return the closest analogous numpy alternative (motivated by pandas behavior):

    • for integer type columns with null value(s), the numpy.ndarray will have floating-point type and null values will be replaced with NaN

    • Boolean type columns with null value(s), the numpy.ndarray will have numpy.object type and null values will be None.

Conversion for different data types will be performed as indicated here:

  • byte -> numpy.int8, or numpy.float32 if necessary for null conversion

  • short -> numpy.int16, or numpy.float32 if necessary for null conversion

  • int -> numpy.int32, or numpy.float64 if necessary for null conversion

  • long -> numpy.int64, or numpy.float64 if necessary for null conversion

  • Boolean -> numpy.bool, or numpy.object if necessary for null conversion

  • float -> numpy.float32 and NULL_FLOAT -> numpy.nan

  • double -> numpy.float64 and NULL_DOUBLE -> numpy.nan

  • DBDateTime -> numpy.dtype(datetime64[ns]) and null -> numpy.nat

  • String -> numpy.unicode_ (of appropriate length) and null -> ''

  • char -> numpy.dtype('U1') (one character string) and NULL_CHAR -> ''

  • array/DbArray – since the output feeds a pandas.DataFrame (whose columns must be one-dimensional), each such column is returned as a one-dimensional numpy.ndarray with dtype numpy.object, with each entry a numpy.ndarray and type mapping in keeping with the above

  • Anything else should present as a one-dimensional array of type numpy.object with entries uninterpreted except by the jpy JNI layer.

Note

The numpy unicode type uses 32-bit characters (there is no 16-bit option), and is implemented as a character array of fixed-length entries, padded as necessary by the null character (i.e. character of integer value 0). Every entry in the array will actually use as many characters as the longest entry, and the numpy fetch of an entry automatically trims the trailing null characters.

This will require much more memory (doubles bit-depth and pads all strings to the length of the longest) in python versus a corresponding java String array. If the original java String has any trailing null (zero-value) characters, these will be ignored in python usage. For char arrays, we cannot differentiate between entries whose original value (in java) was 0 or NULL_CHAR.
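The fixed-width padding and automatic trimming described above are easy to see directly in numpy, without any Deephaven involvement:

```python
import numpy as np

# Every entry is stored at the width of the longest string; shorter
# entries are padded with null characters, which numpy trims on fetch.
arr = np.array(['a', 'abc'])
print(arr.dtype)     # <U3: three 32-bit characters per entry
print(arr.itemsize)  # 12 bytes per entry (3 characters x 4 bytes each)
print(arr[0])        # 'a' (trailing null characters trimmed automatically)
```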