pydeephaven¶
- agg
- dherror
- experimental
- filters
ColumnNameColumnName.capitalize()ColumnName.casefold()ColumnName.center()ColumnName.count()ColumnName.encode()ColumnName.endswith()ColumnName.expandtabs()ColumnName.find()ColumnName.format()ColumnName.format_map()ColumnName.index()ColumnName.isalnum()ColumnName.isalpha()ColumnName.isascii()ColumnName.isdecimal()ColumnName.isdigit()ColumnName.isidentifier()ColumnName.islower()ColumnName.isnumeric()ColumnName.isprintable()ColumnName.isspace()ColumnName.istitle()ColumnName.isupper()ColumnName.join()ColumnName.ljust()ColumnName.lower()ColumnName.lstrip()ColumnName.maketrans()ColumnName.partition()ColumnName.removeprefix()ColumnName.removesuffix()ColumnName.replace()ColumnName.rfind()ColumnName.rindex()ColumnName.rjust()ColumnName.rpartition()ColumnName.rsplit()ColumnName.rstrip()ColumnName.split()ColumnName.splitlines()ColumnName.startswith()ColumnName.strip()ColumnName.swapcase()ColumnName.title()ColumnName.translate()ColumnName.upper()ColumnName.zfill()
Filterand_()contains()eq()ge()gt()in_()is_not_null()is_null()le()lt()matches()ne()not_()or_()
- query
QueryQuery.agg_all_by()Query.agg_by()Query.aj()Query.avg_by()Query.count_by()Query.drop_columns()Query.exact_join()Query.exec()Query.first_by()Query.group_by()Query.head()Query.head_by()Query.join()Query.last_by()Query.lazy_update()Query.max_by()Query.median_by()Query.min_by()Query.natural_join()Query.raj()Query.select()Query.select_distinct()Query.slice()Query.snapshot()Query.snapshot_when()Query.sort()Query.std_by()Query.sum_by()Query.tail()Query.tail_by()Query.ungroup()Query.update()Query.update_by()Query.update_view()Query.var_by()Query.view()Query.where()Query.where_in()Query.where_not_in()
- session
SessionSession.tablesSession.is_aliveSession.bind_table()Session.close()Session.empty_table()Session.fetch()Session.fetch_table()Session.hostSession.import_table()Session.input_table()Session.is_aliveSession.merge_tables()Session.open_table()Session.plugin_client()Session.portSession.publish()Session.publish_table()Session.query()Session.release()Session.run_script()Session.time_table()
random()
- table
InputTableInputTable.add()InputTable.agg_all_by()InputTable.agg_by()InputTable.aj()InputTable.avg_by()InputTable.close()InputTable.count_by()InputTable.delete()InputTable.drop_columns()InputTable.exact_join()InputTable.first_by()InputTable.group_by()InputTable.head()InputTable.head_by()InputTable.is_closedInputTable.is_refreshingInputTable.join()InputTable.last_by()InputTable.lazy_update()InputTable.max_by()InputTable.median_by()InputTable.meta_tableInputTable.min_by()InputTable.natural_join()InputTable.pb_ticketInputTable.pb_typed_ticketInputTable.raj()InputTable.select()InputTable.select_distinct()InputTable.slice()InputTable.snapshot()InputTable.snapshot_when()InputTable.sort()InputTable.sort_descending()InputTable.std_by()InputTable.sum_by()InputTable.tail()InputTable.tail_by()InputTable.ticketInputTable.to_arrow()InputTable.typeInputTable.ungroup()InputTable.update()InputTable.update_by()InputTable.update_view()InputTable.var_by()InputTable.view()InputTable.where()InputTable.where_in()InputTable.where_not_in()
MultiJoinInputMultiJoinTableTableTable.is_closedTable.agg_all_by()Table.agg_by()Table.aj()Table.avg_by()Table.close()Table.count_by()Table.drop_columns()Table.exact_join()Table.first_by()Table.group_by()Table.head()Table.head_by()Table.is_closedTable.is_refreshingTable.join()Table.last_by()Table.lazy_update()Table.max_by()Table.median_by()Table.meta_tableTable.min_by()Table.natural_join()Table.pb_ticketTable.pb_typed_ticketTable.raj()Table.select()Table.select_distinct()Table.slice()Table.snapshot()Table.snapshot_when()Table.sort()Table.sort_descending()Table.std_by()Table.sum_by()Table.tail()Table.tail_by()Table.ticketTable.to_arrow()Table.typeTable.ungroup()Table.update()Table.update_by()Table.update_view()Table.var_by()Table.view()Table.where()Table.where_in()Table.where_not_in()
multi_join()
- ticket
- updateby
BadDataBehaviorDeltaControlMathContextOperationControlUpdateByOperationcum_count_where()cum_max()cum_min()cum_prod()cum_sum()delta()ema_tick()ema_time()emmax_tick()emmax_time()emmin_tick()emmin_time()ems_tick()ems_time()emstd_tick()emstd_time()forward_fill()rolling_avg_tick()rolling_avg_time()rolling_count_tick()rolling_count_time()rolling_count_where_tick()rolling_count_where_time()rolling_formula_tick()rolling_formula_time()rolling_group_tick()rolling_group_time()rolling_max_tick()rolling_max_time()rolling_min_tick()rolling_min_time()rolling_prod_tick()rolling_prod_time()rolling_std_tick()rolling_std_time()rolling_sum_tick()rolling_sum_time()rolling_wavg_tick()rolling_wavg_time()
- utils
Deephaven Python Client (pydeephaven) is a Python API built on top of Deephaven’s highly efficient Open API which is based on gRPC and Apache Arrow. It allows Python applications to remotely connect to Deephaven data servers, export/import data with the server, run Python scripts on the server, and execute powerful queries on data tables.
Because Deephaven data servers and Deephaven clients including pydeephaven exchange data in the Apache Arrow format, pydeephaven is able to leverage ‘pyarrow’ - the Python bindings of Arrow (https://arrow.apache.org/docs/python/) for data representation and integration with other data analytic tools such as NumPy, Pandas, etc.
Examples
>>> from pydeephaven import Session
>>> from pyarrow import csv
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.import_table(csv.read_csv("data1.csv"))
>>> table2 = session.import_table(csv.read_csv("data2.csv"))
>>> joined_table = table1.join(table2, on=["key_col_1", "key_col_2"], joins=["data_col1"])
>>> df = joined_table.to_arrow().to_pandas()
>>> print(df)
>>> session.close()
- exception DHError(message='')[source]¶
Bases:
ExceptionA custom exception class used by pydeephaven.
- add_note()¶
Exception.add_note(note) – add a note to the exception
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class NaturalJoinType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
EnumAn Enum defining ways to handle duplicate right hand table values during natural join operations
- ERROR_ON_DUPLICATE = 1¶
Throw an error if a duplicate right hand table row is found. This is the default behavior if not specified
- EXACTLY_ONE_MATCH = 4¶
Match exactly one right hand table row; throw an error if there are zero or more than one matches
- FIRST_MATCH = 2¶
Match the first right hand table row and ignore later duplicates
- LAST_MATCH = 3¶
Match the last right hand table row and ignore earlier duplicates
- class Query(session, table)[source]¶
Bases:
TableInterface[Query]A Query object is used to define and execute a sequence of Deephaven table operations on the server.
When the query is executed, the table operations specified for the Query object are batched together and sent to the server in a single request, thus avoiding multiple round trips between the client and the server. The result of executing the query is a new Deephaven table.
Note, an application should always use the factory method on the Session object to create a Query instance as the constructor is subject to future changes to support more advanced features already planned.
- agg_all_by(agg, by)[source]¶
Adds an AggregateAll operation to the query.
- Parameters:
agg (Aggregation) – the aggregation to be applied
by (Union[str, list[str]]) – the group-by column name(s)
- Return type:
- Returns:
self
- agg_by(aggs, by)[source]¶
Adds an Aggregate operation to the query.
- Parameters:
by (Union[str, list[str]]) – the group-by column name(s)
aggs (Union[Aggregation, list[Aggregation]]) – the aggregation(s) to be applied
- Return type:
- Returns:
self
- aj(table, on, joins=None)[source]¶
Adds a as-of join operation to the query.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
self
- avg_by(by=None)[source]¶
Adds an avg-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- count_by(col, by=None)[source]¶
Adds a count-by aggregation to the query.
- Parameters:
col (str) – the name of the column to store the counts
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- drop_columns(cols)[source]¶
Adds a drop-columns operation to the query.
- Parameters:
cols (Union[str, list[str]]) – the column name(s)
- Return type:
- Returns:
self
- exact_join(table, on, joins=None)[source]¶
Adds an exact-join operation to the query.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
self
- first_by(by=None)[source]¶
Adds a first-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- group_by(by=None)[source]¶
Adds a group-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s); default is None
- Return type:
- Returns:
self
- head(num_rows)[source]¶
Adds a head operation to the query.
- Parameters:
num_rows (int) – the number of rows at the head of table
- Return type:
- Returns:
self
- head_by(num_rows, by)[source]¶
Adds a head-by operation to the query.
- Parameters:
num_rows (int) – the number of rows at the beginning of each group
by (Union[str, list[str]]) – the group-by column name(s)
- Return type:
- Returns:
self
- join(table, on=None, joins=None, reserve_bits=10)[source]¶
Adds a cross-join operation to the query.
- Parameters:
table (Table) – the right-table of the join
on (Optional[Union[str, list[str]]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
reserve_bits (int) – the number of bits of key-space to initially reserve per group; default is 10
- Return type:
- Returns:
self
- last_by(by=None)[source]¶
Adds a last-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- lazy_update(formulas)[source]¶
Adds a lazy-update operation to the query.
- Parameters:
formulas (Union[str, list[str]]) – the column formula(s)
- Return type:
- Returns:
self
- max_by(by=None)[source]¶
Adds a max-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- median_by(by=None)[source]¶
Adds a median-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- min_by(by=None)[source]¶
Adds a min-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- natural_join(table, on, joins=None, type=NaturalJoinType.ERROR_ON_DUPLICATE)[source]¶
Adds a natural-join operation to the query.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
type (NaturalJoinType) – the action to be taken when duplicate right hand rows are encountered; default is ERROR_ON_DUPLICATE
- Return type:
- Returns:
self
- raj(table, on, joins=None)[source]¶
Adds a reverse as-of join operation to the query.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
self
- select(formulas=None)[source]¶
Adds a select operation to the query.
- Parameters:
formulas (Optional[Union[str, list[str]]]) – the column formula(s), default is None
- Return type:
- Returns:
self
- select_distinct(cols=None)[source]¶
Adds a select-distinct operation to the query.
- Parameters:
cols (Optional[Union[str, list[str]]]) – the column name(s), default is None
- Return type:
- Returns:
self
- slice(start, stop)¶
Extracts a subset of a table by row positions into a new Table.
If both the start and the stop are positive, then both are counted from the beginning of the table. The start is inclusive, and the stop is exclusive. slice(0, N) is equivalent to
head()(N) The start must be less than or equal to the stop.If the start is positive and the stop is negative, then the start is counted from the beginning of the table, inclusively. The stop is counted from the end of the table. For example, slice(1, -1) includes all rows but the first and last. If the stop is before the start, the result is an empty table.
If the start is negative, and the stop is zero, then the start is counted from the end of the table, and the end of the slice is the size of the table. slice(-N, 0) is equivalent to
tail()(N).If the start is negative and the stop is negative, they are both counted from the end of the table. For example, slice(-2, -1) returns the second to last row of the table.
- Parameters:
start (int) – the first row position to include in the result
stop (int) – the last row position to include in the result
- Return type:
TypeVar(T)- Returns:
a new Table
- Raises:
DHError –
Examples
>>> table.slice(0, 5) # first 5 rows >>> table.slice(-5, 0) # last 5 rows >>> table.slice(2, 6) # rows from index 2 to 5 >>> table.slice(6, 2) # ERROR: cannot slice start after end >>> table.slice(-6, -2) # rows from 6th last to 2nd last (exclusive) >>> table.slice(-2, -6) # ERROR: cannot slice start after end >>> table.slice(2, -3) # all rows except the first 2 and the last 3 >>> table.slice(-6, 8) # rows from 6th last to index 8 (exclusive)
- snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)[source]¶
Adds a snapshot_when operation to the query.
- Parameters:
trigger_table (Table) – the trigger table
stamp_cols (Optional[Union[str, list[str]]]) – The column(s) from trigger_table that form the “stamp key”, may be renames, default is None, meaning that all columns from trigger_table form the “stamp key”.
initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.
incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.
history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.
- Return type:
- Returns:
self
- sort(order_by, order=None)[source]¶
Adds sort operation to the query.
- Parameters:
order_by (Union[str, list[str]]) – the names of the columns to be sorted on
order (Optional[Union[SortDirection, list[SortDirection]]]) – the corresponding sort direction(s) for each sort column, default is None. In the absence of explicit sort directions, data will be sorted in the ascending order.
- Return type:
- Returns:
self
- std_by(by=None)[source]¶
Adds a std-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- sum_by(by=None)[source]¶
Adds a sum-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- tail(num_rows)[source]¶
Adds a tail operation to the query.
- Parameters:
num_rows (int) – the number of rows at the end of table
- Return type:
- Returns:
self
- tail_by(num_rows, by)[source]¶
Adds a tail-by operation to the query.
- Parameters:
num_rows (int) – the number of rows at the end of each group
by (Union[str, list[str]]) – the group-by column name(s)
- Return type:
- Returns:
self
- ungroup(cols=None, null_fill=True)[source]¶
Adds an ungroup operation to the query.
- Parameters:
cols (Optional[Union[str, list[str]]]) – the array column(s), default is None, meaning all array columns will be ungrouped
null_fill (bool) – indicates whether null should be used to fill missing cells, default is True
- Return type:
- Returns:
self
- update(formulas)[source]¶
Adds an update operation to the query.
- Parameters:
formulas (Union[str, list[str]]) – the column formula(s)
- Return type:
- Returns:
self
- update_by(ops, by=None)[source]¶
Adds an update-by operation to the query.
- Parameters:
ops (Union[UpdateByOperation, list[UpdateByOperation]]) – the UpdateByOperation(s) to be applied
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- update_view(formulas)[source]¶
Adds an update-view operation to the query.
- Parameters:
formulas (Union[str, list[str]]) – the column formula(s)
- Return type:
- Returns:
self
- var_by(by=None)[source]¶
Adds a var-by aggregation to the query.
- Parameters:
by (Optional[Union[str, list[str]]]) – the group-by column name(s), default is None
- Return type:
- Returns:
self
- view(formulas)[source]¶
Adds a view operation to the query.
- Parameters:
formulas (Union[str, list[str]]) – the column formula(s)
- Return type:
- Returns:
self
- class Session(host=None, port=None, auth_type='Anonymous', auth_token='', never_timeout=True, session_type='python', use_tls=False, tls_root_certs=None, client_cert_chain=None, client_private_key=None, client_opts=None, extra_headers=None)[source]¶
Bases:
objectA Session represents a connection to the Deephaven data server. It contains a number of convenience methods for asking the server to create tables, import Arrow data into tables, merge tables, run Python scripts, and execute queries.
Session objects can be used in Python with statement so that whatever happens in the with statement block, they are guaranteed to be closed upon exit.
- tables¶
names of the global tables available in the server after running scripts
- Type:
list[str]
- is_alive¶
check if the session is still alive (may refresh the session)
- Type:
bool
Initializes a Session object that connects to the Deephaven server
- Parameters:
host (Optional[str]) – the host name or IP address of the remote machine, if None, ‘localhost’ is used.
port (Optional[int]) – the port number that Deephaven server is listening on, if None, 10000 is used.
auth_type (str) – the authentication type string, can be “Anonymous’, ‘Basic”, or any custom-built authenticator in the server, such as “io.deephaven.authentication.psk.PskAuthenticationHandler”, default is ‘Anonymous’.
auth_token (str) – the authentication token string. When auth_type is ‘Basic’, it must be “user:password”; when auth_type is “Anonymous’, it will be ignored; when auth_type is a custom-built authenticator, it must conform to the specific requirement of the authenticator
never_timeout (bool) – never allow the session to timeout, default is True
session_type (str) – the Deephaven session type. Defaults to ‘python’
use_tls (bool) – if True, use a TLS connection. Defaults to False
tls_root_certs (Optional[bytes]) – PEM encoded root certificates to use for TLS connection, or None to use system defaults. If not None implies use a TLS connection and the use_tls argument should have been passed as True. Defaults to None
client_cert_chain (Optional[bytes]) – PEM encoded client certificate if using mutual TLS. Defaults to None, which implies not using mutual TLS.
client_private_key (Optional[bytes]) – PEM encoded client private key for client_cert_chain if using mutual TLS. Defaults to None, which implies not using mutual TLS.
client_opts (Optional[list[Tuple[str,Union[int,str]]]) –
list of tuples for name and value of options to the underlying grpc channel creation. Defaults to None, which implies not using any channel options. See https://grpc.github.io/grpc/cpp/group__grpc__arg__keys.html for a list of valid options. Example options:
- [ (‘grpc.target_name_override’, ‘idonthaveadnsforthishost’),
(‘grpc.min_reconnect_backoff_ms’, 2000) ]
extra_headers (Optional[dict[bytes, bytes]]) – additional headers (and values) to add to server requests. Defaults to None, which implies not using any extra headers.
- Raises:
DHError –
- bind_table(name, table)[source]¶
Binds a table to the given name on the server so that it can be referenced by that name.
- close()[source]¶
Closes the Session object if it hasn’t timed out already.
- Raises:
DHError –
- Return type:
None
- fetch(ticket)[source]¶
Fetches a server object by ticket.
This is low-level method that can be used to fetch non-Table server objects. The ticket represents a fetchable server object, e.g
PluginClient,Fetchable. This method is used together with thepublish()method to share server objects between sessions.
- fetch_table(ticket)[source]¶
Fetches a table by ticket.
- Parameters:
ticket (SharedTicket) – a ticket
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- property host¶
The host name of the server.
- import_table(data)[source]¶
Imports the pyarrow table as a new Deephaven table on the server.
Deephaven supports most of the Arrow data types. However, if the pyarrow table contains any field with a data type not supported by Deephaven, the import operation will fail.
- input_table(schema=None, init_table=None, key_cols=None, blink_table=False)[source]¶
Creates an InputTable from either Arrow schema or initial table. When blink_table is True, the InputTable will be a blink table. When blink_table is False (default), the InputTable will be keyed if key columns are provided, otherwise it will be append-only.
- Parameters:
schema (Optional[pa.Schema]) – the schema for the InputTable
init_table (Optional[Table]) – the initial table
key_cols (Optional[Union[str, list[str]]]) – the name(s) of the key column(s)
blink_table (bool) – whether the InputTable should be a blink table, default is False
- Return type:
- Returns:
an InputTable
- Raises:
DHError, ValueError –
- property is_alive¶
Whether the session is alive.
- merge_tables(tables, order_by='')[source]¶
Merges several tables into one table on the server.
- Parameters:
tables (list[Table]) – the list of Table objects to merge
order_by (str) – the order by column name, if specified, the input tables must already be sorted on this column, and the result table will be sorted on it as well, default is “”
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- plugin_client(server_obj)[source]¶
Wraps a ticket as a PluginClient. Capabilities here vary based on the server implementation of the ObjectType, but most will at least send a response payload to the client, possibly including references to other objects. In some cases, depending on the server implementation, the client will also be able to send the same sort of messages back to the server.
Part of the experimental plugin API.
- Return type:
- property port¶
The port number of the server.
- publish(source_ticket, result_ticket)[source]¶
Publishes a source ticket to the result ticket.
This is low-level method that can be used to publish non-Table server objects that are previously fetched from the server. The source ticket represents the previously fetched server object to be published, and the result ticket, which should normally be a SharedTicket, is the ticket to publish to. The result ticket can then be fetched by other sessions to access the object as long as the object is not released. This method is used together with the
fetch()method to share server objects between sessions.
- publish_table(ticket, table)[source]¶
Publishes a table to the given shared ticket. The ticket can then be used by another session to fetch the table.
Note that, the shared ticket can be fetched by other sessions to access the table as long as the table is not released. When the table is released either through an explicit call of the close method on it, or implicitly through garbage collection, or through the closing of the publishing session, the shared ticket will no longer be valid.
- Parameters:
ticket (SharedTicket) – a SharedTicket object
table (Table) – a Table object
- Raises:
DHError –
- Return type:
None
- query(table)[source]¶
Creates a Query object to define a sequence of operations on a Deephaven table.
- release(ticket)[source]¶
Releases a ticket.
- Parameters:
ticket (Ticket) – the ticket to release
- Return type:
None
- run_script(script, systemic=None)[source]¶
Runs the supplied Python script on the server.
- Parameters:
script (str) – the Python script code
systemic (Optional[bool]) – Whether to treat the code as systemically important. Defaults to None which uses the default system behavior
- Raises:
DHError –
- Return type:
None
- time_table(period, start_time=None, blink_table=False)[source]¶
Creates a time table on the server.
- Parameters:
period (Union[int, str]) – the interval at which the time table ticks (adds a row); units are nanoseconds or a time interval string, e.g. “PT00:00:.001” or “PT1S”
start_time (Optional[Union[int, str]]) – the start time for the time table in nanoseconds or as a date time formatted string; default is None (meaning now)
blink_table (bool) – if the time table should be a blink table, defaults to False
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- class SortDirection(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
EnumAn enum defining the sort ordering.
- ASCENDING = 1¶
Ascending sort direction
- DESCENDING = -1¶
Descending sort direction
- class Table(session, ticket, schema_header=b'', size=None, is_static=None, schema=None)[source]¶
Bases:
TableInterface[Table],ServerObjectA Table object represents a reference to a table on the server. It is the core data structure of Deephaven and supports a rich set of operations such as filtering, sorting, aggregating, joining, snapshotting etc.
Note, an application should never instantiate a Table object directly. Table objects are always provided through factory methods such as Session.empty_table(), or import/export methods such as Session.import_table(), open_table(), or any of the Table operations.
- is_closed¶
check if the table has been closed on the server
- Type:
bool
- agg_all_by(agg, by)[source]¶
The agg_all_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation specified.
Note, because agg_all_by applies the aggregation to all the columns of the table, it will ignore any column names specified for the aggregation.
- Parameters:
agg (Aggregation) – the aggregation to be applied
by (Union[str, list[str]]) – the group-by column name(s)
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- agg_by(aggs, by)[source]¶
The agg_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation(s) specified.
- Parameters:
aggs (Union[Aggregation, list[Aggregation]]) – the aggregation(s) to be applied
by (Union[str, list[str]]) – the group-by column name(s)
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- aj(table, on, joins=None)[source]¶
The aj (as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going over. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- avg_by(by=None)[source]¶
The avg_by method creates a new table containing the average for each group. Columns not used in the grouping must be of numeric types.
- count_by(col, by=None)[source]¶
The count_by method creates a new table containing the number of rows for each group. The count of each group is stored in a new column named after the ‘col’ parameter.
- drop_columns(cols)[source]¶
The drop_column method creates a new table with the same size as this table but omits any of the specified columns.
- exact_join(table, on, joins=None)[source]¶
The exact_join method creates a new table containing all the rows and columns of this table plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- first_by(by=None)[source]¶
The first_by method creates a new table which contains the first row of each distinct group.
- group_by(by=None)[source]¶
The group_by method creates a new table containing grouping columns and grouped data, column content is grouped into arrays.
If no group-by column is given, the content of each column is grouped into its own array.
- head(num_rows)[source]¶
The head method creates a new table with a specific number of rows from the beginning of the table.
- head_by(num_rows, by)[source]¶
The head_by method creates a new table containing the first number of rows for each group.
- property is_closed¶
Whether this table is closed.
- property is_refreshing¶
Whether this table is refreshing.
- join(table, on=None, joins=None, reserve_bits=10)[source]¶
The join method creates a new table containing rows that have matching values in both tables. Rows that do not have matching criteria will not be included in the result. If there are multiple matches between a row from the left table and rows from the right table, all matching combinations will be included. If no columns to match (on) are specified, every combination of left and right table rows is included.
- Parameters:
table (Table) – the right-table of the join
on (Optional[Union[str, list[str]]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names, default is None
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’
reserve_bits (int) – the number of bits of key-space to initially reserve per group; default is 10
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- last_by(by=None)[source]¶
The last_by method creates a new table which contains the last row of each distinct group.
- lazy_update(formulas)[source]¶
The lazy_update method creates a new table containing a new, cached, formula column for each formula.
- max_by(by=None)[source]¶
The max_by method creates a new table containing the maximum value for each group. Columns not used in the grouping must be of numeric types.
- median_by(by=None)[source]¶
The median_by method creates a new table containing the median for each group. Columns not used in the grouping must be of numeric types.
- property meta_table¶
The column definitions of the table in a Table form.
- min_by(by=None)[source]¶
The min_by method creates a new table containing the minimum value for each group. Columns not used in the grouping must be of numeric types.
- natural_join(table, on, joins=None, type=NaturalJoinType.ERROR_ON_DUPLICATE)[source]¶
The natural_join method creates a new table containing all the rows and columns of this table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’
type (NaturalJoinType) – the action to be taken when duplicate right hand rows are encountered; default is ERROR_ON_DUPLICATE
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- property pb_ticket¶
Returns the ticket as a gRPC protobuf ticket object.
- property pb_typed_ticket¶
Returns a protobuf typed ticket, suitable for use in communicating with an ObjectType plugin on the server.
- raj(table, on, joins=None)[source]¶
The raj (reverse as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table ( joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going under. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, list[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.
joins (Optional[Union[str, list[str]]]) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- select(formulas=None)[source]¶
The select method creates a new in-memory table that includes one column for each formula. If no formula is specified, all columns will be included.
- select_distinct(cols=None)[source]¶
The select_distinct method creates a new table containing all the unique values for a set of key columns. When the selectDistinct method is used on multiple columns, it looks for distinct sets of values in the selected columns.
- slice(start, stop)[source]¶
Extracts a subset of a table by row positions into a new Table.
If both the start and the stop are positive, then both are counted from the beginning of the table. The start is inclusive, and the stop is exclusive. slice(0, N) is equivalent to
head()(N) The start must be less than or equal to the stop.If the start is positive and the stop is negative, then the start is counted from the beginning of the table, inclusively. The stop is counted from the end of the table. For example, slice(1, -1) includes all rows but the first and last. If the stop is before the start, the result is an empty table.
If the start is negative, and the stop is zero, then the start is counted from the end of the table, and the end of the slice is the size of the table. slice(-N, 0) is equivalent to
tail()(N).If the start is negative and the stop is negative, they are both counted from the end of the table. For example, slice(-2, -1) returns the second to last row of the table.
- Parameters:
start (int) – the first row position to include in the result
stop (int) – the last row position to include in the result
- Return type:
- Returns:
a new Table
- Raises:
DHError –
Examples
>>> table.slice(0, 5) # first 5 rows >>> table.slice(-5, 0) # last 5 rows >>> table.slice(2, 6) # rows from index 2 to 5 >>> table.slice(6, 2) # ERROR: cannot slice start after end >>> table.slice(-6, -2) # rows from 6th last to 2nd last (exclusive) >>> table.slice(-2, -6) # ERROR: cannot slice start after end >>> table.slice(2, -3) # all rows except the first 2 and the last 3 >>> table.slice(-6, 8) # rows from 6th last to index 8 (exclusive)
- snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)[source]¶
The snapshot_when creates a table that captures a snapshot of this table whenever trigger_table updates.
When trigger_table updates, a snapshot of this table and the “stamp key” from trigger_table form the resulting table. The “stamp key” is the last row of the trigger_table, limited by the stamp_cols. If trigger_table is empty, the “stamp key” will be represented by NULL values.
- Parameters:
trigger_table (Table) – the trigger table
stamp_cols (Optional[Union[str, list[str]]]) – The column(s) from trigger_table that form the “stamp key”, may be renames, default is None, meaning that all columns from trigger_table form the “stamp key”.
initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.
incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.
history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- sort(order_by, order=None)[source]¶
The sort method creates a new table where the rows are ordered based on values in the specified set of columns.
- Parameters:
order_by (Union[str, list[str]]) – the column(s) to be sorted on
order (Optional[Union[SortDirection, list[SortDirection]]]) – the corresponding sort direction(s) for each sort column, default is None. In the absence of explicit sort directions, data will be sorted in the ascending order.
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- sort_descending(order_by)[source]¶
The sort_descending method creates a new table where rows in a table are sorted in descending order based on the order_by column(s).
- std_by(by=None)[source]¶
The std_by method creates a new table containing the sample standard deviation for each group. Columns not used in the grouping must be of numeric types.
Sample standard deviation is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- sum_by(by=None)[source]¶
The sum_by method creates a new table containing the sum for each group. Columns not used in the grouping must be of numeric types.
- tail(num_rows)[source]¶
The tail method creates a new table with a specific number of rows from the end of the table.
- tail_by(num_rows, by)[source]¶
The tail_by method creates a new table containing the last number of rows for each group.
- ticket¶
The ticket that points to the object on the server.
- to_arrow()[source]¶
Takes a snapshot of the table and returns a pyarrow Table.
- Return type:
Table- Returns:
a pyarrow.Table
- Raises:
DHError –
- type¶
The type of the object. May be empty, indicating that the instance cannot be connected to or otherwise directly used from the client.
- ungroup(cols=None, null_fill=True)[source]¶
The ungroup method creates a new table in which array columns from the source table are unwrapped into separate rows. The ungroup columns should be of array types.
- Parameters:
cols (Optional[Union[str, list[str]]]) – the array column(s), default is None, meaning all array columns will be ungrouped, default is None, meaning all array columns will be ungrouped
null_fill (bool) – indicates whether null should be used to fill missing cells, default is True
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- update(formulas)[source]¶
The update method creates a new table containing a new, in-memory column for each formula.
- update_by(ops, by=None)[source]¶
The update_by method creates a table with additional columns calculated from window-based aggregations of columns in this table. The aggregations are defined by the provided operations, which support incremental aggregations over the corresponding rows in the table. The aggregations will apply position or time-based windowing and compute the results over the entire table or each row group as identified by the provided key columns.
- Parameters:
ops (Union[UpdateByOperation, list[UpdateByOperation]]) – the UpdateByOperation(s) to be applied
by (Optional[Union[str, list[str]]]) – the group-by column name(s), defaults to None, meaning all calculations are performed over the entire table
- Return type:
- Returns:
a Table object
- Raises:
DHError –
- update_view(formulas)[source]¶
The update_view method creates a new table containing a new, formula column for each formula.
- var_by(by=None)[source]¶
The var_by method creates a new table containing the sample variance for each group. Columns not used in the grouping must be of numeric types.
Sample variance is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- view(formulas)[source]¶
The view method creates a new formula table that includes one column for each formula.
- where(filters)[source]¶
The where method creates a new table with only the rows meeting the filter criteria in the column(s) of the table.
- where_in(filter_table, cols)[source]¶
The where_in method creates a new table containing rows from the source table, where the rows match values in the filter table.