Ultimate Table Operations Cheat Sheet
Click here to download as a Deephaven Python Notebook
Access Data#
tip
The import
and named tables in these lines are used throughout this page, and should be run first.
Sourcing Tables#
Merging Tables#
Viewing metadata of table#
Printing information#
Filtering#
Most queries benefit by starting with filters. Less data generally means better performance.
For SQL developers : In Deephaven, Joins are not a primary operation for filtering. Use where()
, whereIn()
, and whereNotIn()
.
note
Backticks `
are used for strings and single quotes '
are used for timestamps and characters
Date & Time examples#
Filtering first by partitioning values (which is most often the set of Dates) is both a best practice, and necessary for some downstream use cases.
String Examples#
Number examples#
Multiple Filters#
WhereIn / WhereNotIn#
Nulls and NaNs#
Head and Tail#
Sort#
Select And Create New Columns#
Option 1: Choose and add new columns -- Calculate and write to memory#
Use select()
and update()
when it is expensive to calculate or accessed frequently.
Option 2: Choose and add new columns -- Reference a formula and calc on the fly#
Use view()
and updateView()
when formula is quick or only a portion of the data is used at a time. Minimizes RAM used.
Option 3: Add new columns -- Reference a formula and calc on the fly#
Use lazyUpdate() when there are a small-ish number of unique values; on-demand formula results are stored in cache and re-used.
Using the updateView
#
(or equivalently update
, select
, view
) methods
Getting the row number#
Doing Math#
Handling Arrays#
Calculate Percentile in a series#
Manipulate Time And Calendars#
Bin Data#
Manipulate Strings#
Use Ternaries; If-Thens#
Create And Use A Custom Function#
Manipulate Columns#
Group And Aggregate#
Simple Grouping#
Un-Grouping#
Aggregations#
Join Data From Multiple Tables#
For SQL developers: Joins in Deephaven are used to extend result sets by joining data from other tables, not as much for filtering.
Joins that get used a lot#
Natural Join#
important
The right table of the join needs to have only one match based on the key(s).
Multiple keys#
AJ (As-Of Join)#
As-of joins are the time series joins vital to the capital markets' use cases. It is often wise to make sure the Right-table is sorted (based on the key). aj
is designed to find "the exact match" of the key or "the record just before". For timestamp aj-keys, this means "that time or the record just before.
Less common joins#
Other Useful Methods#
Use columns as arrays and cells as variable#
Read and write csv#
It is very easy to import CSVs via the Code Studio UI and to export any Table to a CSV via the Table Widget (in Code Studios and Dashboard Tables).
CSVs imported from the client need to be done via Code Studio UI.
CSVs imported from a server-side directory should be done via the script below.
Write a Table to a Namespace#
Though the below uses a namespace called ExampleNamespace, it is best practice for teams to establish a naming protocol for these directories.
Do Cum-Sum and Rolling Average#
Another Example of Creating a Rolling Sum#
Setting up an EMA#
Use CumulativeUtil#
Use Numpy And Pandas#
Format Tables#
Datetime Formatting#
Number Formatting#
Color Formatting#
List of canned colors found here: https://docs.deephaven.io/latest/Content/User/misc/colorsAll.htm?Highlight=color
Plot Programmatically#
Substantial documentation about plotting exists. The below intends to show the basics in particular, everything about styling and labeling is omitted from the below.
Refer to:
- https://docs.deephaven.io/latest/Content/writeQueries/plot/overview.htm
- https://docs.deephaven.io/latest/Content/writeQueries/plot/components/index.htm
- https://docs.deephaven.io/latest/Content/writeQueries/plot/visual.htm#Visual_Formatting
note
Run the following first, to create some tables to use in the plot examples below.
Time series plots#
Multiple series on the same axis#
Bar chart#
Plot-by-some key#
Stacked Area#
Area graph#
Business time#
Scatter#
Histogram#
Open-high-low-close#
Error Bar Plot#
One-click plotting#
One-click plotting allows a script writer to designate a key that users can enter into a one-click or linker experience, thereby toggling the plot to filter on the fly.
Users will need to add an "Input Filter" or configure the "Linker" to their UI to manipulate the One-Click plots. See web controls
Another one-click example