Writing Queries
Overview
A query is a set of instructions that tells Deephaven what data to analyze and how to analyze it. Queries form the backbone of every analysis performed in Deephaven, and they generate all of the tables and charts used in your workspace.
This page gives a brief overview of the steps you can use to develop an analysis with queries. Each step is covered in detail in its own section.
The basic steps for writing queries are as follows:
- Source and Access the Dataset
- Reduce the Data Volume
- Grouping, Aggregation and Sorting
- Joining Data from Multiple Tables
- Plotting
Source and Access the Dataset
To perform any analysis, you first need data. Deephaven can work with live, ticking data as well as historical data. Data used by Deephaven is typically loaded on the server by an administrator in your enterprise. However, you can also load your own data. See: Importing Data.
Once you have located the dataset needed for your analyses, you need to tell Deephaven how to access that data. See: Accessing Data.
Reduce the Data Volume
Large datasets can contain an enormous volume of information. Reducing that volume so you use only the data needed for your specific analyses makes those analyses run faster and use fewer computing resources. There are two primary methods for doing this:
- Filtering reduces the number of rows in the dataset. For example, a table may contain millions of rows of information about the trades of thousands of different stocks over a period of time. Filtering the table to include only certain stocks, or only trades made during a given time period, can significantly reduce the quantity of data being processed, which in turn shortens processing time. See: Filtering.
- Selection methods in Deephaven allow you to eliminate (and/or manipulate) columns of data in a table. For example, you may only need to use the data in 10 out of 25 columns in a table. By eliminating the extra columns, you are eliminating the burden of processing those other 15 columns of data. See: Selection.
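To make the two reduction methods concrete, the following is a minimal sketch in plain Python rather than the Deephaven query language (the Deephaven methods themselves are described in the Filtering and Selection sections). The `trades` records, their column names, and the helper functions `filter_rows` and `select_columns` are hypothetical and exist only for illustration.

```python
# Conceptual sketch only: plain Python, not the Deephaven API.
# The records and column names below are hypothetical sample data.
trades = [
    {"Sym": "AAPL", "Date": "2017-08-25", "Price": 160.0, "Size": 100, "Exchange": "Nasdaq"},
    {"Sym": "MSFT", "Date": "2017-08-25", "Price": 73.0, "Size": 250, "Exchange": "Nasdaq"},
    {"Sym": "AAPL", "Date": "2017-08-25", "Price": 160.5, "Size": 300, "Exchange": "Arca"},
    {"Sym": "AAPL", "Date": "2017-08-24", "Price": 159.0, "Size": 50, "Exchange": "Nasdaq"},
]


def filter_rows(rows, predicate):
    """Filtering: keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]


def select_columns(rows, columns):
    """Selection: keep only the named columns in every row."""
    return [{name: row[name] for name in columns} for row in rows]


# Fewer rows: one stock on one date.
filtered = filter_rows(
    trades, lambda r: r["Sym"] == "AAPL" and r["Date"] == "2017-08-25"
)

# Fewer columns: drop everything the analysis does not need.
reduced = select_columns(filtered, ["Sym", "Price", "Size"])
```

Filtering first and selecting second means the column reduction only touches the rows that survive the filter, which mirrors the goal of processing as little data as possible.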
Grouping, Aggregation and Sorting
Once you have reduced the volume of the dataset being processed, you can further manipulate the data using grouping and aggregation techniques, as well as sorting. See: Grouping, Aggregation and Sorting.
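As a rough illustration of what grouping, aggregating, and sorting mean, here is a small plain-Python sketch. It does not use Deephaven's own methods, which are documented in the linked section. The sample `trades` data and the function name `total_size_by_symbol` are assumptions made for this example.

```python
from collections import defaultdict

# Conceptual sketch only: plain Python, not the Deephaven API.
# Hypothetical sample data with illustrative column names.
trades = [
    {"Sym": "AAPL", "Size": 100},
    {"Sym": "MSFT", "Size": 250},
    {"Sym": "AAPL", "Size": 300},
    {"Sym": "GOOG", "Size": 50},
]


def total_size_by_symbol(rows):
    """Group rows by symbol and aggregate each group by summing Size."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Sym"]] += row["Size"]
    return dict(totals)


# Grouping and aggregation: one total per symbol.
totals = total_size_by_symbol(trades)

# Sorting: order the aggregated results from largest to smallest total.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Here `ranked` holds one `(symbol, total)` pair per stock, ordered by traded size.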
Joining Data from Multiple Tables
Data needed for a given analysis is often found in multiple datasets. For example, two (or more) tables may contain specific columns of information that are needed to perform an analysis. To combine all of the information into a single table, you need to "join" the data. The Join methods in Deephaven enable you to do just that. See: Joining Data from Multiple Tables.
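The idea of a join can be sketched in a few lines of plain Python. This is not how Deephaven implements its Join methods, and it does not use the Deephaven API. The two sample tables, the key column `Sym`, and the helper `left_join` are hypothetical.

```python
# Conceptual sketch only: plain Python, not the Deephaven API.
# Two hypothetical tables that share the key column "Sym".
trades = [
    {"Sym": "AAPL", "Price": 160.0},
    {"Sym": "MSFT", "Price": 73.0},
    {"Sym": "XYZ", "Price": 10.0},
]
companies = [
    {"Sym": "AAPL", "Sector": "Technology"},
    {"Sym": "MSFT", "Sector": "Technology"},
]


def left_join(left, right, key, columns):
    """Add the given columns from `right` to each row of `left`, matched on `key`.

    Rows of `left` with no match keep their data and get None for the added
    columns. If `right` repeats a key, the last matching row is used.
    """
    lookup = {row[key]: row for row in right}
    result = []
    for row in left:
        match = lookup.get(row[key])
        extra = {c: (match[c] if match is not None else None) for c in columns}
        result.append({**row, **extra})
    return result


# Combine price data and sector data into a single table.
joined = left_join(trades, companies, "Sym", ["Sector"])
```

Keeping unmatched rows (with an empty value) is only one possible design. Other kinds of join drop unmatched rows or handle multiple matches differently, so the right choice depends on the analysis.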
Plotting
Finally, once your data is in the form you need, you can create data visualizations by plotting it. Many chart options are available, including variations on lines, bars, columns, pies and histograms. See: Plotting.
Learn More
The following items can be used to learn more about writing queries in Deephaven:
- The Quick Start Guide is a very short exercise designed to familiarize new users with the Deephaven interface. It presents actual examples using the Deephaven query language to generate tables and plots, save queries, and build a customized workspace.
- The Deephaven Example Notebooks provide a wide variety of completed queries that can be used as "starters" for your own queries or as idea generators.
- Quick Reference Guide: Here you can find quick references to commonly used queries that can be used to build your own.
- Cheat Sheets: The Deephaven Query Cheat Sheet (pdf) and the Deephaven Plotting Cheat Sheet (pdf) can also be used to help you build queries and plots.
Last Updated: 16 February 2021 18:07 -04:00 UTC Deephaven v.1.20200928
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved