Writing Queries

Overview

Simply put, queries are sets of instructions that tell Deephaven what data you want to analyze and how to analyze it. They form the backbone of every analysis performed in Deephaven, and they generate all of the tables and charts used in your workspace.

The following is a brief overview of the steps and processes you can use to develop an analysis with queries. Each step is covered in more detail in its own section.

The basic steps for writing queries are as follows:

  • Source and Access the Dataset
  • Reduce the Data Volume
  • Grouping, Aggregation and Sorting
  • Joining Data from Multiple Tables
  • Plotting

Source and Access the Dataset

To perform any analysis, you first need data. Deephaven can work with live, ticking data as well as historical data. Data used by Deephaven is typically loaded on the server by an administrator in your enterprise. However, you can also load your own data. See: Importing Data.

Once you have located the dataset needed for your analyses, you need to tell Deephaven how to access that data. See: Accessing Data.

Reduce the Data Volume

Large datasets can contain an enormous volume of information. It is therefore very beneficial to reduce that volume so you are working only with the data needed for your specific analyses. Doing so makes your analyses run faster and use fewer computing resources. There are two primary methods for doing this.

  • Filtering reduces the number of rows in the dataset. For example, a table may contain millions of rows of information about the trades of thousands of different stocks over a period of time. Filtering the table to include only certain stocks, or only trades made during a given time period, can significantly reduce the quantity of data being processed, which in turn shortens processing time. See: Filtering.
  • Selection methods in Deephaven allow you to eliminate (and/or manipulate) columns of data in a table. For example, you may only need to use the data in 10 out of 25 columns in a table. By eliminating the extra columns, you are eliminating the burden of processing those other 15 columns of data. See: Selection.
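To make the two ideas concrete, here is a minimal conceptual sketch in plain Python. It does not use Deephaven's query methods (those are described in Filtering and Selection); the `trades` rows, the ticker symbols, the column names, and the helpers `filter_rows` and `select_columns` are all hypothetical and exist only for illustration.

```python
# Conceptual sketch in plain Python; this is NOT Deephaven's query API.
# The table contents, column names, and helper names are hypothetical.

trades = [
    {"Date": "2017-08-24", "Sym": "AAA", "Price": 159.0, "Size": 100, "Exchange": "X1"},
    {"Date": "2017-08-25", "Sym": "AAA", "Price": 160.0, "Size": 200, "Exchange": "X1"},
    {"Date": "2017-08-25", "Sym": "BBB", "Price": 915.0, "Size": 50, "Exchange": "X1"},
    {"Date": "2017-08-25", "Sym": "AAA", "Price": 161.0, "Size": 300, "Exchange": "X2"},
]


def filter_rows(table, predicate):
    """Filtering: keep only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]


def select_columns(table, columns):
    """Selection: keep only the named columns in every row."""
    return [{col: row[col] for col in columns} for row in table]


# Fewer rows: one stock on one date.
aaa_trades = filter_rows(
    trades, lambda r: r["Sym"] == "AAA" and r["Date"] == "2017-08-25"
)

# Fewer columns: drop everything the analysis does not need.
slim = select_columns(aaa_trades, ["Sym", "Price", "Size"])
```

Filtering first and selecting second means the column step only touches rows that survived the filter, which is the same reasoning behind reducing data volume early in a query.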

Grouping, Aggregation and Sorting

Now that you have reduced the volume of the dataset being processed, you can further manipulate the data using Grouping and Aggregation techniques, as well as Sorting. See: Grouping, Aggregation and Sorting.
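The relationship between the three operations can be sketched in plain Python. This is a conceptual illustration only and does not use Deephaven's own methods; the sample rows, the ticker symbols, and the `TotalSize` column name are hypothetical.

```python
# Conceptual sketch in plain Python; this is NOT Deephaven's query API.
# The sample rows and the "TotalSize" column name are hypothetical.
from collections import defaultdict

trades = [
    {"Sym": "AAA", "Size": 200},
    {"Sym": "BBB", "Size": 50},
    {"Sym": "AAA", "Size": 300},
    {"Sym": "CCC", "Size": 400},
]

# Grouping: collect the Size values that share the same Sym.
groups = defaultdict(list)
for row in trades:
    groups[row["Sym"]].append(row["Size"])

# Aggregation: collapse each group into a single summary row.
totals = [{"Sym": sym, "TotalSize": sum(sizes)} for sym, sizes in groups.items()]

# Sorting: order the summary rows from largest to smallest total.
ranked = sorted(totals, key=lambda r: r["TotalSize"], reverse=True)
```

Grouping decides which rows belong together, aggregation reduces each group to one value, and sorting only changes the order in which the resulting rows are presented.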

Joining Data from Multiple Tables

Data needed for a given analysis is often found in multiple datasets. For example, two (or more) tables may contain specific columns of information that are needed to perform an analysis. To combine all of the information into a single table, you need to "join" the data. The Join methods in Deephaven enable you to do just that. See: Joining Data from Multiple Tables.
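As a conceptual illustration, the plain-Python sketch below combines two small tables on a shared key column. It is not one of Deephaven's Join methods and makes no claim about their exact behavior; the tables, the `Sym` key, and the `Sector` column are hypothetical. Rows without a match receive `None`, which is one common convention among several.

```python
# Conceptual sketch in plain Python; this is NOT one of Deephaven's Join methods.
# Both tables, the "Sym" key, and the "Sector" column are hypothetical.

trades = [
    {"Sym": "AAA", "Price": 160.0},
    {"Sym": "BBB", "Price": 915.0},
    {"Sym": "ZZZ", "Price": 12.5},  # no matching row in `companies`
]

companies = [
    {"Sym": "AAA", "Sector": "Hardware"},
    {"Sym": "BBB", "Sector": "Software"},
]

# Index the right-hand table by its key so each lookup is a dictionary access.
sector_by_sym = {row["Sym"]: row["Sector"] for row in companies}

# Keep every row of `trades` and add the matching Sector (None when absent).
joined = [{**row, "Sector": sector_by_sym.get(row["Sym"])} for row in trades]
```

The result has one row per trade, with columns drawn from both tables, which is the single combined table the analysis needs.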

Plotting

Finally, once your data is arranged the way you want it, you can create data visualizations by plotting. Many chart options are available, including variations on lines, bars, columns, pies and histograms. See: Plotting.

Last Updated: 16 February 2021 18:07 -04:00 UTC    Deephaven v.1.20200928

Deephaven Documentation     Copyright 2016-2020  Deephaven Data Labs, LLC     All Rights Reserved