ErrorBar Plots
Error bar plotting is available for XY series plots and category plots. Error bar plots, which visually indicate the uncertainties in a dataset, are useful in determining whether the variability in your data is statistically significant. For instance, if your plot shows average values, a short error bar shows that the average value is more certain because the values are concentrated, while a long error bar shows a greater amount of uncertainty because the values are spread out, and thus less reliable. The error bars typically represent standard deviation or a particular confidence interval.
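To make the relationship between spread and bar length concrete, here is a minimal sketch in plain Python (not Deephaven code). It assumes the error bar spans one sample standard deviation on either side of the mean; the helper name `error_bar` and both datasets are hypothetical.

```python
import statistics


def error_bar(values):
    """Return (mean, low, high), where low and high sit one sample
    standard deviation below and above the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return mean, mean - std, mean + std


# Two hypothetical datasets with the same average but different spread.
concentrated = [9.9, 10.0, 10.1, 10.0]
spread_out = [5.0, 15.0, 8.0, 12.0]

for name, values in (("concentrated", concentrated), ("spread out", spread_out)):
    mean, low, high = error_bar(values)
    print(f"{name}: mean={mean:.2f}, error bar from {low:.2f} to {high:.2f}")
```

Both datasets share the same average, but the second produces a much longer bar, which is the visual cue described above.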
XY Series Plots with Error Bars
There are three methods available to create an XY series plot with error bars:
- errorBarX() creates an XY series plot with error bars drawn in the same direction as the X axis (horizontally).
- errorBarY() creates an XY series plot with error bars drawn in the same direction as the Y axis (vertically).
- errorBarXY() creates an XY series plot with error bars drawn both horizontally and vertically.
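The difference between the three directions can be sketched in plain Python (not Deephaven code). The sketch assumes that the low and high error values are absolute axis coordinates rather than offsets, which is consistent with the errorBarY example later on this page. The names `ErrorPoint` and `bar_segments`, and the direction labels "x", "y", and "xy", are hypothetical.

```python
from typing import List, NamedTuple, Tuple

Segment = Tuple[Tuple[float, float], Tuple[float, float]]


class ErrorPoint(NamedTuple):
    x: float
    y: float
    x_low: float
    x_high: float
    y_low: float
    y_high: float


def bar_segments(p: ErrorPoint, direction: str) -> List[Segment]:
    """Return the line segments covered by the error bars of one point.

    direction mirrors the three methods: "x" (horizontal), "y" (vertical),
    or "xy" (both).
    """
    if direction not in ("x", "y", "xy"):
        raise ValueError("direction must be 'x', 'y', or 'xy'")
    segments = []
    if "x" in direction:
        # Horizontal bar: X varies from low to high, Y stays at the point.
        segments.append(((p.x_low, p.y), (p.x_high, p.y)))
    if "y" in direction:
        # Vertical bar: Y varies from low to high, X stays at the point.
        segments.append(((p.x, p.y_low), (p.x, p.y_high)))
    return segments


point = ErrorPoint(x=2.0, y=5.0, x_low=1.5, x_high=2.5, y_low=4.0, y_high=6.5)
for direction in ("x", "y", "xy"):
    print(direction, bar_segments(point, direction))
```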
Data Sourcing
XY Series Plots with Error Bars can be created from data sourced from a table or an array. Please refer to each of the following options for further details and syntax.
errorBarX()
When creating an errorBarX plot using data sourced from a table, the following syntax can be used:
errorBarX("SeriesName", source, "x", "y", "xLow", "xHigh")
- errorBarX is the method used to create an errorBarX plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- source is the table that holds the data to be used for the plot.
- "x" is the name of the column of data to be used for the X value.
- "y" is the name of the column of data to be used for the Y value.
- "xLow" is the name of the column of data to be used for the low error value on the X axis.
- "xHigh" is the name of the column of data to be used for the high error value on the X axis.
When creating an errorBarX plot using data sourced from an array, the following syntax can be used:
errorBarX("SeriesName", [x], [y], [xLow], [xHigh])
- errorBarX is the method used to create an errorBarX plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- [x] is the array containing the data to be used for the X value.
- [y] is the array containing the data to be used for the Y value.
- [xLow] is the array containing the data to be used for the low error value on the X axis.
- [xHigh] is the array containing the data to be used for the high error value on the X axis.
errorBarY()
When creating an errorBarY plot using data sourced from a table, the following syntax can be used:
errorBarY("SeriesName", source, "x", "y", "yLow", "yHigh")
- errorBarY is the method used to create an errorBarY plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- source is the table that holds the data to be used for the plot.
- "x" is the name of the column of data to be used for the X value.
- "y" is the name of the column of data to be used for the Y value.
- "yLow" is the name of the column of data to be used for the low error value on the Y axis.
- "yHigh" is the name of the column of data to be used for the high error value on the Y axis.
When creating an errorBarY plot using data sourced from an array, the following syntax can be used:
errorBarY("SeriesName", [x], [y], [yLow], [yHigh])
- errorBarY is the method used to create an errorBarY plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- [x] is the array containing the data to be used for the X value.
- [y] is the array containing the data to be used for the Y value.
- [yLow] is the array containing the data to be used for the low error value on the Y axis.
- [yHigh] is the array containing the data to be used for the high error value on the Y axis.
errorBarXY()
When creating an errorBarXY plot using data sourced from a table, the following syntax can be used:
errorBarXY("SeriesName", source, "x", "xLow", "xHigh", "y", "yLow", "yHigh")
- errorBarXY is the method used to create an errorBarXY plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- source is the table that holds the data to be used for the plot.
- "x" is the name of the column of data to be used for the X value.
- "xLow" is the name of the column of data to be used for the low error value on the X axis.
- "xHigh" is the name of the column of data to be used for the high error value on the X axis.
- "y" is the name of the column of data to be used for the Y value.
- "yLow" is the name of the column of data to be used for the low error value on the Y axis.
- "yHigh" is the name of the column of data to be used for the high error value on the Y axis.
When creating an errorBarXY plot using data sourced from an array, the following syntax can be used:
errorBarXY("SeriesName", [x], [xLow], [xHigh], [y], [yLow], [yHigh])
- errorBarXY is the method used to create an errorBarXY plot.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- [x] is the array containing the data to be used for the X value.
- [xLow] is the array containing the data to be used for the low error value on the X axis.
- [xHigh] is the array containing the data to be used for the high error value on the X axis.
- [y] is the array containing the data to be used for the Y value.
- [yLow] is the array containing the data to be used for the low error value on the Y axis.
- [yHigh] is the array containing the data to be used for the high error value on the Y axis.
Example - errorBarY
The query in the example below plots the average price of Google (GOOG) trades for each 20-minute interval as a line graph, with vertical error bars extending one standard deviation above and below each average.
from deephaven import Plot, ComboAggregateFactory as caf
# source the data
t_EB = db.t("LearnDeephaven", "StockTrades")\
.where("Date = `2017-08-23`", "USym = `GOOG`")\
.updateView("TimeBin=upperBin(Timestamp, 20 * MINUTE)")\
.where("isBusinessTime(TimeBin)")
# calculate standard deviations for the upper and lower error values
t_EB_StdDev = t_EB.by(caf.AggCombo(caf.AggAvg("AvgPrice = Last"), caf.AggStd("StdPrice = Last")), "TimeBin")
# plot the data
ebY_Trades = Plot.errorBarY("Trades: GOOG", t_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice",
"AvgPriceHigh = AvgPrice + StdPrice"), "TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh")\
.show()
//source the data
t_EB = db.t("LearnDeephaven", "StockTrades")
.where("Date = `2017-08-23`", "USym = `GOOG`")
.updateView("TimeBin=upperBin(Timestamp, 20 * MINUTE)")
.where("isBusinessTime(TimeBin)")
//calculate standard deviations for the upper and lower error values
t_EB_StdDev = t_EB.by(AggCombo(AggAvg("AvgPrice = Last"), AggStd("StdPrice = Last")), "TimeBin")
//plot the data
ebY_Trades = errorBarY("Trades: GOOG", t_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice", "AvgPriceHigh = AvgPrice + StdPrice"), "TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh")
.show()
The first code block gathers the data for the plot, telling Deephaven to access the StockTrades table in the LearnDeephaven namespace, and then filter it to include only the data for August 23, 2017, and only when the USym is GOOG. The data is downsampled (binned) into 20-minute intervals, and is then filtered again to include only the data that occurs within business hours. That data is then saved to a new variable named t_EB.
The second block of code calculates, for each TimeBin, the average and the standard deviation of the values in the Last column. The results appear in new columns, AvgPrice and StdPrice respectively, in the new table called t_EB_StdDev.
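The binning and aggregation steps can be sketched in plain Python to show what the two code blocks compute. This is not Deephaven code, and several details are assumptions: the helper names `upper_bin` and `bin_stats`, the sample trades, the choice of sample (rather than population) standard deviation, the NaN returned for single-trade bins, and the handling of timestamps that fall exactly on a bin boundary.

```python
import statistics
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
BIN_WIDTH = timedelta(minutes=20)


def upper_bin(ts, width=BIN_WIDTH):
    """Round ts up to the end of its bin; a timestamp that sits exactly
    on a bin boundary is returned unchanged (assumed behavior)."""
    # Floor-dividing the negated offset yields an exact integer ceiling.
    bins = -((EPOCH - ts) // width)
    return EPOCH + bins * width


def bin_stats(trades):
    """Group (timestamp, price) pairs by bin and return {bin: (avg, std)}."""
    grouped = defaultdict(list)
    for ts, price in trades:
        grouped[upper_bin(ts)].append(price)
    stats = {}
    for time_bin, prices in sorted(grouped.items()):
        # A sample standard deviation needs at least two prices.
        std = statistics.stdev(prices) if len(prices) > 1 else float("nan")
        stats[time_bin] = (statistics.mean(prices), std)
    return stats


# Hypothetical trades: (timestamp, last price).
trades = [
    (datetime(2017, 8, 23, 13, 31, tzinfo=timezone.utc), 100.0),
    (datetime(2017, 8, 23, 13, 39, tzinfo=timezone.utc), 102.0),
    (datetime(2017, 8, 23, 13, 45, tzinfo=timezone.utc), 101.0),
    (datetime(2017, 8, 23, 13, 59, tzinfo=timezone.utc), 105.0),
]

for time_bin, (avg, std) in bin_stats(trades).items():
    # The low and high error values are the average minus and plus the deviation.
    print(time_bin.isoformat(), avg, avg - std, avg + std)
```

Each resulting row corresponds to one point on the plot: the bin is the X value, the average is the Y value, and the last two numbers are the low and high error values.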
The third code block tells Deephaven:
- To create an errorBarY plot named ebY_Trades.
- "Trades: GOOG" is the series name.
- The data needed for plotting is sourced from the t_EB_StdDev table. Two columns are added to that table using the update method:
  - AvgPriceLow is calculated by subtracting the standard deviation of the price from the average.
  - AvgPriceHigh is calculated by adding the standard deviation of the price to the average.
- Data from the TimeBin column is to be used for the X value.
- Data from the AvgPrice column is to be used as the Y value.
- Data from the AvgPriceLow column is to be used for the low error value on the Y axis.
- Data from the AvgPriceHigh column is to be used for the high error value on the Y axis.
- The .show method then tells Deephaven to present the plot in the Deephaven console.
The resulting plot is presented below.
Customize Error Bar Color
Users can choose a specific color for Error Bars using the errorBarColor method. In the previous examples, both the error bars and dataset are red. The following query produces green error bars:
from deephaven import Plot
# plot the data, reusing the t_EB_StdDev table built in the previous example
ebY_Trades = Plot.errorBarY("Trades: GOOG", t_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice",
"AvgPriceHigh = AvgPrice + StdPrice"), "TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh").errorBarColor("green").show()
//source the data
t_EB = db.t("LearnDeephaven", "StockTrades")
.where("Date = `2017-08-23`", "USym = `GOOG`")
.updateView("TimeBin=upperBin(Timestamp, 20 * MINUTE)")
.where("isBusinessTime(TimeBin)")
//calculate standard deviations for the upper and lower error values
t_EB_StdDev = t_EB.by(AggCombo(AggAvg("AvgPrice = Last"), AggStd("StdPrice = Last")), "TimeBin")
//plot the data
ebY_Trades = errorBarY("Trades: GOOG", t_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice", "AvgPriceHigh = AvgPrice + StdPrice"), "TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh")
.errorBarColor("green")
.show()
The resulting plot is shown below:
See also Assigning Colors in Deephaven.
Category Plots with Error Bars
catErrorBar() is the method used to create a category plot with error bars. The plot places discrete values on the X axis and numerical values on the Y axis. Error bars can only run vertically when using catErrorBar().
Data Sourcing
Category plots with error bars can be created from data sourced from a table or an array.
When data is sourced from a table, the following syntax can be used:
catErrorBar("SeriesName", source, "x", "y", "yLow", "yHigh")
- catErrorBar is the method used to create a category plot with vertical error bars.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- source is the table that holds the data you want to plot.
- "x" is the name of the column of data to be used for the X value.
- "y" is the name of the column of data to be used for the Y value.
- "yLow" is the name of the column of data to be used for the low error value.
- "yHigh" is the name of the column of data to be used for the high error value.
When data is sourced from an array, the following syntax can be used:
catErrorBar("SeriesName", [x], [y], [yLow], [yHigh])
- catErrorBar is the method used to create a category plot with vertical error bars.
- "SeriesName" is the name (as a string) you want to use to identify the series on the plot itself.
- [x] is the array containing the data to be used for the X value.
- [y] is the array containing the data to be used for the Y value.
- [yLow] is the array containing the data to be used for the low error value on the Y axis.
- [yHigh] is the array containing the data to be used for the high error value on the Y axis.
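As an illustration of the array form, the following plain-Python sketch (not Deephaven code) builds four parallel lists in the order the syntax lists them: categories, Y values, low error values, and high error values. The helper name `category_error_arrays` and the sample measurements are hypothetical, and using one sample standard deviation for the bar length is an assumption.

```python
import statistics


def category_error_arrays(samples):
    """Turn {category: [values]} into parallel lists:
    categories, y, y_low, y_high."""
    categories = list(samples)
    y = [statistics.mean(samples[c]) for c in categories]
    std = [statistics.stdev(samples[c]) for c in categories]
    y_low = [m - s for m, s in zip(y, std)]
    y_high = [m + s for m, s in zip(y, std)]
    return categories, y, y_low, y_high


# Hypothetical measurements per category.
samples = {
    "A": [4.0, 6.0],
    "B": [10.0, 14.0, 12.0],
}

categories, y, y_low, y_high = category_error_arrays(samples)
for row in zip(categories, y, y_low, y_high):
    print(row)
```

The X values here are discrete labels rather than numbers, which is what distinguishes a category plot from an XY series plot.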
Example: catErrorBar
The query in the example below plots the average price of Google (GOOG) trades for each 20-minute interval as a bar graph, with vertical error bars extending one standard deviation above and below each average.
from deephaven import Plot, ComboAggregateFactory as caf
# source the data
t_cat_EB = db.t("LearnDeephaven", "StockTrades")\
.where("Date = `2017-08-23`", "USym = `GOOG`")\
.updateView("TimeBin=upperBin(Timestamp, 20 * MINUTE)")\
.where("isBusinessTime(TimeBin)")
# calculate standard deviations for the upper and lower error values
t_cat_EB_StdDev = t_cat_EB.by(caf.AggCombo(caf.AggAvg("AvgPrice = Last"), caf.AggStd("StdPrice = Last")), "TimeBin")
# plot the data
eb_CatTrades = Plot.catErrorBar("Trades: GOOG", t_cat_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice",
"AvgPriceHigh = AvgPrice + StdPrice"),
"TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh")\
.show()
//source the data
t_cat_EB = db.t("LearnDeephaven", "StockTrades")
.where("Date = `2017-08-23`", "USym = `GOOG`")
.updateView("TimeBin=upperBin(Timestamp, 20 * MINUTE)")
.where("isBusinessTime(TimeBin)")
//calculate standard deviations for the upper and lower error values
t_cat_EB_StdDev = t_cat_EB.by(AggCombo(AggAvg("AvgPrice = Last"), AggStd("StdPrice = Last")), "TimeBin")
//plot the data
eb_CatTrades = catErrorBar("Trades: GOOG", t_cat_EB_StdDev.update("AvgPriceLow = AvgPrice - StdPrice",
"AvgPriceHigh = AvgPrice + StdPrice"),
"TimeBin", "AvgPrice", "AvgPriceLow", "AvgPriceHigh")
.show()
The first code block gathers the data for the plot, telling Deephaven to access the StockTrades table in the LearnDeephaven namespace, and then filter it to include only the data for August 23, 2017, and only when the USym is GOOG. The data is downsampled (binned) into 20-minute intervals, and is then filtered again to include only the data that occurs within business hours. That data is then saved to a new variable named t_cat_EB.
The second block of code calculates, for each TimeBin, the average and the standard deviation of the values in the Last column. The results appear in new columns, AvgPrice and StdPrice respectively, in the new table called t_cat_EB_StdDev.
The third code block tells Deephaven:
- To create a catErrorBar plot named eb_CatTrades.
- "Trades: GOOG" is the series name.
- The data needed for plotting is sourced from the t_cat_EB_StdDev table. Two columns are added to that table using the update method:
  - AvgPriceLow is calculated by subtracting the standard deviation of the price from the average.
  - AvgPriceHigh is calculated by adding the standard deviation of the price to the average.
- Data from the TimeBin column is to be used for the X value.
- Data from the AvgPrice column is to be used as the Y value.
- Data from the AvgPriceLow column is to be used for the low error value on the Y axis.
- Data from the AvgPriceHigh column is to be used for the high error value on the Y axis.
- The .show method then tells Deephaven to present the plot in the Deephaven console.
The resulting plot is presented below.
For additional formatting options, please refer to:
Last Updated: 16 February 2021 18:07 -04:00 UTC Deephaven v.1.20200928
Deephaven Documentation Copyright 2016-2020 Deephaven Data Labs, LLC All Rights Reserved