Integrating R with Deephaven

R is an open-source programming language and software environment that is commonly used for statistical analysis, graphical representation and reporting.

Before you can integrate R with Deephaven, install and run the Deephaven Launcher, create the Deephaven instance and workspace required by your enterprise, and then connect to the instance. Connecting to the instance causes the client to download resource files from the server; the R integration requires these files.

See: Installation Guide for Users for additional information.

Download the R setup notebook

Setup: R, rJava, and Auth Key

The following steps install R and integrate it with Deephaven. Note: The steps below show the default paths used on Windows-based PCs. Your paths may differ depending on your operating system, version numbers, and whether you chose non-default installation locations.

  1. [Windows-only] Add the JVM shared library path to the PATH environment variable.
     a. If you installed the version of the Deephaven Launcher that included the JDK, the complete path should be similar to the following:

C:\Users\<userName>\AppData\Local\Illumon\jdk\jre\bin\server\

     b. If you installed the JDK separately, the complete path should be similar to the following:

C:\Program Files\Java\jdk_version\jre\bin\server

  2. If R is not installed, install R. See: https://cran.cnr.berkeley.edu/
  3. If RStudio is not installed, consider installing RStudio. RStudio is not required, but it provides a convenient IDE. See: https://www.rstudio.com/products/rstudio/download/
  4. Start an R console through the standard R installation or through RStudio.
  5. Install rJava.
     a. If you installed the version of the Deephaven Launcher that included the JDK, run the following in the R console:

Sys.setenv(JAVA_HOME="C:\\Users\\<userName>\\AppData\\Local\\Illumon\\jdk\\")
install.packages("rJava")

     b. If you installed the JDK separately and your system does not have a default value for JAVA_HOME, run the following in the R console:

Sys.setenv(JAVA_HOME="C:\\Program Files\\Java\\jdk_version\\")
install.packages("rJava")

  6. Test the rJava installation by running the following in the R console:

library(rJava)
.jinit() # this starts the JVM
s <- .jnew("java/lang/String", "Hello World!")
print(s)

Note: You must restart the R session before integrating with Deephaven if you explicitly call .jinit(). The integration library takes care of JVM initialization for you.

  7. Set up Deephaven authorization keys on the Deephaven server by executing the following on the server:

/usr/illumon/latest/bin/generate_iris_keys <irisUserName>
cat pub-<irisUserName>.base64.txt >> /etc/sysconfig/illumon.d/resources/dsakeys.txt

  8. Copy the user's private auth key (priv-<irisUserName>.base64.txt) to the user's home directory.
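Before moving on, it can help to confirm the prerequisites from the R console. The following is a minimal sketch, not part of irisdb.R: the helper names jvm_server_dir and check_r_setup are hypothetical, and the jre\bin\server layout is assumed from the default Windows paths shown in step 1.

```r
# Hypothetical helper (not part of irisdb.R): build the JVM shared library
# directory from a JDK root, following the jre\bin\server layout in step 1.
jvm_server_dir <- function(java_home) {
  root <- sub("[/\\\\]+$", "", java_home)  # drop any trailing slash or backslash
  file.path(root, "jre", "bin", "server")
}

# Hypothetical helper: return a character vector describing missing prerequisites.
check_r_setup <- function(java_home = Sys.getenv("JAVA_HOME")) {
  problems <- character(0)
  if (!nzchar(java_home)) {
    problems <- c(problems, "JAVA_HOME is not set")
  } else if (!dir.exists(jvm_server_dir(java_home))) {
    problems <- c(problems,
                  paste("JVM directory not found:", jvm_server_dir(java_home)))
  }
  # system.file() returns "" when the package is not installed; it does not load rJava.
  if (!nzchar(system.file(package = "rJava"))) {
    problems <- c(problems, "rJava is not installed")
  }
  problems
}

print(check_r_setup())
```

An empty result suggests that JAVA_HOME and rJava are in place. It does not verify the auth key or the connection to the Deephaven instance.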

R API (irisdb.R)

Load the Deephaven (Iris) R library.

source("C:\\Users\\<userName>\\AppData\\Local\\Illumon\\<instance>\\integrations\\r\\irisdb.R")

Initialize the Deephaven integration.

idb.init(devroot, workspace, propfile, userHome, keyfile, librarypath, log4jconffile, workerHeapGB, jvmHeapGB, verbose, jvmArgs)

Note: Only the devroot, workspace, propfile, and keyfile arguments are required.

Execute Groovy code.

idb.execute('groovy')

Note: Because Deephaven queries make heavy use of double quotes, enclose the Groovy code you pass from R in single quotes. The double quotes inside the query then need no escaping.
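To illustrate the quoting note, the sketch below builds the same Groovy string twice, once with single quotes and once with escaped double quotes, and confirms they are identical. The query text is illustrative only, and no Deephaven session is needed to run it.

```r
# Single-quoted R string: the double quotes used by the query need no escaping.
groovy_single <- 't = emptyTable(10).update("X=i")'

# Double-quoted R string: every inner double quote must be escaped.
groovy_double <- "t = emptyTable(10).update(\"X=i\")"

# Both literals produce exactly the same text.
print(identical(groovy_single, groovy_double))
cat(groovy_single, "\n")

# With an initialized session, the string would be passed as:
# idb.execute(groovy_single)
```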

Execute Groovy code contained in a file.

idb.executeFile('filepath')

Get a variable from the Groovy shell.

idb.get('variable')

Get a variable from the Groovy shell as an R data frame. If the variable is a Deephaven table, it is converted to an R data frame.

idb.get.df('variable')

Get a Deephaven database object. (Because the syntax is more cumbersome, use this only if you have a good reason not to use the Groovy functionality.)

idb.db()

Create the table variable "name" in the Groovy shell with the data in the data frame df.

idb.push.df('name', df)

Examples

Download the R example notebook

Initialize Deephaven

# Set the environment variable for the JDK
Sys.setenv(JAVA_HOME="C:\\Users\\<userName>\\AppData\\Local\\Illumon\\jdk\\")

# If the JDK was installed separately from the Deephaven Launcher, use this instead:
# Sys.setenv(JAVA_HOME="C:\\Program Files\\Java\\jdk_version\\")

# Install rJava only if it is not already installed
if (!requireNamespace("rJava", quietly = TRUE)) install.packages("rJava")
library(rJava)

# devroot requires a trailing slash
devroot <- "C:\\Users\\<userName>\\AppData\\Local\\Illumon\\<instance>\\"
workspace <- "C:\\Users\\<userName>\\Documents\\Iris\\<instance>\\workspaces\\<workspace>\\"
propfile <- "iris-common.prop"
workerHeapGB <- 2
jvmHeapGB <- 2
keyfile <- "C:\\Users\\<userName>\\priv-<irisUserName>.base64.txt"
verbose <- TRUE

# Load the integration library, then initialize it
source("C:\\Users\\<userName>\\AppData\\Local\\Illumon\\<instance>\\integrations\\r\\irisdb.R")
idb.init(devroot, workspace, propfile, workerHeapGB=workerHeapGB, jvmHeapGB=jvmHeapGB, keyfile=keyfile, verbose=verbose)

Note: The appropriate file paths are also shown in the bottom right corner of the Launcher. You can also supply the devroot, workspace, and propfile values for idb.init through environment variables rather than string arguments. For example:

Sys.setenv(ILLUMON_DEVROOT = "C:\\Users\\<userName>\\AppData\\Local\\Illumon\\<instance>\\")
Sys.setenv(ILLUMON_WORKSPACE = "C:\\Users\\<userName>\\Documents\\Iris\\<instance>\\workspaces\\<workspace>\\")
Sys.setenv(ILLUMON_PROPFILE = "iris-common.prop")
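The initialization example states that devroot requires a trailing slash. A small guard such as the one below can normalize a path before it is passed to idb.init or stored in ILLUMON_DEVROOT. This is a sketch only: ensure_trailing_slash is a hypothetical helper, not part of irisdb.R, and the Windows backslash separator is an assumption taken from the paths above.

```r
# Hypothetical helper: append a backslash unless the path already ends in a
# slash or backslash.
ensure_trailing_slash <- function(path) {
  if (grepl("[/\\\\]$", path)) path else paste0(path, "\\")
}

# Placeholder path from the examples above, written without the trailing slash.
devroot <- ensure_trailing_slash("C:\\Users\\<userName>\\AppData\\Local\\Illumon\\<instance>")
cat(devroot, "\n")
```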

Execute Simple Groovy Commands

# Set and get a variable
idb.execute('b=4')
print(idb.get('b'))

Execute a Groovy File

# Execute commands from a file
idb.executeFile('test.groovy')
print(idb.get('c'))
print(idb.get('d'))
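The example above assumes that test.groovy exists and defines the variables c and d. The original file is not shown, so the contents below are hypothetical. The sketch only shows one way to create such a script from R, written to a temporary directory.

```r
# Hypothetical script contents; the original test.groovy is not shown.
script_path <- file.path(tempdir(), "test.groovy")
writeLines(c("c = 1 + 2", "d = c * 2"), script_path)

# Confirm what was written.
print(readLines(script_path))

# With an initialized session, the script would be run as:
# idb.executeFile(script_path)
```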

Execute a Groovy Query to get an R Data Frame

idb.execute('
   t1 = emptyTable(100).update("Type= i%2==0 ? `A` : `B`","X=i","Y=X*X");
   t2 = t1.sumBy("Type")
    ')
t1df <- idb.get.df('t1')
t2df <- idb.get.df('t2')
show(t2df)
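To sanity-check the data frames returned by the query above, the same computation can be reproduced locally in base R. This sketch assumes that the Deephaven column variable i is the zero-based row index, so emptyTable(100) yields i = 0..99. The names t1_local and t2_local are illustrative.

```r
# Local stand-in for t1: assumes i is the zero-based row index (0..99).
i <- 0:99
t1_local <- data.frame(
  Type = ifelse(i %% 2 == 0, "A", "B"),
  X = i,
  Y = i * i,
  stringsAsFactors = FALSE
)

# Local stand-in for t2 = t1.sumBy("Type"): sum X and Y within each Type.
t2_local <- aggregate(cbind(X, Y) ~ Type, data = t1_local, FUN = sum)
print(t2_local)
```

Column types and ordering in the converted data frame may differ from this local version, so compare values rather than structure.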

Execute a Query Without Using Groovy

The Groovy interface is simpler to use. Query the database object directly only if you have a good reason not to use Groovy.

# Execute a non-groovy query
t <- tryCatch( idb.db()$getTable("SystemEQ","Trades"), Exception = function(e){e$jobj$printStackTrace()} )
t2 <- t$where(.jarray('Date=`2014-04-11`'))

print(t2$size())
print(t2$getColumn("Price")$getDouble(2L))

tt <- t$selectDistinct(.jarray("Date"))$sortDescending(.jarray("Date"))
print(tt$getColumn('Date')$get(0L))

Print Information on Available Java Methods

# print class info
.jmethods('com.illumon.iris.db.tables.Table')
.jmethods('com.illumon.integrations.common.IrisIntegrationGroovySession')

Print Java Classpath

print(.jclassPath())

Export Data Frames from R and Import to Deephaven

To export the R data frame df as an in-memory Deephaven Groovy table named myRTable, use the following:

idb.push.df('myRTable', df)

The in-memory Deephaven Groovy table is not permanently stored and cannot be accessed outside of the process in which it was created. You can use myRTable in R as you would any other Deephaven Groovy variable in the R API. For example:

idb.execute('t = myRTable.updateView("n2 = n*n")')
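The updateView call above refers to a column named n, so the pushed data frame must contain that column. The following sketch builds a small data frame of that shape and computes the same derived column locally. The values are illustrative, and the Deephaven calls are left as comments because they require an initialized session.

```r
# Illustrative data frame with the column n used by the query above.
df <- data.frame(n = 1:5)

# Local equivalent of updateView("n2 = n*n"), for comparison.
expected <- transform(df, n2 = n * n)
print(expected)

# With an initialized session:
# idb.push.df('myRTable', df)
# idb.execute('t = myRTable.updateView("n2 = n*n")')
# result <- idb.get.df('t')
```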

Like any other Deephaven table, myRTable can be saved for later use by using the following:

db.addTable(namespace,tablename,table)

The example below saves myRTable as MyDeephavenTableFromR in the MyNameSpace namespace.

idb.execute('db.addTable("MyNameSpace","MyDeephavenTableFromR",myRTable)')

To load the saved table from a Deephaven console, use:

table = db.t("MyNameSpace","MyDeephavenTableFromR")

Best Practices

  • Do as much work as possible in Deephaven. Use R for the final analysis of a small, distilled data set.
  • Be conscious of the size of tables converted to in-memory R tables. The R session must have enough RAM to store the table.
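As a rough guide to the second point, the memory needed for numeric data can be estimated before a table is converted. The sketch below is an approximation under stated assumptions: estimate_numeric_bytes is a hypothetical helper, it counts only double columns at 8 bytes per value, and it ignores the overhead of strings, factors, and row names.

```r
# Hypothetical helper: rough lower bound on the memory needed for a table of
# numeric (double) columns, at 8 bytes per value.
estimate_numeric_bytes <- function(n_rows, n_cols) {
  n_rows * n_cols * 8
}

# Measure a small sample data frame and compare it with the estimate.
sample_df <- data.frame(x = runif(1000), y = runif(1000))
print(object.size(sample_df), units = "Kb")
print(estimate_numeric_bytes(nrow(sample_df), ncol(sample_df)))
```

If the estimate for a full table approaches the RAM available to the R session, reduce the table in Deephaven first, for example by filtering or aggregating it.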