Class TableManagementTools

java.lang.Object
com.illumon.iris.db.tables.utils.TableManagementTools

public class TableManagementTools extends Object
Tools for managing and manipulating tables on disk. Most users will need TableTools and not TableManagementTools.
  • Method Details

    • readTable

      public static Table readTable(@NotNull File path) throws TableDataException
      Reads in a table from disk. This method will attempt to determine the table type. If the type is known ahead of time, it is more efficient to invoke either readTable(File, StorageFormat, SourceTableInstructions) or readTable(File, TableDefinition).
      Parameters:
      path - the path to the table on disk.
      Returns:
      the table read at the location.
      Throws:
      TableDataException - if the table is in Deephaven format and could not be loaded
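      For example, a minimal sketch (the path is hypothetical, and the usual imports such as java.io.File, Table, and TableManagementTools are assumed to be in place):

        // The on-disk format is detected automatically from what is found at the path.
        Table result = TableManagementTools.readTable(new File("/tmp/example/MyTable"));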
    • readTable

      public static Table readTable(@NotNull File path, @NotNull Database.StorageFormat formatHint, @NotNull SourceTableInstructions sourceTableInstructions) throws TableDataException
      Read a table from disk with the specified format.
      Parameters:
      path - the path to the table
      formatHint - the expected format on disk
      sourceTableInstructions - SourceTableInstructions for column and region creation
      Returns:
      the table read from disk.
      Throws:
      TableDataException - if the table could not be loaded
    • readTable

      public static Table readTable(@NotNull File path, @NotNull TableDefinition tableDefinition)
      Reads in a table from disk.
      Parameters:
      path - table location
      tableDefinition - table definition
      Returns:
      table
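      For example, a sketch that reuses the definition of an already-loaded table (existingTable and the path are hypothetical; imports assumed):

        // Supplying a known definition skips type detection entirely.
        TableDefinition knownDefinition = existingTable.getDefinition();
        Table result = TableManagementTools.readTable(new File("/tmp/example/MyTable"), knownDefinition);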
    • readTable

      public static Table readTable(@NotNull File path, @NotNull TableDefinition tableDefinition, @NotNull SourceTableInstructions instructions)
      Reads in a table from disk.
      Parameters:
      path - table location
      tableDefinition - table definition
      instructions - SourceTableInstructions for column and region creation
      Returns:
      table
    • writeTable

      public static void writeTable(@NotNull Table sourceTable, @NotNull String destDir)
      Write out a table to disk.
      Parameters:
      sourceTable - source table
      destDir - destination
    • writeTable

      public static void writeTable(@NotNull Table sourceTable, @NotNull String destDir, @NotNull Database.StorageFormat storageFormat)
      Write out a table to disk.
      Parameters:
      sourceTable - source table
      destDir - destination
      storageFormat - Format used for storage
    • writeTable

      public static void writeTable(@NotNull Table sourceTable, @NotNull TableDefinition definition, @NotNull File destDir, @NotNull Database.StorageFormat storageFormat)
      Write out a table to disk.
      Parameters:
      sourceTable - source table
      definition - table definition. Will be written to disk as given.
      destDir - destination
      storageFormat - Format used for storage
    • writeTable

      public static void writeTable(@NotNull Table sourceTable, @NotNull File destDir)
      Write out a table to disk.
      Parameters:
      sourceTable - source table
      destDir - destination
    • writeTable

      public static void writeTable(@NotNull Table sourceTable, @NotNull File destDir, @NotNull Database.StorageFormat storageFormat)
      Write out a table to disk.
      Parameters:
      sourceTable - source table
      destDir - destination
      storageFormat - Format used for storage
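      For example, a sketch that writes a small in-memory table in a specific format (the destination path is hypothetical, and the Parquet constant is assumed to be a member of Database.StorageFormat; imports assumed):

        // Build a small table and write it out in Parquet format.
        Table source = TableTools.emptyTable(100).update("X = i", "Y = i * 2");
        TableManagementTools.writeTable(source, new File("/tmp/example/MyTable"), Database.StorageFormat.Parquet);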
    • writeParquetTables

      public static void writeParquetTables(@NotNull Table[] sources, @NotNull TableDefinition tableDefinition, @Nullable org.apache.parquet.hadoop.metadata.CompressionCodecName codecName, @NotNull File[] destinations, @NotNull String[] groupingColumns)
      Writes tables to disk in parquet format under the given destinations. If you specify grouping columns, there must already be grouping information for those columns in the sources. This can be accomplished with .by(<grouping columns>).ungroup() or .sort(<grouping column>).
      Parameters:
      sources - The tables to write
      tableDefinition - The common schema for all the tables to write
      codecName - Compression codec to use. The only supported codecs are CompressionCodecName.SNAPPY and CompressionCodecName.UNCOMPRESSED.
      destinations - The destination paths. If the parquet extension is missing, the default parquet file name is used (table.parquet)
      groupingColumns - List of columns the tables are grouped by (the write operation will store the grouping info)
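      For example, a sketch that writes two tables sharing one definition, with no grouping columns (paths hypothetical; imports assumed):

        Table t1 = TableTools.emptyTable(10).update("Sym = `A`", "Price = i * 1.5");
        Table t2 = TableTools.emptyTable(10).update("Sym = `B`", "Price = i * 2.5");
        // Both tables share t1's definition; with no grouping columns, no grouping info is required in the sources.
        TableManagementTools.writeParquetTables(
                new Table[]{t1, t2},
                t1.getDefinition(),
                org.apache.parquet.hadoop.metadata.CompressionCodecName.SNAPPY,
                new File[]{new File("/tmp/example/t1.parquet"), new File("/tmp/example/t2.parquet")},
                new String[0]);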
    • writeTables

      public static void writeTables(@NotNull Table[] sources, @NotNull TableDefinition tableDefinition, @NotNull File[] destinations)
      Write out tables to disk.
      Parameters:
      sources - source tables
      tableDefinition - table definition
      destinations - destinations
    • writeTables

      public static void writeTables(@NotNull Table[] sources, @NotNull TableDefinition tableDefinition, @NotNull File[] destinations, @Nullable Database.StorageFormat storageFormat)
      Write out tables to disk.
      Parameters:
      sources - source tables
      tableDefinition - table definition
      destinations - destinations
      storageFormat - Format used for storage
    • writeDeephavenTables

      public static void writeDeephavenTables(@NotNull Table[] sources, @NotNull TableDefinition tableDefinition, @NotNull File[] destinations)
      Write out tables to disk in the Deephaven format. All tables are assumed to share the given TableDefinition, which will be written to all locations as passed in.
      Parameters:
      sources - source tables
      tableDefinition - table definition
      destinations - destinations
    • addGroupingMetadata

      public static void addGroupingMetadata(@NotNull File tableDirectory)
      Add grouping metadata to a table on disk.
      Parameters:
      tableDirectory - table directory
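      For example (the directory is hypothetical; imports assumed):

        // In the one-argument form the table's definition is read from the directory itself.
        TableManagementTools.addGroupingMetadata(new File("/tmp/example/MyTable"));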
    • addGroupingMetadata

      public static void addGroupingMetadata(@NotNull File tableDirectory, @NotNull TableDefinition tableDefinition)
      Add grouping metadata to a table on disk.
      Parameters:
      tableDirectory - table directory
      tableDefinition - table definition
    • deleteTable

      public static void deleteTable(@NotNull File path)
      Deletes a table on disk.
      Parameters:
      path - path to delete
    • appendToTable

      public static void appendToTable(@NotNull Table tableToAppend, @NotNull String destDir)
      Appends to an existing table on disk, or writes a new table if the target table does not exist.
      Parameters:
      tableToAppend - table to append
      destDir - destination
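      For example, a sketch that appends freshly generated rows to a hypothetical destination (imports assumed):

        Table newRows = TableTools.emptyTable(5).update("X = i");
        // Creates the destination table if it does not exist, otherwise appends to it.
        TableManagementTools.appendToTable(newRows, "/tmp/example/MyTable");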
    • appendToTables

      public static void appendToTables(@NotNull TableDefinition definitionToAppend, @NotNull Table[] tablesToAppend, @NotNull String[] destinationDirectoryNames)
      Appends to existing tables on disk, or writes a new table if the target table does not exist.
      Parameters:
      definitionToAppend - table definition
      tablesToAppend - tables to append
      destinationDirectoryNames - destination directories
    • flushColumnData

      public static void flushColumnData()
      Flush all previously written column data to disk.
    • getAllDbDirs

      public static List<File> getAllDbDirs(@NotNull String tableName, @NotNull File rootDir, int levelsDepth)
      Gets all sub-table directories.
      Parameters:
      tableName - table name
      rootDir - root directory where tables are found
      levelsDepth - levels below rootDir where table directories are found
      Returns:
      all sub-table directories for tableName
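      For example (the root directory and depth are hypothetical; imports assumed):

        // Collect every directory for "MyTable" located two levels below the root.
        List<File> tableDirs = TableManagementTools.getAllDbDirs("MyTable", new File("/tmp/exampleRoot"), 2);
        tableDirs.forEach(dir -> System.out.println(dir.getAbsolutePath()));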
    • dropColumns

      public static TableDefinition dropColumns(@NotNull TableDefinition currentDefinition, @NotNull File rootDir, int levels, String... columnsToRemove)
      Removes columns from a table definition and persists the result under rootDir, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      rootDir - root directory where tables are found.
      levels - levels below rootDir where table directories are found.
      columnsToRemove - columns to remove.
      Returns:
      new table definition if successful; otherwise null.
    • dropColumns

      public static TableDefinition dropColumns(@NotNull TableDefinition currentDefinition, @NotNull File path, String... columnsToRemove)
      Removes columns from a table definition and persists the result in path, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      path - path of the table receiving the definition.
      columnsToRemove - columns to remove.
      Returns:
      new table definition.
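      For example, a sketch that drops one column from a single table location (the path and column name are hypothetical; imports assumed):

        TableDefinition current = TableManagementTools.readTable(new File("/tmp/example/MyTable")).getDefinition();
        // Removes "ObsoleteColumn" and persists the updated definition at the same path.
        TableDefinition updated = TableManagementTools.dropColumns(current, new File("/tmp/example/MyTable"), "ObsoleteColumn");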
    • renameColumns

      public static TableDefinition renameColumns(@NotNull TableDefinition currentDefinition, @NotNull File rootDir, int levels, String... columnsToRename) throws IOException
      Renames columns in a table definition and persists the result under rootDir, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      rootDir - root directory where tables are found.
      levels - levels below rootDir where table directories are found.
      columnsToRename - columns to rename.
      Returns:
      new table definition.
      Throws:
      IOException
    • renameColumns

      public static TableDefinition renameColumns(@NotNull TableDefinition currentDefinition, @NotNull File path, String... columnsToRename) throws IOException
      Renames columns in a table definition and persists the result in path, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      path - path of the table receiving the definition.
      columnsToRename - columns to rename.
      Returns:
      new table definition.
      Throws:
      IOException
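      For example, a sketch for a single table location (the path and column names are hypothetical, the "NewName=OldName" pair syntax is assumed to mirror Table.renameColumns, and the calling code is assumed to handle or declare IOException; imports assumed):

        TableDefinition current = TableManagementTools.readTable(new File("/tmp/example/MyTable")).getDefinition();
        // Rename the existing "Px" column to "Price" and persist the updated definition.
        TableDefinition renamed = TableManagementTools.renameColumns(current, new File("/tmp/example/MyTable"), "Price=Px");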
    • renameColumns

      public static TableDefinition renameColumns(@NotNull TableDefinition currentDefinition, @NotNull File path, MatchPair... columnsToRename) throws IOException
      Renames columns in a table definition and persists the result in path, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      path - path of the table receiving the definition.
      columnsToRename - columns to rename.
      Returns:
      new table definition.
      Throws:
      IOException
    • updateColumns

      public static TableDefinition updateColumns(@NotNull TableDefinition currentDefinition, @NotNull File rootDir, int levels, String... updates) throws IOException
      Updates columns in a table definition and persists the result under rootDir, potentially updating multiple persisted tables.
      Parameters:
      currentDefinition - initial table definition.
      rootDir - root directory where tables are found.
      levels - levels below rootDir where table directories are found.
      updates - columns to update.
      Returns:
      new table definition.
      Throws:
      IOException
    • addColumns

      public static TableDefinition addColumns(@NotNull TableDefinition currentDefinition, @NotNull File rootDir, int levels, String... columnsToAdd) throws IOException
      Adds new columns to a table definition and persists the result under rootDir. If there is an exception, the current definition is persisted.
      Parameters:
      currentDefinition - initial table definition.
      rootDir - root directory where tables are found.
      levels - levels below rootDir where table directories are found.
      columnsToAdd - columns to add.
      Returns:
      new table definition.
      Throws:
      IOException
    • addColumns

      public static TableDefinition addColumns(@Nullable TableDefinition currentDefinition, @NotNull File path, String... columnsToAdd) throws IOException
      Adds new columns to a table definition and persists the result in path. If there is an exception, the current definition is persisted.
      Parameters:
      currentDefinition - initial table definition. If null, the definition from the table loaded from path is used.
      path - path of the table containing the columns to add.
      columnsToAdd - columns to add.
      Returns:
      new table definition if successful; otherwise null.
      Throws:
      IOException
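      For example, a sketch for a single table location (the path is hypothetical, the column-specification string is an assumption about the expected syntax, and the calling code is assumed to handle or declare IOException; imports assumed):

        // Passing null uses the definition already stored at the path, as described above.
        TableDefinition updated = TableManagementTools.addColumns(null, new File("/tmp/example/MyTable"), "NewColumn int");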
    • readDataIndexTable

      @Nullable public static Table readDataIndexTable(@NotNull File tablePath, @NotNull String... columnNames)
      Read a grouping table written by writeDataIndexTable(File, Table, String, String...).
      Parameters:
      tablePath - the path to the source table
      columnNames - the columns to locate groupings for
      Returns:
      a Table containing the groupings or null if it does not exist
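      For example (the path and column name are hypothetical; imports assumed):

        // Returns null when no data index exists for the requested columns.
        Table index = TableManagementTools.readDataIndexTable(new File("/tmp/example/MyTable"), "Sym");
        if (index != null) {
            TableTools.show(index);
        }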
    • writeDataIndexTable

      public static void writeDataIndexTable(@NotNull File destinationDir, @NotNull Table indexTable, @NotNull String indexColumnName, @NotNull String... keyColumnNames)
      Write out the Data Index table for the specified columns. This will place the Data Index in a table adjacent to the data table in a directory titled "Index-<Column names>".
      Parameters:
      destinationDir - the destination for the source table
      indexTable - the table containing the index
      indexColumnName - the name of the Index column
      keyColumnNames - the ordered names of key columns
    • getDataIndexFile

      public static File getDataIndexFile(@NotNull File destinationDir, String... columnNames)
      Get the directory path for the data index for the specified columns. The column names will always be sorted first.
      Parameters:
      destinationDir - the base path
      columnNames - the columns indexed
      Returns:
      the directory where the specified data index should go.
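      For example (the base path and column names are hypothetical; imports assumed):

        // Column names are sorted internally, so argument order does not affect the resulting directory.
        File indexDir = TableManagementTools.getDataIndexFile(new File("/tmp/example/MyTable"), "Sym", "Exchange");
        System.out.println(indexDir);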