Class ParquetInstructions

java.lang.Object
io.deephaven.parquet.table.ParquetInstructions
All Implemented Interfaces:
ColumnToCodecMappings

public abstract class ParquetInstructions extends Object implements ColumnToCodecMappings
This class provides instructions for Parquet read and write operations, which accept it as an optional argument, specifying the desired transformations. Examples include mapping column names and using specific codecs during (de)serialization.
  • Field Details

    • MIN_TARGET_PAGE_SIZE

      public static final int MIN_TARGET_PAGE_SIZE
    • EMPTY

      public static final ParquetInstructions EMPTY
  • Constructor Details

    • ParquetInstructions

      public ParquetInstructions()
  • Method Details

    • setDefaultCompressionCodecName

      @Deprecated public static void setDefaultCompressionCodecName(String name)
      Set the default for getCompressionCodecName().
      Parameters:
      name - The new default
    • getDefaultCompressionCodecName

      public static String getDefaultCompressionCodecName()
      Returns:
      The default for getCompressionCodecName()
    • setDefaultMaximumDictionaryKeys

      public static void setDefaultMaximumDictionaryKeys(int maximumDictionaryKeys)
      Set the default for getMaximumDictionaryKeys().
      Parameters:
      maximumDictionaryKeys - The new default
    • getDefaultMaximumDictionaryKeys

      public static int getDefaultMaximumDictionaryKeys()
      Returns:
      The default for getMaximumDictionaryKeys()
    • setDefaultMaximumDictionarySize

      public static void setDefaultMaximumDictionarySize(int maximumDictionarySize)
      Set the default for getMaximumDictionarySize().
      Parameters:
      maximumDictionarySize - The new default
    • getDefaltMaximumDictionarySize

      public static int getDefaltMaximumDictionarySize()
      Returns:
      The default for getMaximumDictionarySize()
    • setDefaultTargetPageSize

      public static void setDefaultTargetPageSize(int newDefaultSizeBytes)
      Set the default target page size (in bytes) used to divide rows of data into pages during column writing. This value should be no smaller than MIN_TARGET_PAGE_SIZE.
      Parameters:
      newDefaultSizeBytes - the new default target page size.
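The lower bound on the default target page size can be pictured with a small stand-in class. This is only a sketch: `PageSizeDefaults` is a hypothetical class, not the Deephaven one, the values 1024 and 64 KiB are made up for illustration, and rejecting a too-small value with an `IllegalArgumentException` is an assumption rather than documented behavior.

```java
// Hypothetical stand-in, NOT the Deephaven class: a static default guarded by a minimum.
final class PageSizeDefaults {
    // Assumed value for illustration only; the real MIN_TARGET_PAGE_SIZE is not stated here.
    static final int MIN_TARGET_PAGE_SIZE = 1024;

    // Assumed initial default (64 KiB), chosen only for this sketch.
    private static volatile int defaultTargetPageSize = 64 * 1024;

    private PageSizeDefaults() {}

    // Assumption: values below the minimum are rejected rather than silently clamped.
    static void setDefaultTargetPageSize(final int newDefaultSizeBytes) {
        if (newDefaultSizeBytes < MIN_TARGET_PAGE_SIZE) {
            throw new IllegalArgumentException(
                    "Target page size must be at least " + MIN_TARGET_PAGE_SIZE + " bytes");
        }
        defaultTargetPageSize = newDefaultSizeBytes;
    }

    static int getDefaultTargetPageSize() {
        return defaultTargetPageSize;
    }

    public static void main(String[] args) {
        setDefaultTargetPageSize(128 * 1024);
        System.out.println(getDefaultTargetPageSize());
        try {
            setDefaultTargetPageSize(10); // below the assumed minimum
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch a rejected call leaves the previous default in place, so later writers keep using the last valid value.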
    • getDefaultTargetPageSize

      public static int getDefaultTargetPageSize()
      Get the current default target page size in bytes.
      Returns:
      the current default target page size in bytes.
    • getColumnNameFromParquetColumnNameOrDefault

      public final String getColumnNameFromParquetColumnNameOrDefault(String parquetColumnName)
    • getParquetColumnNameFromColumnNameOrDefault

      public abstract String getParquetColumnNameFromColumnNameOrDefault(String columnName)
    • getColumnNameFromParquetColumnName

      public abstract String getColumnNameFromParquetColumnName(String parquetColumnName)
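The three name-mapping methods above carry no description, so the following is only a guess at what the names suggest. It assumes that the plain lookup returns null when no mapping exists and that the "OrDefault" variants fall back to the name that was passed in; neither behavior is stated here. `ColumnNameMapping` and `addMapping` are hypothetical names used for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the Deephaven implementation.
final class ColumnNameMapping {
    private final Map<String, String> columnToParquet = new HashMap<>();
    private final Map<String, String> parquetToColumn = new HashMap<>();

    // Hypothetical helper: registers a two-way mapping between the names.
    ColumnNameMapping addMapping(final String parquetColumnName, final String columnName) {
        parquetToColumn.put(parquetColumnName, columnName);
        columnToParquet.put(columnName, parquetColumnName);
        return this;
    }

    // Assumption: returns null when the Parquet column has no mapping.
    String getColumnNameFromParquetColumnName(final String parquetColumnName) {
        return parquetToColumn.get(parquetColumnName);
    }

    // Assumption: the "default" is the Parquet column name itself.
    String getColumnNameFromParquetColumnNameOrDefault(final String parquetColumnName) {
        final String mapped = getColumnNameFromParquetColumnName(parquetColumnName);
        return mapped != null ? mapped : parquetColumnName;
    }

    // Assumption: the "default" is the table column name itself.
    String getParquetColumnNameFromColumnNameOrDefault(final String columnName) {
        return columnToParquet.getOrDefault(columnName, columnName);
    }

    public static void main(String[] args) {
        final ColumnNameMapping mapping = new ColumnNameMapping().addMapping("user-id", "UserId");
        System.out.println(mapping.getColumnNameFromParquetColumnNameOrDefault("user-id"));
        System.out.println(mapping.getColumnNameFromParquetColumnNameOrDefault("price"));
        System.out.println(mapping.getParquetColumnNameFromColumnNameOrDefault("UserId"));
    }
}
```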
    • getCodecName

      public abstract String getCodecName(String columnName)
      Specified by:
      getCodecName in interface ColumnToCodecMappings
    • getCodecArgs

      public abstract String getCodecArgs(String columnName)
      Specified by:
      getCodecArgs in interface ColumnToCodecMappings
    • useDictionary

      public abstract boolean useDictionary(String columnName)
      Returns:
      A hint that the writer should use dictionary-based encoding when writing this column; never evaluated for non-String columns; defaults to false
    • getSpecialInstructions

      public abstract Object getSpecialInstructions()
    • getCompressionCodecName

      public abstract String getCompressionCodecName()
    • getMaximumDictionaryKeys

      public abstract int getMaximumDictionaryKeys()
      Returns:
      The maximum number of unique keys the writer should add to a dictionary page before switching to non-dictionary encoding; never evaluated for non-String columns, ignored if useDictionary(String) is set for the column
    • getMaximumDictionarySize

      public abstract int getMaximumDictionarySize()
      Returns:
      The maximum number of bytes the writer should add to a dictionary before switching to non-dictionary encoding; never evaluated for non-String columns, ignored if useDictionary(String) is set for the column
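The two dictionary limits can be read as thresholds at which a writer gives up on dictionary encoding for a String column. The sketch below illustrates that idea only: `DictionaryLimitSketch` and `tryAdd` are hypothetical names, measuring size as UTF-8 bytes is an assumption, and the real writer's bookkeeping may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in showing a key-count and byte-size fallback; NOT the Deephaven writer.
final class DictionaryLimitSketch {
    private final int maximumDictionaryKeys;
    private final int maximumDictionarySize;
    private final Map<String, Integer> dictionary = new LinkedHashMap<>();
    private int dictionaryBytes;
    private boolean dictionaryAbandoned;

    DictionaryLimitSketch(final int maximumDictionaryKeys, final int maximumDictionarySize) {
        this.maximumDictionaryKeys = maximumDictionaryKeys;
        this.maximumDictionarySize = maximumDictionarySize;
    }

    // Returns true if the value is covered by the dictionary, false once the
    // sketch has switched to non-dictionary encoding.
    boolean tryAdd(final String value) {
        if (dictionaryAbandoned) {
            return false;
        }
        if (dictionary.containsKey(value)) {
            return true; // repeated values cost nothing extra
        }
        // Assumption: dictionary size is counted as the UTF-8 byte length of each key.
        final int valueBytes = value.getBytes(StandardCharsets.UTF_8).length;
        if (dictionary.size() + 1 > maximumDictionaryKeys
                || dictionaryBytes + valueBytes > maximumDictionarySize) {
            dictionaryAbandoned = true; // either limit exceeded: stop using the dictionary
            return false;
        }
        dictionary.put(value, dictionary.size());
        dictionaryBytes += valueBytes;
        return true;
    }

    boolean usesDictionary() {
        return !dictionaryAbandoned;
    }

    public static void main(String[] args) {
        final DictionaryLimitSketch sketch = new DictionaryLimitSketch(2, 100);
        for (final String s : new String[] {"a", "b", "a", "c"}) {
            System.out.println(s + " -> " + sketch.tryAdd(s));
        }
    }
}
```

Once either limit is crossed the sketch never returns to dictionary encoding, which mirrors the "before switching" wording of the two getters.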
    • isLegacyParquet

      public abstract boolean isLegacyParquet()
    • getTargetPageSize

      public abstract int getTargetPageSize()
    • isRefreshing

      public abstract boolean isRefreshing()
      Returns:
      true if the data source is refreshing
    • sameColumnNamesAndCodecMappings

      @VisibleForTesting public static boolean sameColumnNamesAndCodecMappings(ParquetInstructions i1, ParquetInstructions i2)
    • builder

      public static ParquetInstructions.Builder builder()