Class ParquetInstructions

java.lang.Object
com.illumon.iris.db.v2.locations.parquet.ParquetInstructions
All Implemented Interfaces:
ColumnToCodecMappings

public abstract class ParquetInstructions extends Object implements ColumnToCodecMappings
This class provides instructions for Parquet read and write operations, which accept it as an optional argument, specifying the desired transformations. Examples include mapping column names and using specific codecs during (de)serialization.
  • Field Details

  • Constructor Details

    • ParquetInstructions

      public ParquetInstructions()
  • Method Details

    • setDefaultCompressionCodecName

      public static void setDefaultCompressionCodecName(String name)
      Set the default for getCompressionCodecName().
      Parameters:
      name - The new default
    • getDefaultCompressionCodecName

      public static String getDefaultCompressionCodecName()
      Returns:
      The default for getCompressionCodecName()
    • setDefaultMaximumDictionaryKeys

      public static void setDefaultMaximumDictionaryKeys(int maximumDictionaryKeys)
      Set the default for getMaximumDictionaryKeys().
      Parameters:
      maximumDictionaryKeys - The new default
    • getDefaultMaximumDictionaryKeys

      public static int getDefaultMaximumDictionaryKeys()
      Returns:
      The default for getMaximumDictionaryKeys()
    • setDefaultTargetPageSize

      public static void setDefaultTargetPageSize(int newDefaultSizeBytes)
      Set the default target page size (in bytes) used to split rows of data into pages when writing columns. The value should be no smaller than 65536.
      Parameters:
      newDefaultSizeBytes - the new default target page size.
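The description above states a minimum of 65536 bytes but does not say what happens when a smaller value is passed. One option is to check the value on the caller's side before it reaches the setter. The sketch below assumes a hypothetical helper class, PageSizeGuard, which is not part of ParquetInstructions; only the 65536 figure comes from this page.

```java
// Hypothetical caller-side helper; not part of ParquetInstructions.
final class PageSizeGuard {
    // Minimum stated in the setDefaultTargetPageSize description.
    static final int MIN_TARGET_PAGE_SIZE_BYTES = 65536;

    private PageSizeGuard() {}

    // Returns the requested size unchanged if it meets the documented minimum;
    // otherwise throws, so an undersized value never reaches the setter.
    static int requireValidPageSize(int requestedBytes) {
        if (requestedBytes < MIN_TARGET_PAGE_SIZE_BYTES) {
            throw new IllegalArgumentException(
                    "Target page size must be at least " + MIN_TARGET_PAGE_SIZE_BYTES
                            + " bytes, got " + requestedBytes);
        }
        return requestedBytes;
    }
}
```

A call site could then read ParquetInstructions.setDefaultTargetPageSize(PageSizeGuard.requireValidPageSize(1 << 20)), keeping the validation independent of however the setter itself treats small values.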
    • getDefaultTargetPageSize

      public static int getDefaultTargetPageSize()
      Get the current default target page size in bytes.
      Returns:
      the current default target page size in bytes.
    • getColumnNameFromParquetColumnNameOrDefault

      public final String getColumnNameFromParquetColumnNameOrDefault(String parquetColumnName)
    • getParquetColumnNameFromColumnNameOrDefault

      public abstract String getParquetColumnNameFromColumnNameOrDefault(String columnName)
    • getColumnNameFromParquetColumnName

      public abstract String getColumnNameFromParquetColumnName(String parquetColumnName)
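This page does not describe the three column-name methods above, so the following is a guess based on their names alone. It assumes that the plain lookup returns null when no mapping is registered and that the "OrDefault" variants fall back to the name passed in. ColumnNameMappingSketch and addMapping are hypothetical stand-ins, not the real class or its builder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the column-name mapping; not the real ParquetInstructions.
final class ColumnNameMappingSketch {
    private final Map<String, String> columnToParquet = new HashMap<>();
    private final Map<String, String> parquetToColumn = new HashMap<>();

    // Registers a mapping in both directions (hypothetical method).
    void addMapping(String parquetColumnName, String columnName) {
        columnToParquet.put(columnName, parquetColumnName);
        parquetToColumn.put(parquetColumnName, columnName);
    }

    // Assumption: the plain lookup returns null when no mapping exists.
    String getColumnNameFromParquetColumnName(String parquetColumnName) {
        return parquetToColumn.get(parquetColumnName);
    }

    // Assumption: "OrDefault" falls back to the name that was passed in.
    String getColumnNameFromParquetColumnNameOrDefault(String parquetColumnName) {
        String mapped = getColumnNameFromParquetColumnName(parquetColumnName);
        return mapped != null ? mapped : parquetColumnName;
    }

    // Same assumed fallback in the opposite direction.
    String getParquetColumnNameFromColumnNameOrDefault(String columnName) {
        return columnToParquet.getOrDefault(columnName, columnName);
    }
}
```

Under these assumptions, unmapped columns pass through with their names unchanged, so only columns that are actually renamed need a mapping.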
    • getCodecName

      public abstract String getCodecName(String columnName)
      Specified by:
      getCodecName in interface ColumnToCodecMappings
    • getCodecArgs

      public abstract String getCodecArgs(String columnName)
      Specified by:
      getCodecArgs in interface ColumnToCodecMappings
    • useDictionary

      public abstract boolean useDictionary(String columnName)
      Returns:
      A hint that the writer should use dictionary-based encoding for writing this column; never evaluated for non-String columns, defaults to false
    • getCompressionCodecName

      public abstract String getCompressionCodecName()
    • getMaximumDictionaryKeys

      public abstract int getMaximumDictionaryKeys()
      Returns:
      The maximum number of unique keys the writer should add to a dictionary page before switching to non-dictionary encoding; never evaluated for non-String columns, ignored if useDictionary(String)
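To illustrate what a cap on dictionary keys means in general, the sketch below dictionary-encodes a list of strings and gives up once the number of distinct values would exceed the limit. It is a simplified illustration, not the library's writer: DictionaryLimitSketch and tryDictionaryEncode are hypothetical names, and this page does not describe how the real writer handles data already written when it switches encodings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a dictionary key limit; not the library's writer.
final class DictionaryLimitSketch {
    private DictionaryLimitSketch() {}

    // Tries to dictionary-encode the values. On success, returns one dictionary id
    // per row and fills dictionaryOut with the distinct values in first-seen order.
    // Returns null (and clears dictionaryOut) if more than maxKeys distinct values
    // appear, signalling that the caller should use non-dictionary encoding.
    static int[] tryDictionaryEncode(List<String> values, int maxKeys, List<String> dictionaryOut) {
        Map<String, Integer> ids = new HashMap<>();
        int[] encoded = new int[values.size()];
        for (int i = 0; i < values.size(); i++) {
            String value = values.get(i);
            Integer id = ids.get(value);
            if (id == null) {
                if (ids.size() >= maxKeys) {
                    dictionaryOut.clear();
                    return null; // key limit exceeded: abandon the dictionary
                }
                id = ids.size();
                ids.put(value, id);
                dictionaryOut.add(value);
            }
            encoded[i] = id;
        }
        return encoded;
    }
}
```

With a limit of 2, the values "a", "b", "a" encode successfully, while "a", "b", "c" cause the sketch to return null, so the caller falls back to plain encoding.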
    • isLegacyParquet

      public abstract boolean isLegacyParquet()
    • getTargetPageSize

      public abstract int getTargetPageSize()
    • sameColumnNamesAndCodecMappings

      @VisibleForTesting public static boolean sameColumnNamesAndCodecMappings(ParquetInstructions i1, ParquetInstructions i2)
    • builder

      public static ParquetInstructions.Builder builder()