Package io.deephaven.parquet.base
Class ParquetFileReader
java.lang.Object
io.deephaven.parquet.base.ParquetFileReader
Top-level accessor for a parquet file, which can read from either a file path string or a CLI-style file URI,
e.g. "s3://bucket/key".
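As a rough illustration of the two input styles mentioned above, the sketch below normalizes either a plain file path or a URI-style string into a java.net.URI, which could then be handed to create(URI, SeekableChannelsProvider). The class ParquetSourceUris and the method toUri are hypothetical names introduced only for this example; they are not part of this package, and the reader's own resolution logic may differ. The sketch uses only JDK classes and does not handle Windows drive-letter paths written with forward slashes (which would parse as a URI scheme).

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of io.deephaven.parquet.base.
final class ParquetSourceUris {
    private ParquetSourceUris() {}

    /**
     * Convert a source string into a URI.
     * Strings that parse as a URI with a scheme (for example "s3://bucket/key")
     * are returned as-is; anything else is treated as a local file path.
     */
    static URI toUri(String source) {
        try {
            URI parsed = new URI(source);
            if (parsed.getScheme() != null) {
                return parsed;
            }
        } catch (URISyntaxException ignored) {
            // Not a valid URI (for example, a path containing spaces);
            // fall through and treat it as a local path.
        }
        // File.toURI() yields an absolute URI with the "file" scheme.
        return new File(source).toURI();
    }

    public static void main(String[] args) {
        System.out.println(toUri("s3://bucket/key"));
        System.out.println(toUri("data/table.parquet"));
    }
}
```

Keeping this normalization outside the reader means the same calling code can serve local and remote sources, with only the SeekableChannelsProvider varying.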
-
Field Summary
Fields
Modifier and Type    Field    Description
static final String    FILE_URI_SCHEME
final org.apache.parquet.format.FileMetaData    fileMetaData
-
Method Summary
Modifier and Type    Method    Description
static ParquetFileReader    create(@NotNull File parquetFile, @NotNull SeekableChannelsProvider channelsProvider)    Make a ParquetFileReader for the supplied File.
static ParquetFileReader    create(@NotNull URI parquetFileURI, @NotNull SeekableChannelsProvider channelsProvider)    Make a ParquetFileReader for the supplied URI.
SeekableChannelsProvider    getChannelsProvider()
Set<String>    getColumnsWithDictionaryUsedOnEveryDataPage()    Get the names of all columns that we know for certain (a) have a dictionary and (b) use the dictionary on all data pages.
RowGroupReader    getRowGroup(int groupNumber, String version)    Create a RowGroupReader object for the provided row group number.
org.apache.parquet.schema.MessageType    getSchema()
int    rowGroupCount()
-
Field Details
-
FILE_URI_SCHEME
- See Also:
-
fileMetaData
public final org.apache.parquet.format.FileMetaData fileMetaData
-
-
Method Details
-
create
public static ParquetFileReader create(@NotNull File parquetFile, @NotNull SeekableChannelsProvider channelsProvider)
- Parameters:
parquetFile - The parquet file or the parquet metadata file
channelsProvider - The SeekableChannelsProvider to use for reading the file
- Returns:
- The new ParquetFileReader
-
create
public static ParquetFileReader create(@NotNull URI parquetFileURI, @NotNull SeekableChannelsProvider channelsProvider)
- Parameters:
parquetFileURI - The URI for the parquet file or the parquet metadata file
channelsProvider - The SeekableChannelsProvider to use for reading the file
- Returns:
- The new ParquetFileReader
-
getChannelsProvider
- Returns:
- The SeekableChannelsProvider used for this reader, appropriate to use for related file access
-
getColumnsWithDictionaryUsedOnEveryDataPage
Get the names of all columns that we know for certain (a) have a dictionary and (b) use the dictionary on all data pages.
- Returns:
- A set of parquet column names that satisfy both conditions.
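To make the two conditions concrete, here is a minimal sketch of how such a set could be computed from per-column-chunk metadata. It is not this class's implementation: ChunkInfo, DictionaryColumnsSketch, and the string-based encoding lists are hypothetical stand-ins for the real parquet metadata structures, and the sketch assumes that page-level encoding information is available for every chunk. A column qualifies only if, in every row group, its chunk has a dictionary page and every data page uses a dictionary encoding (PLAIN_DICTIONARY or RLE_DICTIONARY); missing information is treated as "not certain".

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; not a Deephaven or Parquet type.
final class DictionaryColumnsSketch {

    /** Stand-in for the metadata of one column chunk within one row group. */
    static final class ChunkInfo {
        final String columnName;
        final boolean hasDictionaryPage;
        final List<String> dataPageEncodings;

        ChunkInfo(String columnName, boolean hasDictionaryPage, String... dataPageEncodings) {
            this.columnName = columnName;
            this.hasDictionaryPage = hasDictionaryPage;
            this.dataPageEncodings = Arrays.asList(dataPageEncodings);
        }

        /** True only when we can be certain every data page is dictionary-encoded. */
        boolean dictionaryOnEveryDataPage() {
            if (!hasDictionaryPage || dataPageEncodings.isEmpty()) {
                return false; // no dictionary, or no page information to be certain about
            }
            for (String encoding : dataPageEncodings) {
                if (!encoding.equals("RLE_DICTIONARY") && !encoding.equals("PLAIN_DICTIONARY")) {
                    return false; // at least one page fell back to a non-dictionary encoding
                }
            }
            return true;
        }
    }

    /** Intersect the qualifying column names across all row groups. */
    static Set<String> columnsWithDictionaryOnEveryDataPage(List<List<ChunkInfo>> rowGroups) {
        Set<String> result = null;
        for (List<ChunkInfo> rowGroup : rowGroups) {
            Set<String> qualifying = new LinkedHashSet<>();
            for (ChunkInfo chunk : rowGroup) {
                if (chunk.dictionaryOnEveryDataPage()) {
                    qualifying.add(chunk.columnName);
                }
            }
            if (result == null) {
                result = qualifying;
            } else {
                result.retainAll(qualifying);
            }
        }
        return result == null ? new LinkedHashSet<>() : result;
    }
}
```

The intersection across row groups matters because a writer may fall back from dictionary encoding partway through a file, for example once a dictionary grows too large.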
-
getRowGroup
Create a RowGroupReader object for the provided row group number.
- Parameters:
version - The "version" string from Deephaven-specific parquet metadata, or null if it's not present.
-
getSchema
public org.apache.parquet.schema.MessageType getSchema()
-
rowGroupCount
public int rowGroupCount()
-