Package com.illumon.iris.importers
Interface CsvImporterHelper
- All Superinterfaces:
AutoCloseable, Closeable
Class to assist with different styles of CSV files
-
Method Summary
Modifier and Type, Method
- Description
void close()
- Close the stream
int getBufferSize()
- Returns the buffer size that will be used when creating a FooterSkipBufferedReader
List<String> getColumnNamesFromStream()
- Get the list of column names from a CSV file; only call after it's been initialized with a stream
static CsvImporterHelper getCsvImporterHelper(String fileFormat, char delimiter, boolean trim, boolean noHeader, int skipHeaderLines, int skipFooterLines, InputStream inputStream, List<String> columnNames, boolean fromSplitFile)
- Get an appropriate CsvImporterHelper instance.
default org.apache.commons.csv.CSVRecord parseNextRecord()
- Parse the next CSV record from the stream
long processImport(com.fishlib.io.logger.Logger log, ImportTableWriterFactory writerFactory, Map<String, ImporterColumnDefinition> icdMap, Map<String, String> importProperties, String arrayDelimiter, String constantColumnValue, String currentPartition, AtomicInteger errorCount, int maxError, boolean strict, boolean fromSplitFile)
- Process the source file or stream and persist to disk as a Table
void setBufferSize(int bufferSize)
- Sets the buffer size to use for a FooterSkipBufferedReader
void validateImport()
- Validate the import.
-
Method Details
-
setBufferSize
void setBufferSize(int bufferSize)
Sets the buffer size to use for a FooterSkipBufferedReader
- Parameters:
bufferSize
- size of the buffer in characters
-
getBufferSize
int getBufferSize()
Returns the buffer size that will be used when creating a FooterSkipBufferedReader
- Returns:
- int size of the buffer in characters
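The idea of a buffered reader that skips footer lines can be sketched as follows. This is a hypothetical illustration only, not the library's FooterSkipBufferedReader: the class name FooterSkipSketch, the method readSkippingFooter, and the choice to pass the buffer size (in characters) to a plain java.io.BufferedReader are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not the library's FooterSkipBufferedReader.
final class FooterSkipSketch {

    /** Returns every line of the source except the last skipFooterLines lines. */
    static List<String> readSkippingFooter(Reader source, int skipFooterLines, int bufferSize)
            throws IOException {
        List<String> kept = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        // Assumption: bufferSize is a character count handed to BufferedReader.
        try (BufferedReader reader = new BufferedReader(source, bufferSize)) {
            String line;
            while ((line = reader.readLine()) != null) {
                pending.addLast(line);
                // A line is released only once skipFooterLines newer lines have been read,
                // so the final skipFooterLines lines are never emitted.
                if (pending.size() > skipFooterLines) {
                    kept.add(pending.removeFirst());
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        String csv = "a,b\n1,2\n3,4\nTOTAL,2\n";
        System.out.println(readSkippingFooter(new StringReader(csv), 1, 8192));
    }
}
```

Holding back a small queue of pending lines lets the footer be dropped in a single pass without knowing the total line count in advance.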
-
getColumnNamesFromStream
Get the list of column names from a CSV file; only call after it's been initialized with a stream
- Returns:
- the List of column names
-
processImport
long processImport(@NotNull com.fishlib.io.logger.Logger log, @NotNull ImportTableWriterFactory writerFactory, @NotNull Map<String, ImporterColumnDefinition> icdMap, @NotNull Map<String, String> importProperties, @NotNull String arrayDelimiter, @Nullable String constantColumnValue, @Nullable String currentPartition, @NotNull AtomicInteger errorCount, int maxError, boolean strict, boolean fromSplitFile) throws IOException
Process the source file or stream and persist to disk as a Table
- Parameters:
log
- The passed-down logger
writerFactory
- The passed-down ImportTableWriterFactory
icdMap
- The column name to ImporterColumnDefinition map
importProperties
- Provides basic import attributes
arrayDelimiter
- Delimiter used to parse array data types
constantColumnValue
- A String to materialize as the source column when an ImportColumn is defined with a sourceType of CONSTANT (aka ImporterColumnDefinition$IrisImportConstant). Can be null.
currentPartition
- The current partition value when invoked using splitFile
errorCount
- Holds a record of parse errors
maxError
- Maximum number of field conversion failures allowed
strict
- Whether to fail if a field fails conversion
fromSplitFile
- True if the stream is sourced from an interim split file, split using the partition column
- Returns:
- The number of rows processed
- Throws:
IOException
- when exceptions occur while reading files
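One way the errorCount, maxError, and strict parameters could interact is sketched below. This is an assumption for illustration, not the documented behavior of processImport: the class ErrorBudgetSketch, the method convertAll, the integer conversion, and the rule "fail immediately when strict, otherwise fail once the shared count exceeds maxError" are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; the real processImport logic is not shown on this page.
final class ErrorBudgetSketch {

    /** Converts each field to an int and returns how many converted successfully. */
    static long convertAll(List<String> fields, AtomicInteger errorCount, int maxError, boolean strict) {
        long converted = 0;
        for (String field : fields) {
            try {
                Integer.parseInt(field.trim());
                converted++;
            } catch (NumberFormatException e) {
                // The caller-supplied counter records every conversion failure.
                int errors = errorCount.incrementAndGet();
                // Assumed rule: strict fails on the first error, otherwise only past maxError.
                if (strict || errors > maxError) {
                    throw new IllegalStateException(
                            "Conversion failed for '" + field + "' after " + errors + " error(s)", e);
                }
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        AtomicInteger errorCount = new AtomicInteger();
        long rows = convertAll(Arrays.asList("1", "x", "3"), errorCount, 1, false);
        System.out.println(rows + " converted, " + errorCount.get() + " error(s)");
    }
}
```

Passing the counter as an AtomicInteger lets the caller read the accumulated error total after the call, including when an exception ends processing early.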
-
close
Close the stream
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
- if an error occurs
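Because the interface extends Closeable, an instance can be managed with try-with-resources so that close() runs even if processing throws. The sketch below uses a hypothetical stand-in class (StandInHelper) that models only the Closeable contract; it is not a CsvImporterHelper and none of its names come from the library.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical illustration of the Closeable contract only.
final class CloseSketch {

    // Stand-in for a helper instance; records whether close() was called.
    static final class StandInHelper implements Closeable {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
        }
    }

    /** Uses the stand-in inside try-with-resources and reports whether it was closed. */
    static boolean useAndClose() throws IOException {
        StandInHelper helper = new StandInHelper();
        try (StandInHelper h = helper) {
            // Work with h here; close() is invoked automatically on exit from this block.
        }
        return helper.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("closed: " + useAndClose());
    }
}
```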
-
validateImport
Validate the import.
- Throws:
ImportException
- thrown in case of errors
-
getCsvImporterHelper
static CsvImporterHelper getCsvImporterHelper(String fileFormat, char delimiter, boolean trim, boolean noHeader, int skipHeaderLines, int skipFooterLines, InputStream inputStream, List<String> columnNames, boolean fromSplitFile) throws IOException
Get an appropriate CsvImporterHelper instance.
- Parameters:
fileFormat
- The file format
delimiter
- The delimiter
trim
- Whether to trim the lines
noHeader
- Indicates that the CSV does not contain a header row with column names
skipHeaderLines
- How many lines to skip at the beginning of the data
skipFooterLines
- How many lines to skip at the end of the data
inputStream
- The stream to use for the import
columnNames
- List of column names to use as a header for the CSV. null if no column names are being passed in.
fromSplitFile
- Flag indicating if the import source is a splitFile
- Returns:
- the applicable CsvImporterHelper
- Throws:
IOException
- thrown for any type of underlying checked exceptions
-
parseNextRecord
Parse the next CSV record from the stream
- Returns:
- the parsed CSVRecord
- Throws:
IOException
- if an error occurs
-