Package com.illumon.iris.importers
Interface CsvImporterHelper
- All Superinterfaces:
AutoCloseable, Closeable
Helper for importing different styles of CSV files
-
Method Summary
Modifier and Type | Method | Description
void | close() | Close the stream
int | getBufferSize() | Returns the buffer size that will be used when creating a FooterSkipBufferedReader
List<String> | getColumnNamesFromStream() | Get the list of column names from a CSV file; only call after it's been initialized with a stream
static CsvImporterHelper | getCsvImporterHelper(String fileFormat, char delimiter, boolean trim, boolean noHeader, int skipHeaderLines, int skipFooterLines, InputStream inputStream, List<String> columnNames, boolean fromSplitFile) | Get an appropriate CsvImporterHelper instance.
default org.apache.commons.csv.CSVRecord | parseNextRecord() | Parse the next CSV record from the stream
long | processImport(com.fishlib.io.logger.Logger log, ImportTableWriterFactory writerFactory, Map<String, ImporterColumnDefinition> icdMap, Map<String, String> importProperties, String arrayDelimiter, String constantColumnValue, String currentPartition, AtomicInteger errorCount, int maxError, boolean strict, boolean fromSplitFile) | Process the source file or stream and persist to disk as a Table
void | setBufferSize(int bufferSize) | Sets the buffer size to use for a FooterSkipBufferedReader
void | validateImport() | Validate the import.
-
Method Details
-
setBufferSize
void setBufferSize(int bufferSize)
Sets the buffer size to use for a FooterSkipBufferedReader
- Parameters:
bufferSize - size of the buffer in characters
-
getBufferSize
int getBufferSize()
Returns the buffer size that will be used when creating a FooterSkipBufferedReader
- Returns:
- the size of the buffer in characters
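The buffer size above applies to a reader that must hold back trailing lines so that footer rows are never handed to the parser. A minimal sketch of that idea follows, using only the standard library. The class FooterSkipSketch and the method readSkippingFooter are hypothetical names made up for illustration; this is not the library's FooterSkipBufferedReader, whose internals are not described here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; not the library's FooterSkipBufferedReader. */
final class FooterSkipSketch {

    /** Returns every line from the reader except the last skipFooterLines lines. */
    static List<String> readSkippingFooter(BufferedReader reader, int skipFooterLines) throws IOException {
        Deque<String> pending = new ArrayDeque<>(); // lines that might still turn out to be footer
        List<String> out = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            pending.addLast(line);
            // Once more than skipFooterLines lines are held back, the oldest one cannot be footer.
            if (pending.size() > skipFooterLines) {
                out.add(pending.removeFirst());
            }
        }
        return out; // whatever is left in pending is the footer and is dropped
    }

    public static void main(String[] args) throws IOException {
        int bufferSize = 8192; // BufferedReader buffer size, in characters
        String csv = "a,b\n1,2\n3,4\nTOTAL,2\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv), bufferSize)) {
            for (String row : readSkippingFooter(reader, 1)) {
                System.out.println(row);
            }
        }
    }
}
```

Holding back lines in a queue lets the footer be skipped in a single pass, without knowing the total line count in advance.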
-
getColumnNamesFromStream
Get the list of column names from a CSV file. Call this only after the helper has been initialized with a stream.
- Returns:
- the List of column names
-
processImport
long processImport(@NotNull com.fishlib.io.logger.Logger log, @NotNull ImportTableWriterFactory writerFactory, @NotNull Map<String, ImporterColumnDefinition> icdMap, @NotNull Map<String, String> importProperties, @NotNull String arrayDelimiter, @Nullable String constantColumnValue, @Nullable String currentPartition, @NotNull AtomicInteger errorCount, int maxError, boolean strict, boolean fromSplitFile) throws IOException
Process the source file or stream and persist to disk as a Table
- Parameters:
log - The passed-down logger
writerFactory - The passed-down ImportTableWriterFactory
icdMap - The column name to ImporterColumnDefinition map
importProperties - Provides basic import attributes
arrayDelimiter - Delimiter used to parse array data types
constantColumnValue - A String to materialize as the source column when an ImportColumn is defined with a sourceType of CONSTANT (aka ImporterColumnDefinition$IrisImportConstant). Can be null.
currentPartition - The current partition value when invoked using splitFile
errorCount - Holds a record of parse errors
maxError - Maximum number of field conversion failures allowed
strict - Whether to fail if a field fails conversion
fromSplitFile - True if the stream is sourced from an interim split file, split using the partition column
- Returns:
- The number of rows processed
- Throws:
IOException - if an error occurs while reading files
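The interaction between errorCount, maxError, and strict can be sketched with the standard library alone. The class ErrorThresholdSketch and the method importRows below are hypothetical, and several behaviors are assumptions rather than documented facts: that a failed row is skipped, that skipped rows are not counted in the return value, and that exceeding maxError raises an exception.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of an error threshold; not the library's implementation. */
final class ErrorThresholdSketch {

    /**
     * Converts each raw value to an int and returns the number of rows converted.
     * Assumption: strict fails on the first conversion error; otherwise up to
     * maxError failures are tolerated and the offending rows are skipped.
     */
    static long importRows(List<String> rawValues, AtomicInteger errorCount, int maxError, boolean strict) {
        long rows = 0;
        for (String raw : rawValues) {
            try {
                Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                int errors = errorCount.incrementAndGet(); // shared counter, visible to the caller
                if (strict || errors > maxError) {
                    throw new IllegalStateException(
                            "Conversion failed for '" + raw + "' after " + errors + " error(s)", e);
                }
                continue; // assumed behavior: skip the row that failed conversion
            }
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        AtomicInteger errorCount = new AtomicInteger();
        long rows = importRows(Arrays.asList("1", "x", "3"), errorCount, 1, false);
        System.out.println(rows + " row(s) imported, " + errorCount.get() + " error(s)");
    }
}
```

Passing the counter as an AtomicInteger lets the caller read the number of failures afterwards, and lets several imports share one error budget.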
-
close
Close the stream
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException - if an error occurs
-
validateImport
Validate the import.
- Throws:
ImportException - thrown in case of errors
-
getCsvImporterHelper
static CsvImporterHelper getCsvImporterHelper(String fileFormat, char delimiter, boolean trim, boolean noHeader, int skipHeaderLines, int skipFooterLines, InputStream inputStream, List<String> columnNames, boolean fromSplitFile) throws IOException
Get an appropriate CsvImporterHelper instance.
- Parameters:
fileFormat - The file format
delimiter - The delimiter
trim - Whether to trim the lines
noHeader - Indicates that the CSV does not contain a header row with column names
skipHeaderLines - How many lines to skip at the beginning of the data
skipFooterLines - How many lines to skip at the end of the data
inputStream - The stream to use for the import
columnNames - List of column names to use as a header for the CSV. null if no column names are being passed in.
fromSplitFile - Flag indicating if the import source is a splitFile
- Returns:
- the applicable CsvImporterHelper
- Throws:
IOException - thrown for any underlying checked exception
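How noHeader, skipHeaderLines, and columnNames might combine to determine the column names can be sketched as follows. The class HeaderResolutionSketch and the method resolveColumnNames are hypothetical, and the precedence shown (explicit columnNames first, otherwise the first line after the skipped header lines) is an assumption, as is applying trim to each field. The split is naive and ignores quoted delimiters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration of header resolution; not the library's logic. */
final class HeaderResolutionSketch {

    static List<String> resolveColumnNames(List<String> lines, char delimiter, boolean trim,
                                           boolean noHeader, int skipHeaderLines, List<String> columnNames) {
        if (columnNames != null) {
            return columnNames; // assumed: caller-supplied names take precedence
        }
        if (noHeader) {
            throw new IllegalArgumentException("noHeader is set but no columnNames were supplied");
        }
        if (skipHeaderLines < 0 || skipHeaderLines >= lines.size()) {
            throw new IllegalArgumentException("No header row after skipping " + skipHeaderLines + " line(s)");
        }
        // Assumed: the header is the first line that follows the skipped lines.
        String header = lines.get(skipHeaderLines);
        List<String> names = new ArrayList<>();
        // Pattern.quote keeps delimiters such as '|' from being treated as regex syntax.
        for (String field : header.split(Pattern.quote(String.valueOf(delimiter)), -1)) {
            names.add(trim ? field.trim() : field);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# exported file", "Sym | Price", "AAPL|1.0");
        for (String name : resolveColumnNames(lines, '|', true, false, 1, null)) {
            System.out.println(name);
        }
    }
}
```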
-
parseNextRecord
Parse the next CSV record from the stream
- Returns:
- the parsed CSVRecord
- Throws:
IOException - if an error occurs
-