Package io.deephaven.importers.csv
Class CsvTools
java.lang.Object
io.deephaven.importers.csv.CsvTools
Main CSV import tools class.
Constructor Summary

Constructors:
  CsvTools()

Method Summary

  static List<String> getColumnHeaders(@NotNull InputStream stream, @NotNull CsvSpecs specs)
      Returns the column headers as a list, using the values from the first row.
  static CsvSpecs.Builder getCsvSpecsBuilder(@Nullable String format, boolean hasHeader, @Nullable Collection<String> headers, @Nullable List<String> nullLiterals, boolean validateHeaders)
      Returns a CsvSpecs.Builder created with the appropriate properties.
  static char getDefaultDelimiter(String format)
      Returns the default delimiter for the specified format.
  static CsvSpecs.Builder getImportCsvSpecsBuilder(@NotNull CsvFormats format, boolean hasHeader, @Nullable Collection<String> headers)
      Returns a CsvSpecs.Builder that does not include source-file header validation and legalization to conform to Deephaven column header rules.
  static CsvSpecs.Builder getImportCsvSpecsBuilder(@Nullable String format, boolean hasHeader, @Nullable Collection<String> headers)
      Returns a CsvSpecs.Builder that does not include source-file header validation and legalization to conform to Deephaven column header rules.
  static long importCsv(@NotNull InputStream stream, CsvSpecs.Builder specBuilder, @NotNull ImportTableWriterFactory tableWriterFactory, @NotNull Logger log, @NotNull List<String> columnNamesInFile, @NotNull Map<String, ImporterColumnDefinition> icdMap, @NotNull Map<String, String> importProperties, @NotNull AtomicInteger errorCount, @NotNull String arrayDelimiter, @Nullable String constantColumnValue, int maxError, boolean strict)
      Imports the CSV data and writes it to disk.
Constructor Details

CsvTools
  public CsvTools()

Method Details
getColumnHeaders
  public static @NotNull List<String> getColumnHeaders(@NotNull InputStream stream, @NotNull CsvSpecs specs) throws IOException

  Returns the column headers as a list, using the values from the first row. To avoid reading the entire file, the CsvSpecs must restrict the number of rows read to 1; this check is enforced.

  Parameters:
    stream - an InputStream providing access to the CSV data
    specs - the CsvSpecs
  Returns:
    the column headers taken from the values of the first row
  Throws:
    IOException - if a reader exception occurs
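As a rough illustration of the contract (only the first row is consumed for headers), here is a plain-Java sketch that reads a single line from the stream and splits it on a delimiter. It stands in for the real implementation, which goes through CsvSpecs; the class and method names here are illustrative only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class HeaderSketch {
    // Read only the first line of the stream and split it on the delimiter,
    // mirroring the "limit rows read to 1" contract of getColumnHeaders.
    static List<String> readHeaders(InputStream stream, char delimiter) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String first = reader.readLine();
            if (first == null) {
                throw new IOException("CSV stream is empty; no header row to read");
            }
            // -1 keeps trailing empty fields so the header count stays accurate
            return Arrays.asList(first.split(String.valueOf(delimiter), -1));
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "Date,Sym,Price\n2024-01-02,AAPL,185.64\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readHeaders(in, ',')); // [Date, Sym, Price]
    }
}
```

Note this sketch does not handle quoted fields containing the delimiter; that is one reason the real method delegates parsing to CsvSpecs.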
-
getCsvSpecsBuilder
  public static @NotNull CsvSpecs.Builder getCsvSpecsBuilder(@Nullable String format, boolean hasHeader, @Nullable Collection<String> headers, @Nullable List<String> nullLiterals, boolean validateHeaders) throws CsvFormatException

  Returns a CsvSpecs.Builder created with the appropriate properties. Lets the caller choose whether header validation should conform to Deephaven column header rules, and optionally supply a list of null-value literals.

  Parameters:
    format - may be null, a delimiter, or one of DEFAULT, TDF, EXCEL, MYSQL, RFC4180, and TRIM
    hasHeader - true if the data includes a header row, false otherwise
    headers - column names to use as, or instead of, the header row for the CSV
    nullLiterals - the list of literals the parser should treat as null; if null is passed, a default list is used consisting of (1) the empty string, (2) the string "null", and (3) the string "null" in parentheses
    validateHeaders - true if headers should be validated against Deephaven column header rules
  Returns:
    CsvSpecs.Builder
  Throws:
    CsvFormatException - thrown for an unsupported format
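The null-literal fallback described above can be sketched in a few lines. This is not the library's code; the exact spelling of "null in parentheses" as "(null)" is an assumption, and the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;

public class NullLiteralSketch {
    // Assumed default null literals, per the docs: the empty string,
    // the string "null", and "null" in parentheses (taken here as "(null)").
    static final List<String> DEFAULT_NULL_LITERALS = Arrays.asList("", "null", "(null)");

    // If the caller passes null, fall back to the default list,
    // mirroring the nullLiterals parameter's documented behavior.
    static List<String> resolveNullLiterals(List<String> userLiterals) {
        return userLiterals != null ? userLiterals : DEFAULT_NULL_LITERALS;
    }

    // A field parses to null when it matches any configured literal.
    static boolean isNullField(String field, List<String> nullLiterals) {
        return nullLiterals.contains(field);
    }

    public static void main(String[] args) {
        List<String> literals = resolveNullLiterals(null);
        System.out.println(isNullField("(null)", literals)); // true
        System.out.println(isNullField("NaN", literals));    // false
    }
}
```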
-
getDefaultDelimiter
  public static char getDefaultDelimiter(String format)

  Returns the default delimiter for the specified format. If the format cannot be determined, returns ','.

  Parameters:
    format - the format from CsvFormats
  Returns:
    the default delimiter
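A minimal sketch of the format-to-delimiter mapping, under stated assumptions: the docs only guarantee the ',' fallback, so the tab mapping for TDF (tab-delimited format) and the single-character passthrough are assumptions, and the class name is illustrative.

```java
public class DelimiterSketch {
    // Map a format name to its default delimiter, falling back to ','
    // when the format cannot be determined (the only behavior the docs state).
    static char defaultDelimiter(String format) {
        if (format == null) {
            return ',';
        }
        if (format.length() == 1) {
            return format.charAt(0); // the format string is itself a literal delimiter (assumption)
        }
        switch (format.toUpperCase()) {
            case "TDF":
                return '\t'; // tab-delimited format (assumption)
            case "DEFAULT":
            case "EXCEL":
            case "MYSQL":
            case "RFC4180":
            case "TRIM":
            default:
                return ','; // comma-based formats and the documented fallback
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultDelimiter("RFC4180"));        // ,
        System.out.println(defaultDelimiter("no-such-format")); // ,
    }
}
```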
-
getImportCsvSpecsBuilder
  public static @NotNull CsvSpecs.Builder getImportCsvSpecsBuilder(@Nullable String format, boolean hasHeader, @Nullable Collection<String> headers) throws CsvFormatException

  Returns a CsvSpecs.Builder that does not include source-file header validation and legalization to conform to Deephaven column header rules. It should therefore not be used where Deephaven column header rules are expected to be applied. The typical use case is in CSV imports where the schema drives the eventual column names. In addition to not validating the headers, a default list of null literals is used; for more information see getDefaultFormatBuilder(CsvSpecs.Builder, List). The method is public so that CSV-import-related classes can access it.

  Parameters:
    format - may be null, a delimiter, or one of DEFAULT, TDF, EXCEL, MYSQL, RFC4180, and TRIM
    hasHeader - true if the data includes a header row, false otherwise
    headers - column names to use as, or instead of, the header row for the CSV
  Returns:
    CsvSpecs.Builder
  Throws:
    CsvFormatException - thrown for an unsupported format
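To make concrete what "legalization to conform to Deephaven column header rules" (the step this builder skips) might look like, here is a hedged sketch. The exact rules are assumed, not taken from this class: column names restricted to letters, digits, and underscores, not starting with a digit. The class, method, and prefix names are illustrative.

```java
public class LegalizeSketch {
    // Assumed Deephaven column-name rules: letters, digits, and underscores
    // only, and the name must not start with a digit. Invalid characters
    // become underscores; names starting with a digit get a prefix.
    static String legalizeColumnName(String raw) {
        String cleaned = raw.replaceAll("[^A-Za-z0-9_]", "_");
        if (cleaned.isEmpty() || Character.isDigit(cleaned.charAt(0))) {
            cleaned = "column_" + cleaned; // illustrative prefix
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(legalizeColumnName("Trade Price")); // Trade_Price
        System.out.println(legalizeColumnName("2ndLeg"));      // column_2ndLeg
    }
}
```

Skipping this step matters for imports because the schema, not the file, drives the eventual column names, so legalizing the file's headers would be wasted (and potentially lossy) work.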
-
getImportCsvSpecsBuilder
  public static @NotNull CsvSpecs.Builder getImportCsvSpecsBuilder(@NotNull CsvFormats format, boolean hasHeader, @Nullable Collection<String> headers) throws CsvFormatException

  Returns a CsvSpecs.Builder that does not include source-file header validation and legalization to conform to Deephaven column header rules. It should therefore not be used where Deephaven column header rules are expected to be applied. The typical use case is in CSV imports where the schema drives the eventual column names. In addition to not validating the headers, a default list of null literals is used; for more information see getDefaultFormatBuilder(CsvSpecs.Builder, List). The method is public so that CSV-import-related classes can access it.

  Parameters:
    format - the format from CsvFormats
    hasHeader - true if the data includes a header row, false otherwise
    headers - column names to use as, or instead of, the header row for the CSV
  Returns:
    CsvSpecs.Builder
  Throws:
    CsvFormatException - thrown for an unsupported format
-
importCsv
  public static long importCsv(@NotNull InputStream stream, @NotNull CsvSpecs.Builder specBuilder, @NotNull ImportTableWriterFactory tableWriterFactory, @NotNull Logger log, @NotNull List<String> columnNamesInFile, @NotNull Map<String, ImporterColumnDefinition> icdMap, @NotNull Map<String, String> importProperties, @NotNull AtomicInteger errorCount, @NotNull String arrayDelimiter, @Nullable String constantColumnValue, int maxError, boolean strict) throws IOException

  Imports the CSV data and writes it to disk. Adapted from CsvImportTools#importCsv.

  Parameters:
    stream - the InputStream from which to read CSV data
    specBuilder - the CsvSpecs.Builder
    tableWriterFactory - the passed-down ImportTableWriterFactory
    log - the passed-down logger
    columnNamesInFile - the column headers in the source
    icdMap - the map from column name to ImporterColumnDefinition
    importProperties - provides basic import attributes
    errorCount - holds the current error count across all parsers being used to import the CSV
    arrayDelimiter - the delimiter used to parse array data types
    constantColumnValue - a String to materialize as the source column when an ImportColumn is defined with a sourceType of CONSTANT (aka ImporterColumnDefinition$IrisImportConstant); may be null
    maxError - the maximum number of field conversion failures allowed
    strict - whether to fail if a field fails conversion
  Returns:
    the number of rows processed
  Throws:
    IOException - thrown when encountering an error in the import call
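The interplay of errorCount, maxError, and strict can be sketched as a stripped-down import loop. This is not the real implementation: the real method writes rows through an ImportTableWriterFactory, while here a Consumer<String[]> stands in for the row writer, and the abort-when-errorCount-exceeds-maxError policy is an assumed reading of the parameters.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ImportLoopSketch {
    // Count successfully written rows; on a conversion failure either fail
    // immediately (strict) or record the error and keep going until the
    // shared errorCount exceeds maxError.
    static long importRows(BufferedReader reader, char delimiter, int maxError,
                           boolean strict, AtomicInteger errorCount,
                           Consumer<String[]> rowWriter) throws IOException {
        long rows = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            try {
                rowWriter.accept(line.split(String.valueOf(delimiter), -1));
                rows++;
            } catch (RuntimeException conversionFailure) {
                if (strict || errorCount.incrementAndGet() > maxError) {
                    throw new IOException("row failed conversion", conversionFailure);
                }
            }
        }
        return rows; // the number of rows processed, as importCsv returns
    }

    public static void main(String[] args) throws IOException {
        String csv = "1,a\noops\n2,b\n"; // middle row has the wrong field count
        AtomicInteger errors = new AtomicInteger();
        long rows = importRows(new BufferedReader(new StringReader(csv)), ',', 5, false, errors,
                fields -> {
                    if (fields.length != 2) {
                        throw new IllegalArgumentException("expected 2 fields");
                    }
                });
        System.out.println(rows + " rows, " + errors.get() + " errors"); // 2 rows, 1 errors
    }
}
```

Note that errorCount is an AtomicInteger precisely because, per the docs, it is shared across all parsers participating in the import.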
-