Class CsvParserEx

java.lang.Object
org.apache.commons.csv.CsvParserEx
All Implemented Interfaces:
Closeable, AutoCloseable, Iterable<org.apache.commons.csv.CSVRecord>

public final class CsvParserEx extends Object implements Iterable<org.apache.commons.csv.CSVRecord>, Closeable
Parses CSV files according to the specified format. Because CSV comes in many dialects, the parser supports different formats through a CSVFormat. The parser works record-wise: once a record has been parsed from the input stream, it is not possible to go back to it.

Creating instances

Parsers can only be created by passing a LineOrientedReader directly to the sole constructor.

Parsing record-wise

To parse a CSV input from a file, you write:

 File csvData = new File("/path/to/csv");
 CSVParser parser = CSVParser.parse(csvData, CSVFormat.RFC4180);
 for (CSVRecord csvRecord : parser) {
     ...
 }
 

This will read and parse the contents of the file using the RFC 4180 format.

To parse CSV input in a format like Excel, you write:

 CSVParser parser = CSVParser.parse(csvData, CSVFormat.EXCEL);
 for (CSVRecord csvRecord : parser) {
     ...
 }
 

If the predefined formats don't match the format at hand, custom formats can be defined. More information about customising CSVFormats is available in the CSVFormat Javadoc.

Parsing into memory

If parsing record-wise is not desired, the contents of the input can be read completely into memory.

 Reader in = new StringReader("a;b\nc;d");
 CSVParser parser = new CSVParser(in, CSVFormat.EXCEL);
 List<CSVRecord> list = parser.getRecords();
 

There are two constraints that have to be kept in mind:

  1. Parsing into memory starts at the current position of the parser. If you have already parsed records from the input, those records will not end up in the in-memory representation of your CSV data.
  2. Parsing into memory may consume a lot of system resources, depending on the input. For example, if you're parsing a 150MB file of CSV data, the contents will be read completely into memory.
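
The first constraint can be illustrated with a minimal sketch that uses only JDK classes. It is not the parser's implementation: the class RemainingRecordsSketch and its methods are hypothetical, and each line is assumed to be exactly one record (no multi-line values).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of the library: each line is treated as one record.
class RemainingRecordsSketch {

    // Collects every line that has not been read yet, like a bulk read into memory.
    static List<String> readRemaining(BufferedReader reader) throws IOException {
        List<String> remaining = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            remaining.add(line);
        }
        return remaining;
    }

    // Consumes `alreadyParsed` lines one by one, then reads the rest into memory.
    static List<String> consumeThenCollect(String input, int alreadyParsed) {
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            for (int i = 0; i < alreadyParsed; i++) {
                reader.readLine();
            }
            return readRemaining(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One record was consumed first, so it is missing from the collected list.
        System.out.println(consumeThenCollect("a,b\nc,d\ne,f", 1));
    }
}
```

If the complete input is needed in memory, the bulk read has to happen before any record is consumed individually.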

Notes

Internal parser state is completely covered by the format and the reader state.

  • Constructor Details

    • CsvParserEx

      public CsvParserEx(LineOrientedReader reader, org.apache.commons.csv.CSVFormat format) throws IOException
      Creates a customized CSV parser using the given CSVFormat.

      If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.

      Parameters:
      reader - a LineOrientedReader containing CSV-formatted input. Must not be null.
      format - the CSVFormat used for CSV parsing. Must not be null.
      Throws:
      IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
      IOException - If there is a problem reading the header or skipping the first record.
  • Method Details

    • close

      public void close() throws IOException
      Closes resources.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - If an I/O error occurs
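
Because the class implements Closeable, it can be managed with try-with-resources, which closes the resource even when reading stops early. The following is a minimal sketch under stated assumptions: TrackedSource is a hypothetical stand-in for a parser, written with JDK classes only, and does not reproduce how this class releases its reader.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CloseEarlySketch {

    // Hypothetical stand-in for a parser: owns a reader and remembers whether it was closed.
    static final class TrackedSource implements Closeable {
        private final Reader reader;
        private boolean closed;

        TrackedSource(Reader reader) {
            this.reader = reader;
        }

        int readChar() throws IOException {
            return reader.read();
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() throws IOException {
            closed = true;
            reader.close();
        }
    }

    // Reads a single character and stops; try-with-resources still closes the source.
    static boolean closedAfterEarlyExit(String input) {
        TrackedSource source = new TrackedSource(new StringReader(input));
        try (TrackedSource s = source) {
            s.readChar(); // the rest of the input is deliberately left unread
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closedAfterEarlyExit("a,b\nc,d"));
    }
}
```
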
    • getCurrentLineNumber

      public long getCurrentLineNumber()
      Returns the current line number in the input stream.

      ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the record number.

      Returns:
      current line number
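
Why line numbers and record numbers diverge can be shown with a small counter written against the JDK only. This is a sketch, not the parser's logic: it assumes '"' as the quote character and '\n' as the record separator, and the class LineVsRecordSketch is hypothetical.

```java
class LineVsRecordSketch {

    // Returns {lines, records} for input whose last record ends with '\n'.
    // A newline inside a quoted value ends a line but not a record.
    static long[] count(String csv) {
        long lines = 0;
        long records = 0;
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state is unchanged.
                inQuotes = !inQuotes;
            } else if (c == '\n') {
                lines++;
                if (!inQuotes) {
                    records++;
                }
            }
        }
        return new long[] {lines, records};
    }

    public static void main(String[] args) {
        // The second record contains a value that spans two lines.
        String csv = "id,note\n1,\"first line\nsecond line\"\n2,plain\n";
        long[] result = count(csv);
        System.out.println("lines=" + result[0] + ", records=" + result[1]);
    }
}
```

For this input the sketch counts four lines but only three records, which is the mismatch the note above warns about.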
    • getHeaderMap

      public Map<String,Integer> getHeaderMap()
      Returns a copy of the header map that iterates in column order.

      The map keys are column names. The map values are 0-based indices.

      Returns:
      a copy of the header map that iterates in column order.
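
The shape of such a map can be sketched with a LinkedHashMap, which iterates in insertion order. The class HeaderMapSketch and its helper methods are hypothetical and assume a comma-delimited header line without quoting; the sketch only illustrates how a name-to-index map is typically used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderMapSketch {

    // Maps each column name to its 0-based index, preserving column order.
    static Map<String, Integer> headerMap(String headerLine) {
        Map<String, Integer> map = new LinkedHashMap<>();
        String[] names = headerLine.split(",");
        for (int i = 0; i < names.length; i++) {
            map.put(names[i], i);
        }
        return map;
    }

    // Looks up a value in a row by column name instead of by position.
    static String valueFor(String[] row, Map<String, Integer> header, String column) {
        Integer index = header.get(column);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        return row[index];
    }

    public static void main(String[] args) {
        Map<String, Integer> header = headerMap("id,name,email");
        String[] row = {"7", "Ada", "ada@example.org"};
        System.out.println(header.keySet());
        System.out.println(valueFor(row, header, "name"));
    }
}
```
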
    • getRecordNumber

      public long getRecordNumber()
      Returns the current record number in the input stream.

      ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the line number.

      Returns:
      current record number
    • getRecords

      public List<org.apache.commons.csv.CSVRecord> getRecords() throws IOException
      Parses the CSV input according to the given format and returns the content as a list of CSVRecords.

      The returned content starts at the current parse-position in the stream.

      Returns:
      list of CSVRecords, may be empty
      Throws:
      IOException - on parse error or input read-failure
    • isClosed

      public boolean isClosed()
      Gets whether this parser is closed.
      Returns:
      whether this parser is closed.
    • iterator

      public Iterator<org.apache.commons.csv.CSVRecord> iterator()
      Returns an iterator on the records.

      IOExceptions occurring during the iteration are wrapped in a RuntimeException. If the parser is closed, a call to next() will throw a NoSuchElementException.

      Specified by:
      iterator in interface Iterable<org.apache.commons.csv.CSVRecord>
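
The wrapping behaviour described for the iterator can be sketched with JDK classes only. The class WrappingIteratorSketch is hypothetical and iterates over lines rather than CSV records; which RuntimeException subclass the parser actually throws is not stated here, so the sketch uses a plain RuntimeException and the caller inspects getCause().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Iterator;
import java.util.NoSuchElementException;

class WrappingIteratorSketch {

    // Iterates over lines, converting checked IOExceptions into RuntimeExceptions.
    static Iterator<String> lines(final BufferedReader reader) {
        return new Iterator<String>() {
            private String next;

            private String fetch() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    throw new RuntimeException("Failed to read next line", e);
                }
            }

            @Override
            public boolean hasNext() {
                if (next == null) {
                    next = fetch();
                }
                return next != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException("No more lines");
                }
                String current = next;
                next = null;
                return current;
            }
        };
    }

    // A reader that always fails, used only to trigger the wrapping path.
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buf, int off, int len) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
            }
        };
    }

    // Returns the message of the wrapped IOException, or null if none was wrapped.
    static String causeMessage(Iterator<String> it) {
        try {
            it.next();
            return null;
        } catch (RuntimeException e) {
            return e.getCause() instanceof IOException ? e.getCause().getMessage() : null;
        }
    }

    public static void main(String[] args) {
        System.out.println(causeMessage(lines(new BufferedReader(failingReader()))));
    }
}
```

Callers that need the original I/O error in a for-each loop can catch the RuntimeException and check its cause in the same way.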