Releases: vincentlaucsb/csv-parser
CSV Parser 2.0.1
- Made parsing CSV files without header rows more convenient
- Fixed a compilation error with std::back_inserter on some systems
CSV Parser 2.0.0
- Parser now uses memory-mapped IO for reading from disk thanks to mio
- CSV files are read in smaller chunks to reduce memory footprint (but parsing is significantly faster)
- CSVReader::read_row() (and CSVReader::iterator) no longer blocks CSVReader::read_csv(), i.e. we can now simultaneously work on CSV data while reading more rows
- Parser internals completely rewritten to use more efficient and easier to maintain/debug data structures
- Fixed bug where single column files could not be parsed
- Fixed errors with parsing empty files
- CSVWriter::write_row() now works with std::array
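The note above about CSVReader::read_row() no longer blocking CSVReader::read_csv() can be pictured as a producer/consumer queue. The following is only a minimal sketch of that general idea using the standard library: RowQueue and read_concurrently are hypothetical names invented for illustration, and nothing here is taken from the parser's actual internals.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical row queue for illustration; NOT csv-parser's real data structure.
class RowQueue {
public:
    // Called by the producer (the "reading" side) for every parsed row.
    void push(std::string row) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_ && "pushing after close() is a logic error");
            rows_.push_back(std::move(row));
        }
        ready_.notify_one();
    }

    // Called by the producer once the input is exhausted.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Waits only until one row is available, or until the queue is closed
    // and drained; it never waits for the whole input to be read.
    std::optional<std::string> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !rows_.empty() || closed_; });
        if (rows_.empty()) return std::nullopt;
        std::string row = std::move(rows_.front());
        rows_.pop_front();
        return row;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> rows_;
    bool closed_ = false;
};

// Consumes rows on the calling thread while a worker thread produces them.
inline std::vector<std::string> read_concurrently(const std::vector<std::string>& source) {
    RowQueue queue;
    std::thread producer([&] {
        for (const auto& row : source) queue.push(row);
        queue.close();
    });

    std::vector<std::string> seen;
    while (auto row = queue.pop()) seen.push_back(std::move(*row));
    producer.join();
    return seen;
}
```

In this sketch the consumer can start processing the first row as soon as it is pushed, which is the property the release note describes; how the library achieves it internally may differ.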
CSV Parser 2.0 Beta: >300 MBps Edition
- Parser now uses memory-mapped IO for reading from disk
- On Windows, parser may map entire file into memory or mmap chunks of file iteratively based on available RAM (will extend to all OSes)
- Parser internals completely rewritten to use more efficient and easier to maintain/debug data structures
- New algorithm involves minimal copying
- Fixed bug where single column files could not be parsed
- Fixed errors with parsing empty files
Fixed memory errors when parsing large files
- Fixed issue with incorrect usage of string_view that led to memory errors when parsing large files such as the 1.4GB Craigslist vehicles dataset #90
- Added ability to have no quote character #83
- Changed VariableColumnPolicy::IGNORE to IGNORE_ROW to avoid clashing with IGNORE macro as defined by WinBase.h #96
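The rename to IGNORE_ROW can be illustrated with a small sketch of how an object-like macro collides with an enumerator of the same name. The macro below is defined by hand as a stand-in for the one from WinBase.h (its value of 0 is an assumption), the THROW and KEEP enumerators and their values are assumptions, and drops_row is a hypothetical helper.

```cpp
// Stand-in for the IGNORE macro that WinBase.h is reported to define.
// Defined by hand (value assumed) so the sketch compiles on any platform.
#define IGNORE 0

// With the macro in scope, an enumerator spelled IGNORE would be rewritten
// by the preprocessor into the literal 0 and fail to compile:
//     enum class Policy { THROW = -1, IGNORE = 0, KEEP = 1 };  // error
// Renaming the enumerator sidesteps the collision.
enum class VariableColumnPolicy { THROW = -1, IGNORE_ROW = 0, KEEP = 1 };

// Hypothetical helper, present only to exercise the renamed enumerator.
constexpr bool drops_row(VariableColumnPolicy policy) {
    return policy == VariableColumnPolicy::IGNORE_ROW;
}
```

Because the preprocessor runs before the compiler sees scopes, even an enum class member cannot shield itself from the macro; a distinct spelling is the simplest fix.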
Fixed bug with parsing very long rows
- Fixed bug with parsing very long rows (as reported in #92) when the length of the row was greater than 2^16 (the limit of unsigned short)
  - All instances of unsigned short have been replaced by internals::StrBufferPos (size_t) thus giving this parser the theoretical capability of parsing rows that are 2^64 characters long
- Fixed bug recognizing numbers in e-notation when the base did not have a decimal, e.g. 1E-06
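To see why rows longer than 2^16 characters were a problem, here is a minimal sketch of how a 16-bit position wraps around. It uses std::uint16_t as a stand-in for unsigned short (which is 16 bits wide on common platforms, though the standard only guarantees at least 16 bits); narrow_offset is a hypothetical function, not part of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: stores a buffer position in a 16-bit unsigned integer.
constexpr std::uint16_t narrow_offset(std::size_t offset) {
    // Conversion to an unsigned type is performed modulo 2^16,
    // so positions past 65535 silently wrap around.
    return static_cast<std::uint16_t>(offset);
}

// The largest position a 16-bit unsigned integer can hold is 2^16 - 1.
static_assert(std::numeric_limits<std::uint16_t>::max() == 65535,
              "16-bit unsigned maximum");
static_assert(narrow_offset(65535) == 65535, "last representable position");
static_assert(narrow_offset(65536) == 0, "first position that wraps to zero");
static_assert(narrow_offset(70000) == 4464, "70000 - 65536");
```

A position type as wide as size_t removes this wraparound for any row that fits in memory; the 2^64 figure in the note corresponds to platforms where size_t is 64 bits.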
Fixed bug with whitespace trimming when a field is entirely whitespace
Fixes incorrect CSV parsing when whitespace trimming is enabled and a field is composed entirely of whitespace characters as reported in #85
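The all-whitespace case is a common edge case in trimming code. The sketch below shows one way to handle it with std::string_view; trim is a hypothetical helper written for illustration and is not the parser's implementation, and the default whitespace set of space and tab is an assumption.

```cpp
#include <string_view>

// Hypothetical trimming helper; not taken from csv-parser.
constexpr std::string_view trim(std::string_view field,
                                std::string_view whitespace = " \t") {
    const auto first = field.find_first_not_of(whitespace);
    // An empty or all-whitespace field contains no other character, so
    // find_first_not_of returns npos; return an empty view in that case
    // instead of computing a substring from an invalid position.
    if (first == std::string_view::npos) return {};
    const auto last = field.find_last_not_of(whitespace);
    return field.substr(first, last - first + 1);
}
```

Without the npos check, the length passed to substr would be computed from two invalid positions, which is the kind of mistake that tends to surface only for fields made up entirely of whitespace.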
First class handling of variable column CSVs
- The behavior for parsing variable-column CSV files can now be simply defined using CSVFormat::variable_columns()
- Variable-column rows can be kept or silently dropped (default), or result in an error being thrown
- CSVReader::bad_row_handler() has been removed
- Many annoying clang/gcc warning messages fixed (thanks rpavlik!)
- CSV guessing implementation has been simplified (CSVGuesser is also gone now)
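The three behaviors for variable-column rows listed above (keep, silently drop, or throw) can be sketched with the standard library alone. ColumnPolicy, Row, and apply_policy are hypothetical names used only for this illustration; they do not reproduce the library's API, and the real configuration goes through CSVFormat::variable_columns().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical policy enum mirroring the three behaviors described above.
enum class ColumnPolicy { Throw, DropRow, Keep };

// Returns the rows that survive the policy. `expected` is the number of
// columns a well-formed row should have. Dropping is the default here,
// matching the default stated in the release note.
inline std::vector<Row> apply_policy(const std::vector<Row>& rows,
                                     std::size_t expected,
                                     ColumnPolicy policy = ColumnPolicy::DropRow) {
    assert(expected > 0 && "a CSV row has at least one column");
    std::vector<Row> kept;
    for (const Row& row : rows) {
        if (row.size() == expected || policy == ColumnPolicy::Keep) {
            kept.push_back(row);
        } else if (policy == ColumnPolicy::Throw) {
            throw std::runtime_error("row has an unexpected number of columns");
        }
        // ColumnPolicy::DropRow: skip the malformed row silently.
    }
    return kept;
}
```

Keeping the policy as a single enum value, rather than a user-supplied callback, mirrors the simplification the note describes with the removal of CSVReader::bad_row_handler().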
Fixed bug where get<>() threw incorrect overflow errors with unsigned integers
Fixed bug reported in #73
Fixed Issue with Leading Comments
Fixed issue described by #67 where leading comments got concatenated to the first column name
Fixed bug reading rows that begin with empty fields
- Fixed bug when CSV rows have leading empty fields (#57); all tests now pass
- Fixed some MSVC warnings