    <dependency>
      <groupId>com.github.marschall</groupId>
      <artifactId>line-parser</artifactId>
      <version>0.5.0</version>
    </dependency>

An mmap() based line parser for cases when:
- the start byte position of a line in the file is required
- the length in bytes of a line is required
- only a few characters of every line are required
In these cases this library can theoretically be more efficient than BufferedReader because:
- the copy operations of buffered IO are avoided
- the allocation and resizing of an intermediate StringBuffer is avoided
- the allocation of the final String is avoided, only the required substrings are allocated
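The idea behind the points above can be illustrated with a minimal sketch that uses only JDK classes. It is not this library's implementation: the class name `MmapLineScanner` and its `scan` method are hypothetical, and the sketch assumes a file smaller than 2 GiB (a single mapping), `\n` or `\r\n` line terminators, and an ASCII-compatible encoding such as UTF-8 or ISO-8859-1. It records the start byte position and the byte length of every line without creating a single String.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the line-parser API.
final class MmapLineScanner {

  // Returns one {offset, length} pair (both in bytes) per line.
  // The line terminator is not included in the length.
  static List<long[]> scan(Path path) throws IOException {
    List<long[]> lines = new ArrayList<>();
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
      long size = channel.size();
      if (size == 0) {
        return lines; // nothing to map
      }
      // Assumption: size <= Integer.MAX_VALUE, otherwise map() throws.
      MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
      int limit = buffer.limit();
      int lineStart = 0;
      for (int i = 0; i < limit; i++) {
        if (buffer.get(i) == '\n') {
          int end = i;
          // Treat "\r\n" as a single terminator.
          if (end > lineStart && buffer.get(end - 1) == '\r') {
            end--;
          }
          lines.add(new long[] {lineStart, end - lineStart});
          lineStart = i + 1;
        }
      }
      // Last line without a trailing terminator.
      if (lineStart < limit) {
        lines.add(new long[] {lineStart, limit - lineStart});
      }
    }
    return lines;
  }
}
```

The bytes are read directly from the mapped region, so there is no copy into a reader buffer and no intermediate StringBuffer. Decoding only the few bytes that are actually needed would be a separate step.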
The performance may still be slower than a BufferedReader based approach, but it should consume much less memory bandwidth and produce only a fraction of the garbage.
As this project gives you a CharSequence instead of a String, you may want to have a look at the charsequences project, which gives you some of the String convenience methods while avoiding allocation.
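To show what String-like convenience on a CharSequence can look like without allocating, here is a small sketch. The class `CharSequenceHelpers` and both of its methods are hypothetical and written only for illustration; they are not the API of the charsequences project. The sketch runs on Java 8.

```java
// Hypothetical helpers, not taken from the charsequences project.
final class CharSequenceHelpers {

  // Like String.startsWith, but works on any CharSequence and allocates nothing.
  static boolean startsWith(CharSequence s, CharSequence prefix) {
    int prefixLength = prefix.length();
    if (prefixLength > s.length()) {
      return false;
    }
    for (int i = 0; i < prefixLength; i++) {
      if (s.charAt(i) != prefix.charAt(i)) {
        return false;
      }
    }
    return true;
  }

  // Parses the ASCII decimal digits in [start, end) without creating a substring.
  // Throws NumberFormatException on non-digits and ArithmeticException on overflow.
  static int parseUnsignedInt(CharSequence s, int start, int end) {
    if (start >= end) {
      throw new NumberFormatException("empty range");
    }
    int value = 0;
    for (int i = start; i < end; i++) {
      char c = s.charAt(i);
      if (c < '0' || c > '9') {
        throw new NumberFormatException("not a digit: " + c);
      }
      value = Math.addExact(Math.multiplyExact(value, 10), c - '0');
    }
    return value;
  }
}
```

With helpers like these, a line can be filtered or a numeric field extracted directly from the CharSequence, and a String only has to be created for the parts that are kept.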
- the main parsing loop is likely to benefit from on-stack replacement (OSR)
- if you're using UTF-8 with a BOM then the BOM is returned as well
- if you're using UTF-16 with a BOM then the BOM is returned as well
- the library runs on Java 8 but is also a Java 9 module that only requires the jdk.unsupported module besides the java.base module
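Since the BOM is returned as well, callers that do not want it have to remove it themselves. The following sketch assumes that the BOM shows up as the decoded character U+FEFF at the start of the first line's content, which is how both the UTF-8 and the UTF-16 BOM decode; this is an assumption and has not been checked against the library. The class `BomStripper` is hypothetical.

```java
// Hypothetical helper, not part of the line-parser API.
final class BomStripper {

  // U+FEFF is the character that a UTF-8 or UTF-16 BOM decodes to.
  private static final char BOM = '\uFEFF';

  // Returns the content without a leading BOM, or the input itself if there is none.
  // Assumption: only the first line of a file needs to be passed through this method.
  static CharSequence stripBom(CharSequence firstLine) {
    if (firstLine.length() > 0 && firstLine.charAt(0) == BOM) {
      return firstLine.subSequence(1, firstLine.length());
    }
    return firstLine;
  }
}
```

Whether `subSequence` allocates depends on the CharSequence implementation, so the check is best applied to the first line only.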
    LineParser parser = new LineParser();
    parser.forEach(path, cs, (line) -> {
      System.out.printf("[%d,%d]%s%n", line.getOffset(), line.getLength(), line.getContent());
    });