gcanat (Contributor) commented Sep 23, 2024

On my machine, run time on a 1 GB utf8.txt decreased by 75%, from 7.4s to 1.85s.

jgarzik (Contributor) commented Sep 23, 2024

Nice.

Tiny comment: no need to pass the global table variable as a function parameter.

Will review and test in depth tomorrow.
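
For illustration only, the tiny comment above can be sketched as follows. This is not the PR's code: the table name `CHAR_START`, its contents, and the helper `build_table` are assumptions made up for this example. The point is only that a `static` table is already visible inside the function, so the function signature does not need a table parameter.

```rust
// Hypothetical 256-entry table (an assumption, not the PR's actual table):
// 1 for bytes that start a UTF-8 character, 0 for continuation bytes (0b10xxxxxx).
const fn build_table() -> [u8; 256] {
    let mut t = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        t[i] = if ((i as u8) & 0xC0) != 0x80 { 1 } else { 0 };
        i += 1;
    }
    t
}

// Module-level table, built at compile time.
static CHAR_START: [u8; 256] = build_table();

// The static is in scope here, so only the buffer is passed in.
fn count_chars(buf: &[u8]) -> usize {
    buf.iter().map(|&b| CHAR_START[b as usize] as usize).sum()
}
```

A call such as `count_chars("生履".as_bytes())` then needs only the byte slice as its argument.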

jgarzik (Contributor) commented Sep 24, 2024

Does this work if a multi-byte character straddles the boundary between two input buffers?
I.e., the first portion of the char is read by one call to file.read(), and the second portion by the next call to file.read()?

gcanat (Contributor, Author) commented Sep 24, 2024

Well, I just tested with a 200 MB file filled with these two chars, 生履, with no spaces, and I get the same result as GNU wc -m, wc2 -m and wz -c.
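
As a rough sketch of why a character split across two reads need not be a problem: this assumes the count is done by counting bytes that are not UTF-8 continuation bytes, which the thread does not confirm is the PR's method. Under that assumption each byte is classified on its own, so where the buffer boundary falls cannot change the total for valid UTF-8 input. The function names and the chunking helper are hypothetical.

```rust
/// Count UTF-8 characters by counting bytes that are NOT continuation bytes
/// (continuation bytes have the bit pattern 0b10xxxxxx).
/// Assumes the input is valid UTF-8.
fn count_chars(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

/// Hypothetical helper that mimics repeated read() calls by feeding the
/// input in fixed-size chunks. `chunk_size` must be non-zero, because
/// `slice::chunks` panics on a chunk size of 0.
fn count_chars_chunked(data: &[u8], chunk_size: usize) -> usize {
    data.chunks(chunk_size).map(count_chars).sum()
}
```

With input made only of 生 and 履 (three bytes each), a chunk size such as 4 or 5 forces characters to straddle chunk boundaries, yet `count_chars_chunked` returns the same total as `count_chars` on the whole buffer. A decoder that keeps per-character state would instead have to carry that state over from one buffer to the next.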

@jgarzik jgarzik merged commit 21e1ef1 into rustcoreutils:main Sep 25, 2024
@gcanat gcanat deleted the wc_chars branch September 25, 2024 07:44