-
- Notifications
You must be signed in to change notification settings - Fork 33.7k
buffer: add buffer.transcode #9038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about possibly placing this in lib/internal/buffer-transcode.js and conditionally require()'ing it. purely cosmetic though, to prevent an extra level of indent. or you could just return early. :)
if (!process.binding('config').hasIntl) return; lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We going to have a complaint about not supporting SharedArrayBuffer for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually, perhaps. Not too worried about that for now.
| Updated |
doc/api/buffer.md Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe give an example of transcoding is not possible?
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm – this could be a ThrowICUError function, right?
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine but at some point this might become a member of the MaybeStackBuffer class? I realize that would conflict a bit with the MaybeStackBuffer<UChar> overload, maybe leave a TODO here?
lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that final to residue from editing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, slight brain malfunction there I think ;-)
lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhhh this returns a string or a Buffer depending on the target encoding? I don’t think binary-to-text encodings should be allowed here, .toString() is the right method for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right. I'll pull these back out.
5b65f74 to 62423a0 Compare | @addaleax ... updated! PTAL |
addaleax left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice!
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this use SwapBytes16?
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(ditto)
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 1024 seem kind of magic here, although I realize that is largely my fault. 😄 (Not sure if there’s anything to do about that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed now!
lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: s/from_enc/fromEncoding/ and s/to_enc/toEncoding/. Ditto for cnv_from and cnv_to.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize you adapted this code from elsewhere but using snprintf() to format the error message will be much more efficient.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a if (e.Empty()) return Local<Value>();.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be if (!ret.Empty()) buf->Release(); - it's leaking memory now when the buffer can't be created.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: strict aliasing violation and prone to crashing.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there is ample opportunity to share code between Ucs2FromUtf8 and Utf8FromUcs2, they are 80% identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps. For now I'm more inclined to keep these separate as it makes finding and tweaking bugs a bit easier. I'll take another pass in a separate PR to condense things down.
src/util.h Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also reset length_?
src/util.h Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you move this into a common header, you might want to give it a slightly less generic name; e.g. SPREAD_BUFFER_ARG.
tools/icu/icu-generic.gyp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just remove the 'defines' block instead of commenting it out.
9e0464b to 2da087e Compare |
|
|
|
eecad5e to d7340ec Compare | CI looks good. @bnoordhuis PTAL... LGTY? |
| ping @bnoordhuis |
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're returning empty handles as an indicator it's probably be more "V8-ish" have the return signature as a MaybeLocal<Value> instead. been trying to do that in other locations myself.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
creating Local<Value>'s but there's no HandleScope. if this callback is expected to always be called within an existing HandleScope (like MakeCallback), mind putting a comment at the top. also like MakeCallback (see src/node.h).
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we know this is a v8::Object then can use e.As<Object>(). that's also more explicit that no extra handle is being created.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new v8::Maybe<T> API for v8::Object::Set() is annoying and ugly, but if if we're going to use some of the new API might as well use all of it.
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're manipulating the original memory, why bother take a copy?
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this operation fails, do we want to abort or throw?
src/node_i18n.cc Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for future performance enhancement, detect the alignment of the pointer and perform as many swaps that can be done in a single go.
lib/internal/buffer.js Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm a little confused by this whole object, but right here if we're converting ascii to latin1 shouldn't we be passing 'latin1' as the encoding argument to Buffer.from()?
6f70820 to 65144bb Compare | @bnoordhuis ... ok, reworked the implementation with an eye towards simplification and reducing duplication. PTAL |
| Thank you for the follow up review @addaleax. Will get this landed tomorrow if there are no further objections. |
Add buffer.transcode(source, from, to) method. Primarily uses ICU to transcode a buffer's content from one of Node.js' supported encodings to another. Originally part of a proposal to add a new unicode module. Decided to refactor the approach towrds individual PRs without a new module. Refs: nodejs#8075
65144bb to 4d7472b Compare | New CI run after squashing: https://ci.nodejs.org/job/node-test-pull-request/4665/ |
| green except for unrelated failures. landing |
Add buffer.transcode(source, from, to) method. Primarily uses ICU to transcode a buffer's content from one of Node.js' supported encodings to another. Originally part of a proposal to add a new unicode module. Decided to refactor the approach towrds individual PRs without a new module. Refs: #8075 PR-URL: #9038 Reviewed-By: Anna Henningsen <anna@addaleax.net>
| Landed in e8eaaa7 |
Add buffer.transcode(source, from, to) method. Primarily uses ICU to transcode a buffer's content from one of Node.js' supported encodings to another. Originally part of a proposal to add a new unicode module. Decided to refactor the approach towrds individual PRs without a new module. Refs: #8075 PR-URL: #9038 Reviewed-By: Anna Henningsen <anna@addaleax.net>
| If this is backported to any of the other release lines, it needs to come with #9838 |
Checklist
make -j8 test(UNIX), orvcbuild test nosign(Windows) passesAffected core subsystem(s)
buffer
Description of change
Add buffer.transcode(source, from, to) method. Primarily uses ICU to transcode a buffer's content from one of Node.js' supported encodings to another.
Originally part of a proposal to add a new unicode module. Decided to refactor the approach towrds individual PRs without a new module.
Refs: #8075
/cc @trevnorris @addaleax