- Notifications
You must be signed in to change notification settings - Fork 117
Add tf.node.decodeImage #285
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really nice - thanks for tackling this. Just a handful of comments (mostly nits)
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @dsmilkov, @kangyizhang, and @nkreeger)
.npmignore, line 4 at r1 (raw file):
Quoted 4 lines of code…
src/**/*_test.jpeg src/**/*_test.bmp src/**/*_test.png src/**/*_test.gif
Any details on the origins of the files - do we have permissions to use them?
package.json, line 69 at r1 (raw file):
"package_name": "CPU-darwin-1.2.4.tar.gz"
I've noticed this too - when working on Mac OS - this gets reset. Something for us to think about how to fix in a future PR.
src/decode_image.ts, line 27 at r1 (raw file):
export function decodeJpeg(
If we're going to export these methods - we should have some docs on these. Some of these in-params seem like they might need explanation.
src/decode_image.ts, line 30 at r1 (raw file):
const backend = nodeBackend();
nit:
nodeBackend().decode*() seems easier than declaring a const for these straight-forward methods.
src/decode_image.ts, line 54 at r1 (raw file):
Tensor
4D tensor right?
src/decode_image.ts, line 54 at r1 (raw file):
int32
of dtype int32
.
Looking at https://www.tensorflow.org/api_docs/python/tf/io/decode_image
It looks like they have the option to specify the dtype - is this something we want to consider here (default to int32).
src/decode_image.ts, line 118 at r1 (raw file):
if (buf.length > 3 && buf[0] === 255 && buf[1] === 216 && buf[2] === 255) {
Nice work on this - is there a reference you have? If so a URL link here would be great.
src/nodejs_kernel_backend.ts, line 111 at r1 (raw file):
use
s/uses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 15 files at r1.
Reviewable status:complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, and @kangyizhang)
src/decode_image.ts, line 27 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
export function decodeJpeg(
If we're going to export these methods - we should have some docs on these. Some of these in-params seem like they might need explanation.
+1, you can use @doc for these and they'll show up in API docs.
src/decode_image.ts, line 85 at r1 (raw file):
tryRecoverTruncated = false, acceptableFraction = 1, dctMethod = ''): Tensor4D { const image = fs.readFileSync(path);
I feel like we should use the async versions of these and make decodeImage async, wydt?
src/decode_image.ts, line 88 at r1 (raw file):
const buf = Buffer.from(image); const imageType = getImageType(buf); const backend = nodeBackend();
what happens here if the nodeBackend isnt the active backend?
src/decode_image.ts, line 105 at r1 (raw file):
.expandDims(0); case PNG: return backend.decodePng(uint8array, channels).toInt().expandDims(0);
this is leaking 2 tensors -- can you add a unit test for measuring memory usage? and then make sure we clean up here?
src/decode_image.ts, line 116 at r1 (raw file):
/** Helper function to get image type based on starting bytes of the file. */ function getImageType(buf: Buffer): string {
make this an enum
src/decode_image_test.ts, line 3 at r1 (raw file):
/** * @license * Copyright 2018 Google Inc. All Rights Reserved.
2019
src/decode_image_test.ts, line 19 at r1 (raw file):
import {test_util} from '@tensorflow/tfjs-core'; import {decodeImage} from './decode_image';
import this from the public API (from index) so that you are testing the public API
src/image_png_test.png, line 0 at r1 (raw file):
what about putting these in a separate test_images directory so they don't litter this directory?
src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):
} decodeJpeg(
these aren't going to make it into the interface so can we put this somewhere not on the backend, maybe just directly in the functions in the other files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, and @nsthorat)
src/decode_image.ts, line 85 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
I feel like we should use the async versions of these and make decodeImage async, wydt?
Yeah good call - let's convert this to async.
src/decode_image.ts, line 88 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
what happens here if the nodeBackend isnt the active backend?
We can add a helper method in op_utils.ts
to ensure that the right backend is loaded. https://github.com/tensorflow/tfjs-node/blob/master/src/ops/op_utils.ts
FWIW - tensorboard has the same problem:
Line 27 in 4c5bde8
this.backend = nodeBackend(); |
src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
these aren't going to make it into the interface so can we put this somewhere not on the backend, maybe just directly in the functions in the other files?
Not really w/o having to refactor a bunch of the internal helper methods. If we want to do this - let's punt for another PR (and file an issue).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)
src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
Not really w/o having to refactor a bunch of the internal helper methods. If we want to do this - let's punt for another PR (and file an issue).
sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @nkreeger, and @nsthorat)
.npmignore, line 4 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
src/**/*_test.jpeg src/**/*_test.bmp src/**/*_test.png src/**/*_test.gif
Any details on the origins of the files - do we have permissions to use them?
I drew these by filling each pixels with a RGB value so the tests could check decoded pixel values. So permission is not an issue.
package.json, line 69 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
"package_name": "CPU-darwin-1.2.4.tar.gz"
I've noticed this too - when working on Mac OS - this gets reset. Something for us to think about how to fix in a future PR.
Yea. I have some thoughts and I'll try to fix it in another PR. Filed tensorflow/tfjs#1756 to track.
src/decode_image.ts, line 27 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
+1, you can use @doc for these and they'll show up in API docs.
Done. I think it is good to export these ops to align with Python/C++ API. Docs added.
src/decode_image.ts, line 30 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
const backend = nodeBackend();
nit:
nodeBackend().decode*() seems easier than declaring a const for these straight-forward methods.
Done.
src/decode_image.ts, line 54 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
Tensor
4D tensor right?
Thanks for checking this. After checking python/c++ op, it should be 3D tensor for decode jpeg/png/bmp, and 4D tensor for gif.
src/decode_image.ts, line 54 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
int32
of dtype
int32
.Looking at https://www.tensorflow.org/api_docs/python/tf/io/decode_image
It looks like they have the option to specify the dtype - is this something we want to consider here (default to int32).
I don't think we need to specify dtype. The dtype to specify in python is either uint8 (0255) or uint16 (065535), the pixel values in either dtype are the same and the only difference is bits number. In tfjs int32 is enough to hold the pixel values.
src/decode_image.ts, line 85 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
Yeah good call - let's convert this to async.
Done.
src/decode_image.ts, line 88 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
We can add a helper method in
op_utils.ts
to ensure that the right backend is loaded. https://github.com/tensorflow/tfjs-node/blob/master/src/ops/op_utils.tsFWIW - tensorboard has the same problem:
Line 27 in 4c5bde8
this.backend = nodeBackend();
Done. Added ensureTensorflowBackend()
and use it in tensorflow as well.
src/decode_image.ts, line 105 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
this is leaking 2 tensors -- can you add a unit test for measuring memory usage? and then make sure we clean up here?
Good catch. Wrapped this in tidy and added memory usage in all tests.
src/decode_image.ts, line 116 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
make this an enum
Done.
src/decode_image.ts, line 118 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
if (buf.length > 3 && buf[0] === 255 && buf[1] === 216 && buf[2] === 255) {
Nice work on this - is there a reference you have? If so a URL link here would be great.
Reference for magic number and same logic in TensorFlow Core added.
src/decode_image_test.ts, line 3 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
2019
Done.
src/decode_image_test.ts, line 19 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
import this from the public API (from index) so that you are testing the public API
Done.
src/nodejs_kernel_backend.ts, line 111 at r1 (raw file):
Previously, nkreeger (Nick Kreeger) wrote…
use
s/uses
Done.
src/image_png_test.png, line at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
what about putting these in a separate test_images directory so they don't litter this directory?
Great idea, it also make it easy to add fields in npmignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice work! Yay! Few comments mostly regarding the API
Reviewed 2 of 15 files at r1, 5 of 15 files at r2.
Reviewable status:complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)
src/decode_image.ts, line 60 at r2 (raw file):
*/ /** * @doc {heading: 'Node.js', namespace: 'node'}
change heading to "Images". This is used to group similar ops together in our API docs under the same header. For example: https://js.tensorflow.org/api/latest/#Operations-Images
src/decode_image.ts, line 88 at r2 (raw file):
*/ /** * @doc {heading: 'Node.js', namespace: 'node'}
same here, heading "Images"
src/decode_image.ts, line 136 at r2 (raw file):
/** * Detects whether an image is a BMP, GIF, JPEG, or PNG, and performs the
nit: you don't have to explain the internal detail in the public facing API. Just say it decodes an image give the encoded bytes. Can take bmp, gif, jpeg or png. (no need to say it detects the type and calls internal ops)
src/decode_image.ts, line 170 at r2 (raw file):
*/ export async function decodeImage( path: string, channels = 0, ratio = 1, fancyUpscaling = true,
Let's align with the python API: https://www.tensorflow.org/api_docs/python/tf/io/decode_image
Couple things to note:
- Let's take bytes instead of a filesystem path. This is more flexible since it allows the user to read those images from anywhere (including memory)
- The JPG specific params are not exposed in tf.io.decode_image, so let's not expose it here
- Let's expose the output dtype, which at this point we only support int32, but when we add the uint8 dtype we don't want to break API.
Same for the other methods elsewhere.
src/decode_image_test.ts, line 23 at r2 (raw file):
describe('decode images', () => { it('decode png', async () => {
nice mem usage tests!
src/decode_image_test.ts, line 45 at r2 (raw file):
test_util.expectArraysEqual(await imageTensor.data(), [130, 50, 59, 124]); expect(memory().numTensors).toBe(beforeNumTensors + 1); });
add a test that passes invalid bytes (not an image) and expect that it throws
src/decode_image_test.ts, line 176 at r2 (raw file):
} catch (err) { expect(err.message) .toBe('Expect the current backend to be tensorflow, but got cpu');
nit: wrap tensorflow and cpu in quotes to separate the backend name from the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @nkreeger, and @nsthorat)
src/decode_image.ts, line 60 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
change heading to "Images". This is used to group similar ops together in our API docs under the same header. For example: https://js.tensorflow.org/api/latest/#Operations-Images
Done.
src/decode_image.ts, line 88 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
same here, heading "Images"
Done.
src/decode_image.ts, line 136 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
nit: you don't have to explain the internal detail in the public facing API. Just say it decodes an image give the encoded bytes. Can take bmp, gif, jpeg or png. (no need to say it detects the type and calls internal ops)
Done.
src/decode_image.ts, line 170 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Let's align with the python API: https://www.tensorflow.org/api_docs/python/tf/io/decode_image
Couple things to note:
- Let's take bytes instead of a filesystem path. This is more flexible since it allows the user to read those images from anywhere (including memory)
- The JPG specific params are not exposed in tf.io.decode_image, so let's not expose it here
- Let's expose the output dtype, which at this point we only support int32, but when we add the uint8 dtype we don't want to break API.
Same for the other methods elsewhere.
Done.
src/decode_image_test.ts, line 45 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
add a test that passes invalid bytes (not an image) and expect that it throws
Done
src/decode_image_test.ts, line 176 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
nit: wrap tensorflow and cpu in quotes to separate the backend name from the error message.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! LGTM (tiny comments)
Reviewed 1 of 15 files at r1, 8 of 15 files at r2, 4 of 4 files at r3.
Reviewable status:complete! 2 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)
src/decode_image.ts, line 81 at r3 (raw file):
* 3: output an RGB image. * 4: output an RGBA image. * @param dtype The desired DType of the returned Tensor. Now it could only be
Maybe rephrase as "The data type of the result. Only int32
is supported at this time." (same for the other methods below).
src/decode_image.ts, line 138 at r3 (raw file):
/** * Decode an image file in provided path to a 3D or 4D Tensor of dtype `int32`.
Given the encoded bytes of an image, it returns a 3D or 4D tensor of the decoded image. Supports BMP, GIF, JPEG and PNG formats.
src/decode_image.ts, line 183 at r3 (raw file):
return decodeGif(content); } else { // If not to expand animations, take first fram of the gif and return as
typo: first frame
src/decode_image.ts, line 192 at r3 (raw file):
const newShape: [number, number, number] = shape as [number, number, number]; return gifTensor.slice(0, 1).reshape(newShape) as Tensor3D;
you can avoid doing shape manipulations, if you do
gifTensor.slice(0, 1).squeeze(0)
. Because of this you could squeeze the entire logic with branching in 2 lines:
const img = decodeGif(content); return expandAnimations ? img : img.slice(0, 1).squeeze(0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 1 approvals obtained (waiting on @doc, @kangyizhang, @nkreeger, and @nsthorat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
Reviewable status:
complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)
src/decode_image.ts, line 81 at r3 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Maybe rephrase as "The data type of the result. Only
int32
is supported at this time." (same for the other methods below).
Done.
src/decode_image.ts, line 138 at r3 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Given the encoded bytes of an image, it returns a 3D or 4D tensor of the decoded image. Supports BMP, GIF, JPEG and PNG formats.
Done.
src/decode_image.ts, line 183 at r3 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
typo: first frame
Done.
src/decode_image.ts, line 192 at r3 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
you can avoid doing shape manipulations, if you do
gifTensor.slice(0, 1).squeeze(0)
. Because of this you could squeeze the entire logic with branching in 2 lines:const img = decodeGif(content); return expandAnimations ? img : img.slice(0, 1).squeeze(0);
Done. Thanks for pointing this out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)
src/decode_image.ts, line 192 at r3 (raw file):
Previously, kangyizhang (Kangyi Zhang) wrote…
Done. Thanks for pointing this out!
Since you have the ternary check here, remove the if (expandAnimations) above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)
src/decode_image.ts, line 192 at r3 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Since you have the ternary check here, remove the if (expandAnimations) above
Ah nice catch, thanks.
Add tf.node.decodeImage(), fix tensorflow/tfjs#1416
it detects image file type (BMP, GIF, JPEG, or PNG) based on the magic number of the byte chunk and performs the appropriate operation (decodePng, decodeJpeg, decodeBmp, decodeGif) to convert the provided file into a Tensor of int32. TensorFlow C code is also using the magic number to classify file type.
add underlying decodePng, decodeJpeg, decodeBmp, decodeGif which create a 3D or 4D Tensor from encoded image in Uint8Array format.
TensorFlow uses
uint8
|uint16
as dtype of image tensor, which is not supported in TFJS yet. Casting the dtype to int32 which is general used dtype of image tensor in tfjs.This change is