Add tf.node.decodeImage #285

kangyizhang · 2019-07-18T00:39:06Z

Add tf.node.decodeImage(), fix tensorflow/tfjs#1416

it detects image file type (BMP, GIF, JPEG, or PNG) based on the magic number of the byte chunk and performs the appropriate operation (decodePng, decodeJpeg, decodeBmp, decodeGif) to convert the provided file into a Tensor of int32. TensorFlow C code is also using the magic number to classify file type.
add underlying decodePng, decodeJpeg, decodeBmp, decodeGif which create a 3D or 4D Tensor from encoded image in Uint8Array format.
TensorFlow uses uint8|uint16 as dtype of image tensor, which is not supported in TFJS yet. Casting the dtype to int32 which is general used dtype of image tensor in tfjs.

This change is

nkreeger

This is really nice - thanks for tackling this. Just a handful of comments (mostly nits)

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @dsmilkov, @kangyizhang, and @nkreeger)

.npmignore, line 4 at r1 (raw file):

Quoted 4 lines of code…

src/**/*_test.jpeg src/**/*_test.bmp src/**/*_test.png src/**/*_test.gif

Any details on the origins of the files - do we have permissions to use them?

package.json, line 69 at r1 (raw file):

"package_name": "CPU-darwin-1.2.4.tar.gz"

I've noticed this too - when working on Mac OS - this gets reset. Something for us to think about how to fix in a future PR.

src/decode_image.ts, line 27 at r1 (raw file):

export function decodeJpeg(

If we're going to export these methods - we should have some docs on these. Some of these in-params seem like they might need explanation.

src/decode_image.ts, line 30 at r1 (raw file):

const backend = nodeBackend();

nit:

nodeBackend().decode*() seems easier than declaring a const for these straight-forward methods.

src/decode_image.ts, line 54 at r1 (raw file):

Tensor

4D tensor right?

src/decode_image.ts, line 54 at r1 (raw file):

int32

of dtype int32.

Looking at https://www.tensorflow.org/api_docs/python/tf/io/decode_image

It looks like they have the option to specify the dtype - is this something we want to consider here (default to int32).

src/decode_image.ts, line 118 at r1 (raw file):

if (buf.length > 3 && buf[0] === 255 && buf[1] === 216 && buf[2] === 255) {

Nice work on this - is there a reference you have? If so a URL link here would be great.

src/nodejs_kernel_backend.ts, line 111 at r1 (raw file):

use

s/uses

nsthorat

Reviewed 6 of 15 files at r1.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, and @kangyizhang)

src/decode_image.ts, line 27 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

export function decodeJpeg( 
If we're going to export these methods - we should have some docs on these. Some of these in-params seem like they might need explanation.

+1, you can use @doc for these and they'll show up in API docs.

src/decode_image.ts, line 85 at r1 (raw file):

 tryRecoverTruncated = false, acceptableFraction = 1, dctMethod = ''): Tensor4D { const image = fs.readFileSync(path);

I feel like we should use the async versions of these and make decodeImage async, wydt?

src/decode_image.ts, line 88 at r1 (raw file):

 const buf = Buffer.from(image); const imageType = getImageType(buf); const backend = nodeBackend();

what happens here if the nodeBackend isnt the active backend?

src/decode_image.ts, line 105 at r1 (raw file):

 .expandDims(0); case PNG: return backend.decodePng(uint8array, channels).toInt().expandDims(0);

this is leaking 2 tensors -- can you add a unit test for measuring memory usage? and then make sure we clean up here?

src/decode_image.ts, line 116 at r1 (raw file):

/** Helper function to get image type based on starting bytes of the file. */ function getImageType(buf: Buffer): string {

make this an enum

src/decode_image_test.ts, line 3 at r1 (raw file):

/** * @license * Copyright 2018 Google Inc. All Rights Reserved.

2019

src/decode_image_test.ts, line 19 at r1 (raw file):

import {test_util} from '@tensorflow/tfjs-core'; import {decodeImage} from './decode_image';

import this from the public API (from index) so that you are testing the public API

src/image_png_test.png, line 0 at r1 (raw file):
what about putting these in a separate test_images directory so they don't litter this directory?

src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):

 } decodeJpeg(

these aren't going to make it into the interface so can we put this somewhere not on the backend, maybe just directly in the functions in the other files?

nkreeger

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, and @nsthorat)

src/decode_image.ts, line 85 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

I feel like we should use the async versions of these and make decodeImage async, wydt?

Yeah good call - let's convert this to async.

src/decode_image.ts, line 88 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

what happens here if the nodeBackend isnt the active backend?

We can add a helper method in op_utils.ts to ensure that the right backend is loaded. https://github.com/tensorflow/tfjs-node/blob/master/src/ops/op_utils.ts

FWIW - tensorboard has the same problem:

tfjs-node/src/tensorboard.ts

Line 27 in 4c5bde8

this.backend = nodeBackend();

src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

these aren't going to make it into the interface so can we put this somewhere not on the backend, maybe just directly in the functions in the other files?

Not really w/o having to refactor a bunch of the internal helper methods. If we want to do this - let's punt for another PR (and file an issue).

nsthorat

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)

src/nodejs_kernel_backend.ts, line 1600 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

Not really w/o having to refactor a bunch of the internal helper methods. If we want to do this - let's punt for another PR (and file an issue).

sounds good!

kangyizhang

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @nkreeger, and @nsthorat)

.npmignore, line 4 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

src/**/*_test.jpeg src/**/*_test.bmp src/**/*_test.png src/**/*_test.gif 
Any details on the origins of the files - do we have permissions to use them?

I drew these by filling each pixels with a RGB value so the tests could check decoded pixel values. So permission is not an issue.

package.json, line 69 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

"package_name": "CPU-darwin-1.2.4.tar.gz" 
I've noticed this too - when working on Mac OS - this gets reset. Something for us to think about how to fix in a future PR.

Yea. I have some thoughts and I'll try to fix it in another PR. Filed tensorflow/tfjs#1756 to track.

src/decode_image.ts, line 27 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

+1, you can use @doc for these and they'll show up in API docs.

Done. I think it is good to export these ops to align with Python/C++ API. Docs added.

src/decode_image.ts, line 30 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

const backend = nodeBackend(); 
nit:

nodeBackend().decode*() seems easier than declaring a const for these straight-forward methods.

Done.

src/decode_image.ts, line 54 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

Tensor 
4D tensor right?

Thanks for checking this. After checking python/c++ op, it should be 3D tensor for decode jpeg/png/bmp, and 4D tensor for gif.

src/decode_image.ts, line 54 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

int32 
of dtype int32.

Looking at https://www.tensorflow.org/api_docs/python/tf/io/decode_image

It looks like they have the option to specify the dtype - is this something we want to consider here (default to int32).

I don't think we need to specify dtype. The dtype to specify in python is either uint8 (0~~255) or uint16 (0~~65535), the pixel values in either dtype are the same and the only difference is bits number. In tfjs int32 is enough to hold the pixel values.

src/decode_image.ts, line 85 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

Yeah good call - let's convert this to async.

Done.

src/decode_image.ts, line 88 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

We can add a helper method in op_utils.ts to ensure that the right backend is loaded. https://github.com/tensorflow/tfjs-node/blob/master/src/ops/op_utils.ts

FWIW - tensorboard has the same problem:

tfjs-node/src/tensorboard.ts

Line 27 in 4c5bde8

this.backend = nodeBackend();

Done. Added ensureTensorflowBackend() and use it in tensorflow as well.

src/decode_image.ts, line 105 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this is leaking 2 tensors -- can you add a unit test for measuring memory usage? and then make sure we clean up here?

Good catch. Wrapped this in tidy and added memory usage in all tests.

src/decode_image.ts, line 116 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

make this an enum

Done.

src/decode_image.ts, line 118 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

if (buf.length > 3 && buf[0] === 255 && buf[1] === 216 && buf[2] === 255) { 
Nice work on this - is there a reference you have? If so a URL link here would be great.

Reference for magic number and same logic in TensorFlow Core added.

src/decode_image_test.ts, line 3 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

2019

Done.

src/decode_image_test.ts, line 19 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

import this from the public API (from index) so that you are testing the public API

Done.

src/nodejs_kernel_backend.ts, line 111 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

use 
s/uses

Done.

src/image_png_test.png, line at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

what about putting these in a separate test_images directory so they don't litter this directory?

Great idea, it also make it easy to add fields in npmignore.

dsmilkov

Really nice work! Yay! Few comments mostly regarding the API

Reviewed 2 of 15 files at r1, 5 of 15 files at r2.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)

src/decode_image.ts, line 60 at r2 (raw file):

 */ /** * @doc {heading: 'Node.js', namespace: 'node'}

change heading to "Images". This is used to group similar ops together in our API docs under the same header. For example: https://js.tensorflow.org/api/latest/#Operations-Images

src/decode_image.ts, line 88 at r2 (raw file):

 */ /** * @doc {heading: 'Node.js', namespace: 'node'}

same here, heading "Images"

src/decode_image.ts, line 136 at r2 (raw file):

/** * Detects whether an image is a BMP, GIF, JPEG, or PNG, and performs the

nit: you don't have to explain the internal detail in the public facing API. Just say it decodes an image give the encoded bytes. Can take bmp, gif, jpeg or png. (no need to say it detects the type and calls internal ops)

src/decode_image.ts, line 170 at r2 (raw file):

 */ export async function decodeImage( path: string, channels = 0, ratio = 1, fancyUpscaling = true,

Let's align with the python API: https://www.tensorflow.org/api_docs/python/tf/io/decode_image

Couple things to note:

Let's take bytes instead of a filesystem path. This is more flexible since it allows the user to read those images from anywhere (including memory)
The JPG specific params are not exposed in tf.io.decode_image, so let's not expose it here
Let's expose the output dtype, which at this point we only support int32, but when we add the uint8 dtype we don't want to break API.

Same for the other methods elsewhere.

src/decode_image_test.ts, line 23 at r2 (raw file):

describe('decode images', () => { it('decode png', async () => {

nice mem usage tests!

src/decode_image_test.ts, line 45 at r2 (raw file):

 test_util.expectArraysEqual(await imageTensor.data(), [130, 50, 59, 124]); expect(memory().numTensors).toBe(beforeNumTensors + 1); });

add a test that passes invalid bytes (not an image) and expect that it throws

src/decode_image_test.ts, line 176 at r2 (raw file):

 } catch (err) { expect(err.message) .toBe('Expect the current backend to be tensorflow, but got cpu');

nit: wrap tensorflow and cpu in quotes to separate the backend name from the error message.

dsmilkov

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)

kangyizhang

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @doc, @dsmilkov, @nkreeger, and @nsthorat)

src/decode_image.ts, line 60 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

change heading to "Images". This is used to group similar ops together in our API docs under the same header. For example: https://js.tensorflow.org/api/latest/#Operations-Images

Done.

src/decode_image.ts, line 88 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

same here, heading "Images"

Done.

src/decode_image.ts, line 136 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

nit: you don't have to explain the internal detail in the public facing API. Just say it decodes an image give the encoded bytes. Can take bmp, gif, jpeg or png. (no need to say it detects the type and calls internal ops)

Done.

src/decode_image.ts, line 170 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Let's align with the python API: https://www.tensorflow.org/api_docs/python/tf/io/decode_image

Couple things to note:

Let's take bytes instead of a filesystem path. This is more flexible since it allows the user to read those images from anywhere (including memory)

The JPG specific params are not exposed in tf.io.decode_image, so let's not expose it here

Let's expose the output dtype, which at this point we only support int32, but when we add the uint8 dtype we don't want to break API.

Same for the other methods elsewhere.

Done.

src/decode_image_test.ts, line 45 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

add a test that passes invalid bytes (not an image) and expect that it throws

Done

src/decode_image_test.ts, line 176 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

nit: wrap tensorflow and cpu in quotes to separate the backend name from the error message.

Done

dsmilkov

Great work! LGTM (tiny comments)

Reviewed 1 of 15 files at r1, 8 of 15 files at r2, 4 of 4 files at r3.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @doc, @dsmilkov, @kangyizhang, @nkreeger, and @nsthorat)

src/decode_image.ts, line 81 at r3 (raw file):

 * 3: output an RGB image. * 4: output an RGBA image. * @param dtype The desired DType of the returned Tensor. Now it could only be

Maybe rephrase as "The data type of the result. Only int32 is supported at this time." (same for the other methods below).

src/decode_image.ts, line 138 at r3 (raw file):

/** * Decode an image file in provided path to a 3D or 4D Tensor of dtype `int32`.

Given the encoded bytes of an image, it returns a 3D or 4D tensor of the decoded image. Supports BMP, GIF, JPEG and PNG formats.

src/decode_image.ts, line 183 at r3 (raw file):

 return decodeGif(content); } else { // If not to expand animations, take first fram of the gif and return as

typo: first frame

src/decode_image.ts, line 192 at r3 (raw file):

 const newShape: [number, number, number] = shape as [number, number, number]; return gifTensor.slice(0, 1).reshape(newShape) as Tensor3D;

you can avoid doing shape manipulations, if you do
gifTensor.slice(0, 1).squeeze(0). Because of this you could squeeze the entire logic with branching in 2 lines:

const img = decodeGif(content); return expandAnimations ? img : img.slice(0, 1).squeeze(0);

dsmilkov

Reviewable status: complete! 2 of 1 approvals obtained (waiting on @doc, @kangyizhang, @nkreeger, and @nsthorat)

kangyizhang

Thanks for the review!

Reviewable status: complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)

src/decode_image.ts, line 81 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Maybe rephrase as "The data type of the result. Only int32 is supported at this time." (same for the other methods below).

Done.

src/decode_image.ts, line 138 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Given the encoded bytes of an image, it returns a 3D or 4D tensor of the decoded image. Supports BMP, GIF, JPEG and PNG formats.

Done.

src/decode_image.ts, line 183 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

typo: first frame

Done.

src/decode_image.ts, line 192 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

you can avoid doing shape manipulations, if you do
gifTensor.slice(0, 1).squeeze(0). Because of this you could squeeze the entire logic with branching in 2 lines:
const img = decodeGif(content); return expandAnimations ? img : img.slice(0, 1).squeeze(0);

Done. Thanks for pointing this out!

dsmilkov

Reviewable status: complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)

src/decode_image.ts, line 192 at r3 (raw file):

Previously, kangyizhang (Kangyi Zhang) wrote…

Done. Thanks for pointing this out!

Since you have the ternary check here, remove the if (expandAnimations) above

kangyizhang

Reviewable status: complete! 2 of 1 approvals obtained (waiting on @doc, @nkreeger, and @nsthorat)

src/decode_image.ts, line 192 at r3 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Since you have the ternary check here, remove the if (expandAnimations) above

Ah nice catch, thanks.

Kangyi Zhang added 5 commits July 17, 2019 17:07

Add decodeImage

5cea1f8

Format

f272faa

Fix lint

4fc0780

format

2e67396

fix lint

d543fb6

kangyizhang requested review from dsmilkov and nkreeger July 18, 2019 00:39

nkreeger approved these changes Jul 18, 2019

View reviewed changes

nsthorat reviewed Jul 18, 2019

View reviewed changes

nkreeger reviewed Jul 18, 2019

View reviewed changes

nsthorat reviewed Jul 18, 2019

View reviewed changes

kangyizhang added 2 commits July 18, 2019 12:12

address comments

e14654e

fix lint

cf170b8

kangyizhang commented Jul 18, 2019

View reviewed changes

dsmilkov suggested changes Jul 19, 2019

View reviewed changes

update api;

bbce71d

kangyizhang commented Jul 19, 2019

View reviewed changes

dsmilkov approved these changes Jul 19, 2019

View reviewed changes

fix nits

0699e9a

kangyizhang commented Jul 19, 2019

View reviewed changes

dsmilkov reviewed Jul 19, 2019

View reviewed changes

fix nit

449b2a1

kangyizhang commented Jul 19, 2019

View reviewed changes

kangyizhang merged commit 8b8e99b into master Jul 19, 2019

kangyizhang deleted the kangyi-image branch July 23, 2019 23:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add tf.node.decodeImage #285

Add tf.node.decodeImage #285

Uh oh!

kangyizhang commented Jul 18, 2019 •

edited

Loading

nkreeger left a comment

nsthorat left a comment

nkreeger left a comment

nsthorat left a comment

kangyizhang left a comment

dsmilkov left a comment

dsmilkov left a comment

kangyizhang left a comment

dsmilkov left a comment

dsmilkov left a comment

kangyizhang left a comment

dsmilkov left a comment

kangyizhang left a comment

Labels

4 participants

Add tf.node.decodeImage #285

Add tf.node.decodeImage #285

Uh oh!

Conversation

kangyizhang commented Jul 18, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

nkreeger left a comment

Choose a reason for hiding this comment

nsthorat left a comment

Choose a reason for hiding this comment

nkreeger left a comment

Choose a reason for hiding this comment

nsthorat left a comment

Choose a reason for hiding this comment

kangyizhang left a comment

Choose a reason for hiding this comment

dsmilkov left a comment

Choose a reason for hiding this comment

dsmilkov left a comment

Choose a reason for hiding this comment

kangyizhang left a comment

Choose a reason for hiding this comment

dsmilkov left a comment

Choose a reason for hiding this comment

dsmilkov left a comment

Choose a reason for hiding this comment

kangyizhang left a comment

Choose a reason for hiding this comment

dsmilkov left a comment

Choose a reason for hiding this comment

kangyizhang left a comment

Choose a reason for hiding this comment

Labels

4 participants

kangyizhang commented Jul 18, 2019 •

edited

Loading