This document summarizes a project that used a deep learning model to predict depth images from single RGB images. It discusses existing solutions using stereo cameras or Kinect devices. The project used the NYU Depth V2 dataset, splitting it into training, validation, and test sets. It implemented a model based on previous work, training it on RGB-D image pairs for 35 epochs but achieving only moderate results due to limited training data. The code and results are available online for further exploration.
Depth Images Prediction from a Single RGB Image

Introduction
- In 3D computer graphics, a depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint.
- RGB-D image: an RGB image and its corresponding depth image.
- A depth image is an image channel in which each pixel relates to a distance between the image plane and the corresponding object in the RGB image.
To approximate the depth of objects:
• Stereo camera: a camera with two or more lenses that simulates human binocular vision.
• RealSense or Kinect sensors, which capture RGB-D images directly.
• Deep learning.
Deep Learning for depth estimation:
Recently, many works have addressed estimating the depth map from a single RGB image, for example: Learning Fine-Scaled Depth Maps from Single RGB Images (7 Feb 2017).
Dataset: NYU Depth V2
The NYU-Depth V2 dataset comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.
The dataset consists of:
• 1449 labeled pairs of aligned RGB and depth images (2.8 GB).
• 407,024 new unlabeled frames: raw RGB and depth (428 GB).
• Toolbox: useful functions for manipulating the data and labels.
Different parts of the dataset can be downloaded individually.
Authors: Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus (2012).
For this project:
• Office 1-2 dataset (part of the whole dataset).
• 15 GB after processing the raw data.
• 3522 RGB-D images.
Split the data (3522 images):
• 80% (2817 images) for training and validation: 2414 training, 403 validation.
• 20% (705 images) for test.
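The split above can be reproduced with a few lines of NumPy. This is only a sketch: the function name `split_indices`, the shuffling, and the fixed seed are assumptions for illustration, and the project's own split_data function may work differently.

```python
import numpy as np

def split_indices(n_samples=3522, trainval_fraction=0.8, n_val=403, seed=0):
    """Return disjoint index arrays for training, validation and test."""
    # Assumption: samples are shuffled before splitting (not stated in the slides).
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_trainval = int(n_samples * trainval_fraction)  # 2817 of 3522
    trainval, test = order[:n_trainval], order[n_trainval:]
    val, train = trainval[:n_val], trainval[n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 2414 403 705
```

Splitting indices rather than the image arrays keeps the RGB and depth images paired: the same index array is applied to both.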
The Model for Depth Estimation:
The model was proposed by Jan Ivanecký in his master's thesis (2016). He derived it from Eigen et al., Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture (17 Dec 2015).
The model is built from three networks:
• Global context network: estimates a rough depth map of the whole scene from the input RGB image.
• Gradient network: estimates the horizontal and vertical gradients of the depth map globally, for the whole RGB image.
• Refining network: improves the rough estimate from the global context network, using the gradients estimated by the gradient network and the input RGB image.
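To make "horizontal and vertical gradients of the depth map" concrete, the sketch below computes them from a known depth map with forward differences. The slides do not say which gradient operator the thesis uses, so the forward difference, the zero padding, and the name `depth_gradients` are assumptions; the gradient network predicts such maps from the RGB image rather than computing them from depth.

```python
import numpy as np

def depth_gradients(depth):
    """Forward-difference gradients of a 2-D depth map.

    Returns (horizontal, vertical), each with the same shape as `depth`.
    The last column (horizontal) and last row (vertical) are left at zero.
    """
    depth = np.asarray(depth, dtype=np.float64)
    horizontal = np.zeros_like(depth)
    vertical = np.zeros_like(depth)
    horizontal[:, :-1] = depth[:, 1:] - depth[:, :-1]  # change along image columns
    vertical[:-1, :] = depth[1:, :] - depth[:-1, :]    # change along image rows
    return horizontal, vertical

# A ramp whose depth grows by 1 per column and is constant down each column.
ramp = np.tile(np.arange(4.0), (3, 1))
gx, gy = depth_gradients(ramp)
```

For the ramp, `gx` is 1 everywhere except the padded last column and `gy` is 0 everywhere, which is the kind of edge and slope information the refining network can use to sharpen the rough estimate.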
Global context network: the model is derived from AlexNet.
[Figure: Architecture of the global context network]
Loss Function: root mean squared error in log space (rms-log).
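The rms-log error is commonly defined as sqrt((1/n) * sum_i (log d_i - log d*_i)^2), where d_i is the predicted depth and d*_i the ground-truth depth at pixel i. The NumPy sketch below follows that definition. The `eps` clipping is an assumption added here to avoid log(0), which can occur once labels are scaled to [0, 1]; the project's own loss function is written in TensorFlow and may differ.

```python
import numpy as np

def rms_log(pred, target, eps=1e-6):
    """Root mean squared error between log-depths."""
    # Assumption: clip to eps so that zero depths do not produce -inf.
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, None)
    target = np.clip(np.asarray(target, dtype=np.float64), eps, None)
    diff = np.log(pred) - np.log(target)
    return float(np.sqrt(np.mean(diff ** 2)))

perfect = rms_log([0.5, 0.25], [0.5, 0.25])  # identical maps give 0.0
off_by_e = rms_log([np.e, np.e], [1.0, 1.0])  # every log-depth is off by 1
```

Because the error is measured on log-depths, a prediction that is wrong by a constant factor is penalized equally at near and far pixels.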
Training the Network:
1- Scale the label (depth) images to [0, 1].
2- Subtract 127 from the input images to center the data (a simple form of normalization).
3- Initialize the convolution layers from an AlexNet pre-trained CNN (transfer learning).
4- Train the network in batches (batch size = 32) for 35 epochs.
5- Save the session and model at the end of each epoch.
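Steps 1 and 2 can be sketched in NumPy as follows. The slides do not say how the labels are scaled, so the min-max scaling over the whole array is an assumption, as is the function name `preprocess`.

```python
import numpy as np

def preprocess(rgb_batch, depth_batch):
    """Center RGB inputs around zero and scale depth labels to [0, 1]."""
    # Step 2: subtract 127 from the 0..255 pixel values.
    rgb = np.asarray(rgb_batch, dtype=np.float32) - 127.0
    # Step 1 (assumption: min-max scaling over the whole array).
    depth = np.asarray(depth_batch, dtype=np.float32)
    d_min, d_max = depth.min(), depth.max()
    scale = d_max - d_min
    depth01 = (depth - d_min) / scale if scale > 0 else np.zeros_like(depth)
    return rgb, depth01

rgb_in = np.array([[[0, 127, 255]]], dtype=np.uint8)  # one pixel, three channels
depth_in = np.array([[1.0, 3.0]])                     # nearest and farthest depth
rgb_out, depth_out = preprocess(rgb_in, depth_in)
```

Converting to float32 before subtracting matters: subtracting 127 from a uint8 array directly would wrap around for pixel values below 127.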
Project Functions:
1- split_data: split the data and save it into training/testing/val.npy files.
2- load_data: load data from .npy files.
3- plot_imgs: plot a pair of images.
4- get_next_batch: get the next batch from the training data.
5- loss: calculate the loss function.
6- model: create the model (network structure).
7- train: start training.
8- evaluate: evaluate new data after restoring the model.
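As an illustration of the batching helper in this list, here is one way get_next_batch could be written as a generator. Only the function name comes from the slides; the signature, the per-epoch shuffling, and the handling of the final partial batch are assumptions, so the version in the repository may look different.

```python
import numpy as np

def get_next_batch(images, depths, batch_size=32, seed=None):
    """Yield (image_batch, depth_batch) pairs that cover the data once."""
    n = len(images)
    # Assumption: a fresh random order for each pass over the data.
    order = np.random.default_rng(seed).permutation(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        # The same indices select both arrays, so pairs stay aligned.
        yield images[idx], depths[idx]

# Toy data: 70 "images" whose depth label is 10x the image value.
demo_images = np.arange(70).reshape(70, 1)
demo_depths = demo_images * 10
batch_sizes = [len(x) for x, _ in get_next_batch(demo_images, demo_depths, seed=0)]
```

With 70 samples and a batch size of 32, one pass yields two full batches and a final batch of 6.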
Project Tools and Libraries:
1- TensorFlow.
2- Slim: a lightweight library for defining, training and evaluating complex models in TensorFlow.
3- TensorBoard.
4- NumPy.
5- Matplotlib.
Project Results:
Explanation:
• The training data is not sufficient.
In Jan's experiment:
• The full NYU dataset plus 3 datasets generated from the original one.
• The network was trained for 100,000 iterations.
This experiment:
• It took ~26 hours for 30 epochs.
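A rough count of training iterations puts the two experiments side by side. The sketch assumes all 2414 training images are used in every epoch and that the final partial batch counts as one iteration; both epoch counts mentioned in the slides (35 and 30) are shown.

```python
import math

def total_iterations(n_train, batch_size, epochs):
    """Number of weight updates when every epoch covers the data once."""
    return math.ceil(n_train / batch_size) * epochs

steps_35 = total_iterations(2414, 32, 35)
steps_30 = total_iterations(2414, 32, 30)
print(steps_35, steps_30)            # 2660 2280
print(round(100_000 / steps_35))     # 38
```

Under these assumptions the project ran on the order of 2,300 to 2,700 iterations, roughly 38 times fewer than the 100,000 iterations reported for the thesis experiment.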
Project:
The project code and data will be available on GitHub: https://github.com/SubhiH/Depth-Estimation-Deep-Learning
Resources:
- https://arxiv.org/pdf/1607.00730.pdf
- http://janivanecky.com/
- http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html