Object Extraction from Satellite Imagery using Deep Learning Aly Osama
Presenter
• Aly Osama
• Research Software Development Engineer at Microsoft
• Contact:
  • Email: alyosamah@gmail.com
  • https://www.linkedin.com/in/alyosama/
  • https://github.com/alyosama
Agenda
1. Needed data and its size
2. How training will go
3. How evaluation should be carried out
4. Which learning tools to use, and why
5. Literature survey on the subject
6. Code sample / preliminary results
Overview
Key words:
1. Satellite Imagery
2. Object Extraction
3. Deep Learning
4. Python
1. Satellite Imagery
• Objects are often very small (~20 pixels in size), for example at a resolution of 0.5 m/pixel.
• Input images are enormous (often hundreds of megapixels).
• Images can have more than the 3 RGB channels; the channels are called bands.
• Image formats:
  • Images (GeoTIFF, etc.)
  • Labels (GeoJSON, WKT)
• On the positive side:
  • The physical and pixel scale of objects is usually known in advance.
  • There is low variation in observation angle.
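The two constraints above (tiny objects, enormous images) are commonly handled by cutting each scene into overlapping tiles before feeding it to a network. The following is a minimal sketch of that idea, not a method taken from the slides: the helper names `object_size_px` and `tile_windows`, the 512-pixel tile size and the 64-pixel overlap are all illustrative assumptions.

```python
def object_size_px(object_size_m, gsd_m_per_px):
    """Approximate object extent in pixels for a given ground sample distance."""
    return object_size_m / gsd_m_per_px


def tile_windows(width, height, tile=512, overlap=64):
    """Yield (x, y, w, h) windows that together cover a width x height image.

    Neighbouring tiles overlap by `overlap` pixels so that a small object
    cut by one tile border appears whole in the adjacent tile.
    Tile size and overlap are hypothetical defaults.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    stride = tile - overlap
    y = 0
    while True:
        h = min(tile, height - y)
        x = 0
        while True:
            w = min(tile, width - x)
            yield (x, y, w, h)
            if x + tile >= width:
                break
            x += stride
        if y + tile >= height:
            break
        y += stride


# A 10 m object at 0.5 m/pixel spans about 20 pixels.
print(object_size_px(10, 0.5))

# Windows for a small 1000 x 600 example image.
windows = list(tile_windows(1000, 600))
print(len(windows), windows[0], windows[-1])
```

Tiles on the right and bottom edges can be smaller than the nominal tile size; in practice they would be padded or shifted inward before being passed to a fixed-input network.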
Datasets
• SpaceNet
  • Source: https://aws.amazon.com/public-datasets/spacenet/
  • Tutorial: https://medium.com/the-downlinq/building-extraction-with-yolt2-and-spacenet-data-a926f9ffac4f
  • 5 solutions: https://github.com/SpaceNetChallenge/BuildingDetectors
• DSTL Satellite Imagery (Kaggle competition)
  • https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
  • Tutorial: https://www.kaggle.com/torrinos/dstl-satellite-imagery-feature-detection/exploration-and-plotting/run/553107
• DeepSat
• UC Merced Land Use
• Satellite websites
  • https://modis.gsfc.nasa.gov/
  • https://sentinel.esa.int/web/sentinel/home
  • https://landsat.usgs.gov/
2. Training
• Experiments
  1. VGGNet (baseline)
     • Fine-tune the pretrained model (transfer learning) on the available data
     • Data augmentation, such as:
       • Random crops / scales
       • Color jitter
  2. Faster R-CNN or YOLO
     • For detection and localization
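Two of the augmentations listed above, random crops and color jitter, can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions rather than the pipeline used for the experiments: the function names, the 8-band random stand-in image, and the gain and shift ranges are all hypothetical, and random scaling is left out.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of an (H, W, bands) array."""
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop is larger than the image")
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


def color_jitter(image, rng, max_gain=0.1, max_shift=0.05):
    """Apply a random per-band gain and offset to a float image in [0, 1]."""
    bands = image.shape[2]
    gain = 1.0 + rng.uniform(-max_gain, max_gain, size=bands)
    shift = rng.uniform(-max_shift, max_shift, size=bands)
    return np.clip(image * gain + shift, 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((64, 64, 8))  # random stand-in for an 8-band tile
patch = color_jitter(random_crop(image, 32, 32, rng), rng)
print(patch.shape)
```

For detection or segmentation, the same crop offsets would also have to be applied to the labels (boxes or masks) so that image and annotation stay aligned.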
General Tips
3. Evaluation
• Accuracy :D
• Precision and Recall
• Jaccard index (Intersection over Union)
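As a worked example of the metrics above, the sketch below computes them for a binary per-pixel labelling (1 = object, 0 = background). The function name `scores` and the toy labels are assumptions for illustration only.

```python
def scores(y_true, y_pred):
    """Accuracy, precision, recall and Jaccard index for binary labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Jaccard index = intersection / union of the two object masks.
        "jaccard": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Toy example: 3 object pixels out of 10, two of them found, one false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(scores(y_true, y_pred))
```

In this toy case accuracy is 0.8 while the Jaccard index is only 0.5: when most pixels are background, accuracy is dominated by true negatives, which is why precision, recall and the Jaccard index are more informative for small objects.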
4. Tools
• Hardware
  • Deep Learning AMI, Amazon Linux version
    • https://aws.amazon.com/marketplace/pp/B01M0AXXQB
  • Powerful machine with GPUs
    • NVIDIA GTX Titan X
• Software
  • Caffe + Python
    • Pretrained models
  • Lasagne / Keras framework
    • High-level Python
    • Backend independent (TensorFlow or Theano)
    • Multi-GPU
  • Caffe2 (18-4-2017)
• Utility
  • QGIS is an open-source tool for managing and editing GeoTIFF and GeoJSON files
Tools Comparison
5. Literature survey on the subject
1- A Survey on Object Detection in Optical Remote Sensing Images (2016), Gong Cheng, Junwei Han
Cheng et al. 2016
Cheng et al. 2016
• Deep learning papers
  • (Han et al., 2015; Tang et al., 2015; Wang et al., 2015; Zhou et al., 2015b)
• Datasets
  • NWPU VHR-10 dataset (Cheng et al., 2014a)
  • SZTAKI-INRIA building detection dataset (Benedek et al., 2012)
  • TAS aerial car detection dataset (Heitz and Koller, 2008)
  • Overhead imagery research dataset (OIRDS) (Tanner et al., 2009)
  • IITM road extraction dataset (Das et al., 2011)
FAST AIRCRAFT DETECTION IN SATELLITE IMAGES BASED ON CONVOLUTIONAL NEURAL NETWORKS (Wu et al., 2015)
Region Proposals TODO : Search on R-CNN
FULLY CONVOLUTIONAL NETWORKS FOR BUILDING AND ROAD EXTRACTION: PRELIMINARY RESULTS (2016), Zilong Zhong
• Dataset
  • Massachusetts road dataset and building dataset
  • Each image consists of 3×1500×1500 pixels
  • Contains 1,711 aerial images
• The FCN's computational cost could be much higher than that of ordinary object recognition models.
Building detection in very high resolution multispectral data with deep learning features (2016)
• AlexNet -> features + SVM (last layer)
Road network extraction: a neural-dynamic framework based on deep learning and a finite state machine (Wang et al., 2015)
• CNN + FSM (for sequence)
Do Deep Features Generalize from Everyday Objects to Remote Sensing and Aerial Scenes Domains? (2016), Otávio A. B. Penatti
• ConvNet using Caffe and OverFeat
Using convolutional networks and satellite imagery to identify patterns in urban environments at a large scale (2017), Adrian Albert, Massachusetts Institute of Technology
• Datasets:
  • UC Merced land use dataset [25] (2100 images spanning 21 classes)
  • DeepSat land use benchmark dataset (4 channels)
• Models: VGGNet and ResNet
6. Code sample / preliminary results
• (Test) DSTL
  • https://www.kaggle.com/alyosama/dstl-satellite-imagery-feature-detection/convnet-baseline/
YOLT2
• Actual F1 score: 0.21
• Jaccard index between 0.4 and 0.5
• https://medium.com/the-downlinq/building-extraction-with-yolt2-and-spacenet-data-a926f9ffac4f
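The F1 score and Jaccard index quoted above are related when detections are scored per object: a proposal counts as a true positive only if its IoU with a ground-truth footprint reaches a threshold. The sketch below shows this kind of scoring on axis-aligned boxes. The 0.5 threshold, the greedy matching rule, the box format and the function names are all assumptions for illustration and are not taken from the slides.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def f1_at_iou(truth, proposals, threshold=0.5):
    """Object-level F1 with greedy one-to-one matching.

    Proposals are assumed to be sorted by decreasing confidence.
    """
    unmatched = list(truth)
    tp = 0
    for p in proposals:
        best, best_iou = None, 0.0
        for i, t in enumerate(unmatched):
            iou = box_iou(p, t)
            if iou > best_iou:
                best, best_iou = i, iou
        if best is not None and best_iou >= threshold:
            tp += 1
            unmatched.pop(best)  # each ground-truth box matches at most once
    fp = len(proposals) - tp
    fn = len(unmatched)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposals = [(0, 0, 10, 10), (24, 20, 34, 30)]  # second box IoU is about 0.43
print(f1_at_iou(truth, proposals))
```

Under these assumptions, a proposal whose IoU falls just short of the threshold is counted both as a false positive and as a missed object, so a model whose boxes mostly overlap the truth at 0.4 to 0.5 can still end up with a low F1.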
CosmiQNet
• Black-box fully convolutional neural network: CosmiQNet. The inputs are at two resolutions, and the output distance transform matches the lower of the input resolutions. The resolution of the 8-band GeoTIFF is roughly one quarter (in each dimension) of the resolution of the 3-band GeoTIFF; the difference in resolution is depicted by the scale of the GeoTIFFs.
• https://medium.com/the-downlinq/object-detection-on-spacenet-5e691961d257
Resources
1. DSTL Satellite Imagery Competition
2. https://medium.com/@avanetten
3. https://www.kernix.com/blog/image-classification-with-a-pre-trained-deep-neural-network_p11
alyosamah@gmail.com Thank you
