Hands-on Image Recognition with Scala, Spark and DeepLearning4j
Presented by Guglielmo Iozzia
Kiev, April 20th 2018
Something About Me
• Big Data Delivery Lead at Optum (UHG)
• Previously at IBM and FAO of the UN
• Current fields of expertise are Big Data, ML/DL and DevOps
• Past experience in JVM language (Java, Groovy, Scala) development, test automation, CI/CD
Something About Me
• Author of the upcoming book “Hands-on Deep Learning with Apache Spark”
• I love preparing home-made pizza
The Dublin Tech Hub
Agenda
• Scala for Data Science
• Deep Learning
  • Definition
  • Multilayer Neural Networks
• Apache Spark
  • Overview
• DeepLearning4j
  • ETL
  • CNN
• Image Recognition Example
Scala for Data Science
• Most of the systems and tools in the Big Data/ML space run on the JVM.
• Robustness and performance when it comes to production systems and large datasets.
• Plenty of frameworks are available.
Deep Learning
• It is a subset of machine learning that can solve particularly hard and large-scale problems in areas such as natural language processing or image classification.
• It is based on multilayer neural networks.
Deep Learning
Artificial Neural Networks
Multilayer Neural Networks
• The first layer, called the input layer, is where features are input.
• The last layer is called the output layer.
• Any layer that is not an input or output layer is a hidden layer.
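
The input, hidden and output layers described above can be illustrated with a minimal forward pass in plain Scala. This is only a sketch under stated assumptions: the object name `ForwardPassSketch`, the sigmoid activation, the layer sizes and the weight values are all hypothetical choices made for illustration, and none of it reflects how DeepLearning4j implements its layers.

```scala
object ForwardPassSketch {
  type Vec = Vector[Double]
  type Mat = Vector[Vec]

  // Logistic sigmoid, used here as the activation of every layer (an assumption).
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // One fully connected layer: `weights` holds one row per neuron of the layer.
  def dense(weights: Mat, bias: Vec, input: Vec): Vec =
    weights.zip(bias).map { case (row, b) =>
      sigmoid(row.zip(input).map { case (w, x) => w * x }.sum + b)
    }

  // Feed the input features through the layers in order.
  // The last (weights, bias) pair plays the role of the output layer;
  // every pair before it is a hidden layer.
  def forward(layers: Seq[(Mat, Vec)], input: Vec): Vec =
    layers.foldLeft(input) { case (activation, (w, b)) => dense(w, b, activation) }

  def main(args: Array[String]): Unit = {
    // Hypothetical network: 2 input features -> 3 hidden neurons -> 1 output neuron.
    val hidden = (
      Vector(Vector(0.5, -0.5), Vector(0.25, 0.75), Vector(-1.0, 1.0)),
      Vector(0.0, 0.0, 0.0)
    )
    val output = (Vector(Vector(1.0, -1.0, 0.5)), Vector(0.0))
    println(forward(Seq(hidden, output), Vector(1.0, 2.0)))
  }
}
```

Representing each layer as a (weights, bias) pair keeps the sketch short: adding a hidden layer only means adding one more pair to the sequence.
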
Multilayer Neural Networks
• Several variants exist, for example:
• CNN (Convolutional Neural Network): used in image recognition.
• RNN (Recurrent Neural Network): used in NLP.
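
To give an idea of why CNNs suit image recognition, the core operation of a convolutional layer can be sketched in plain Scala: a small kernel slides over the image and produces a feature map. The object name `Conv2dSketch`, the tiny image and the edge-detecting kernel are hypothetical, and the sketch assumes stride 1, no padding and a single channel. It is not DeepLearning4j code.

```scala
object Conv2dSketch {
  type Image = Vector[Vector[Double]]

  // "Valid" 2D cross-correlation (the operation deep learning libraries
  // usually call convolution), with stride 1 and no padding.
  def conv2d(image: Image, kernel: Image): Image = {
    val kh = kernel.length
    val kw = kernel.head.length
    val outH = image.length - kh + 1
    val outW = image.head.length - kw + 1
    Vector.tabulate(outH, outW) { (r, c) =>
      // Weighted sum of the image patch whose top-left corner is (r, c).
      (for {
        i <- 0 until kh
        j <- 0 until kw
      } yield image(r + i)(c + j) * kernel(i)(j)).sum
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical 3x3 image: dark on the left, bright in the last column.
    val image: Image = Vector(
      Vector(0.0, 0.0, 1.0),
      Vector(0.0, 0.0, 1.0),
      Vector(0.0, 0.0, 1.0)
    )
    // Hypothetical 1x2 kernel that responds to a left-to-right increase in brightness.
    val kernel: Image = Vector(Vector(-1.0, 1.0))
    conv2d(image, kernel).foreach(row => println(row.mkString(" ")))
  }
}
```

In a real CNN the kernel values are learned during training rather than written by hand, and many kernels are applied in parallel to produce several feature maps.
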
Apache Spark
• It is an Open Source fast cluster-computing platform.
• Data are loaded in distributed memory (RAM) over a cluster of machines.
• Compared to Hadoop MapReduce, it runs programs up to 100x faster when the data fits in memory, or 10x faster on disk.
• It provides support for Java, Scala, Python and R.
Apache Spark: job execution
Apache Spark: components
DeepLearning4j: Overview
• It is an Open Source distributed Deep Learning library for the JVM.
• It is integrated with Hadoop and Spark.
• It provides support for both CPUs and GPUs.
• It allows the import of neural net models from most major frameworks via Keras.
DeepLearning4j: libraries
• Deeplearning4J: Neural Net Platform
• ND4J: Numpy for the JVM
• DataVec: Tool for Machine Learning ETL Operations
• JavaCPP: The Bridge Between Java and Native C++
• Arbiter: Evaluation Tool for Machine Learning Algorithms
• RL4J: Deep Reinforcement Learning for the JVM
ETL with DL4j Live Example
DL4j CNN Live Example
Image Recognition Live Example
Useful Links
• Apache Spark: http://spark.apache.org/
• DeepLearning4j: https://deeplearning4j.org/
• Keras: https://keras.io/
Q & A
Wrap Up
• LinkedIn: https://ie.linkedin.com/in/giozzia
• Twitter: @GuglielmoIozzia
• Blog: googlielmo.blogspot.com
• DZone: https://dzone.com/users/2532948/virtualramblas.html
• Hands-On Deep Learning with Apache Spark: https://www.packtpub.com/big-data-and-business-intelligence/hands-deep-learning-apache-spark
