Page1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved One click Hadoop clusters - anywhere April 16th, 2015 Janos Matyas, Senior Director of Engineering
Page2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Overview • Introduction • Goals and motivations • Technology stack • How it works • Results/achievements/future plans • Demo and Q&A
Page3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Goals and motivations • Full Hadoop stack provisioning – everywhere • Automate and unify the process • Zero-configuration approach • Same process through a cluster lifecycle (Dev, QA, UAT, Prod) • Provide tooling - UI, REST API and CLI/shell • Secure and multi-tenant • SLA policy based autoscaling
Page4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Technology stack • Docker • Swarm • Consul • Apache Ambari
Page5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Docker • Container based virtualization • Lightweight and portable • Build once, run anywhere • Ease of packaging applications • Automated and scripted • Isolated
Page6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Docker – How it works • Containers are isolated, but share OS and bins/libraries • No need to emulate hardware
Page7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Swarm • Native clustering for Docker • Distributed container orchestration • Same API as Docker
Page8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Swarm – How it works • Swarm managers/agents • Discovery services • Advanced scheduling
Page9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Consul • Service discovery/registry • Health checking • Key/Value store • DNS • Multi datacenter aware
Page10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Consul – How it works • Consul servers/agents • Consistency through a quorum (RAFT) • Scalability due to gossip based protocol (SWIM) • Decentralized and fault tolerant • Highly available • Consistency over availability (CP) • Multiple interfaces - HTTP and DNS • Support for watches
Page11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Apache Ambari • Easy Hadoop cluster provisioning • Management and monitoring • Key feature - Blueprints • REST API, CLI shell • Extensible • Stacks • Services • Views
Page12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Apache Ambari – How it works • Ambari server/agents • Define a blueprint (blueprint.json) • Define a host mapping (hostmapping.json) • Post the cluster create
Page13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak Cloudbreak is a cloud-agnostic Hadoop as a Service API. Abstracts the provisioning and ease management and monitoring of on-demand clusters. Cloudbreak is a powerful left surf that breaks over a coral reef, a mile off southwest the island of Tavarua, Fiji.
Page14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak • Benefits • Zero configuration • Elastic • Secure • Infrastructure agnostic • Heterogenous clusters • Auto-scaling • Main REST resources • /template – specify an instance group infrastructure • /stack – creates an infrastructure based on a template • /blueprint – describes a Hadoop cluster • /cluster – creates a Hadoop cluster
Page15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak – How it works • Start VMs - with a running Docker daemon • Cloudbreak Bootstrap • Start Consul Cluster • Start Swarm Cluster (Consul for discovery) • Start Ambari servers/agents - Swarm API • Ambari services registered in Consul (Registrator) • Post Blueprint
Page16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak - Features • Extensible – easy to implement Service Provider Interface • Cloudbreak “recipes” • Automate host configuration • Pre/post Ambari lifecycle hooks • Services reconfiguration • Automate/execute custom actions • Side – effects • Ambari CLI/shell and Groovy based client • Cloud Foundry’s UAA Dockerized • Munchausen – bootstrap Swarm with Consul • Dockerized full Hadoop stack (Apache Hadoop 60K+, Ambari 12K+, Spark 10K+ downloads)
Page17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak - Hadoop as a Service API • Public tech preview • Microsoft Azure • Amazon AWS • Google Cloud Platform • OpenStack • Private tech preview – R&D • Bare metal • Rackspace Managed Cloud • HP Helion Public Cloud *integration SPI is available
Page18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Cloudbreak – SPI • Cloud providers have very different API, though model is very similar • Non – invasive implementation • One interface to implement - CloudPlatformConnector Network Security Group Image SubnetSubnet RulesRules Instance VolumeVolumes VolumeIP Address UserData Instance VolumeVolumes VolumeIP Address Instance VolumeVolumes VolumeIP Address
Page19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Periscope Periscope is a heuristic Hadoop scheduler associated with a QoS profile. Built on YARN schedulers, cloud and VM resource management API's it allows to associate SLA's to applications and customers. Periscope is a powerful, fast, thick and top- to-bottom right-hander, eastward from Sumbawa's famous west-coast.
Page20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Periscope • Benefits • Zero configuration • Metric and time based alarms • SLA policy based autoscaling • Secure • Hostgroup specific • Main REST resources • /clusters – specify a cluster to be monitored • /alerts– time and metric based • /policies – specify an SLA policy for a cluster based on an alarm • /applications – specify an SLA policy for an application (under development)
Page21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Periscope – How it works • Configures/monitors alarms in Ambari • Setup alarm, cooldown periods • Manages cluster sizes • Allow to associate SLA scaling policies to alarms • Orchestrates Cloudbreak to up/downscale the cluster
Page22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Demo and Q&A

One Click Hadoop Clusters - Anywhere (Using Docker)

  • 1.
    Page1 © HortonworksInc. 2011 – 2015. All Rights Reserved One click Hadoop clusters - anywhere April 16th, 2015 Janos Matyas, Senior Director of Engineering
  • 2.
    Page2 © HortonworksInc. 2011 – 2015. All Rights Reserved Overview • Introduction • Goals and motivations • Technology stack • How it works • Results/achievements/future plans • Demo and Q&A
  • 3.
    Page3 © HortonworksInc. 2011 – 2015. All Rights Reserved Goals and motivations • Full Hadoop stack provisioning – everywhere • Automate and unify the process • Zero-configuration approach • Same process through a cluster lifecycle (Dev, QA, UAT, Prod) • Provide tooling - UI, REST API and CLI/shell • Secure and multi-tenant • SLA policy based autoscaling
  • 4.
    Page4 © HortonworksInc. 2011 – 2015. All Rights Reserved Technology stack • Docker • Swarm • Consul • Apache Ambari
  • 5.
    Page5 © HortonworksInc. 2011 – 2015. All Rights Reserved Docker • Container based virtualization • Lightweight and portable • Build once, run anywhere • Ease of packaging applications • Automated and scripted • Isolated
  • 6.
    Page6 © HortonworksInc. 2011 – 2015. All Rights Reserved Docker – How it works • Containers are isolated, but share OS and bins/libraries • No need to emulate hardware
  • 7.
    Page7 © HortonworksInc. 2011 – 2015. All Rights Reserved Swarm • Native clustering for Docker • Distributed container orchestration • Same API as Docker
  • 8.
    Page8 © HortonworksInc. 2011 – 2015. All Rights Reserved Swarm – How it works • Swarm managers/agents • Discovery services • Advanced scheduling
  • 9.
    Page9 © HortonworksInc. 2011 – 2015. All Rights Reserved Consul • Service discovery/registry • Health checking • Key/Value store • DNS • Multi datacenter aware
  • 10.
    Page10 © HortonworksInc. 2011 – 2015. All Rights Reserved Consul – How it works • Consul servers/agents • Consistency through a quorum (RAFT) • Scalability due to gossip based protocol (SWIM) • Decentralized and fault tolerant • Highly available • Consistency over availability (CP) • Multiple interfaces - HTTP and DNS • Support for watches
  • 11.
    Page11 © HortonworksInc. 2011 – 2015. All Rights Reserved Apache Ambari • Easy Hadoop cluster provisioning • Management and monitoring • Key feature - Blueprints • REST API, CLI shell • Extensible • Stacks • Services • Views
  • 12.
    Page12 © HortonworksInc. 2011 – 2015. All Rights Reserved Apache Ambari – How it works • Ambari server/agents • Define a blueprint (blueprint.json) • Define a host mapping (hostmapping.json) • Post the cluster create
  • 13.
    Page13 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak Cloudbreak is a cloud-agnostic Hadoop as a Service API. Abstracts the provisioning and ease management and monitoring of on-demand clusters. Cloudbreak is a powerful left surf that breaks over a coral reef, a mile off southwest the island of Tavarua, Fiji.
  • 14.
    Page14 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak • Benefits • Zero configuration • Elastic • Secure • Infrastructure agnostic • Heterogenous clusters • Auto-scaling • Main REST resources • /template – specify an instance group infrastructure • /stack – creates an infrastructure based on a template • /blueprint – describes a Hadoop cluster • /cluster – creates a Hadoop cluster
  • 15.
    Page15 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak – How it works • Start VMs - with a running Docker daemon • Cloudbreak Bootstrap • Start Consul Cluster • Start Swarm Cluster (Consul for discovery) • Start Ambari servers/agents - Swarm API • Ambari services registered in Consul (Registrator) • Post Blueprint
  • 16.
    Page16 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak - Features • Extensible – easy to implement Service Provider Interface • Cloudbreak “recipes” • Automate host configuration • Pre/post Ambari lifecycle hooks • Services reconfiguration • Automate/execute custom actions • Side – effects • Ambari CLI/shell and Groovy based client • Cloud Foundry’s UAA Dockerized • Munchausen – bootstrap Swarm with Consul • Dockerized full Hadoop stack (Apache Hadoop 60K+, Ambari 12K+, Spark 10K+ downloads)
  • 17.
    Page17 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak - Hadoop as a Service API • Public tech preview • Microsoft Azure • Amazon AWS • Google Cloud Platform • OpenStack • Private tech preview – R&D • Bare metal • Rackspace Managed Cloud • HP Helion Public Cloud *integration SPI is available
  • 18.
    Page18 © HortonworksInc. 2011 – 2015. All Rights Reserved Cloudbreak – SPI • Cloud providers have very different API, though model is very similar • Non – invasive implementation • One interface to implement - CloudPlatformConnector Network Security Group Image SubnetSubnet RulesRules Instance VolumeVolumes VolumeIP Address UserData Instance VolumeVolumes VolumeIP Address Instance VolumeVolumes VolumeIP Address
  • 19.
    Page19 © HortonworksInc. 2011 – 2015. All Rights Reserved Periscope Periscope is a heuristic Hadoop scheduler associated with a QoS profile. Built on YARN schedulers, cloud and VM resource management API's it allows to associate SLA's to applications and customers. Periscope is a powerful, fast, thick and top- to-bottom right-hander, eastward from Sumbawa's famous west-coast.
  • 20.
    Page20 © HortonworksInc. 2011 – 2015. All Rights Reserved Periscope • Benefits • Zero configuration • Metric and time based alarms • SLA policy based autoscaling • Secure • Hostgroup specific • Main REST resources • /clusters – specify a cluster to be monitored • /alerts– time and metric based • /policies – specify an SLA policy for a cluster based on an alarm • /applications – specify an SLA policy for an application (under development)
  • 21.
    Page21 © HortonworksInc. 2011 – 2015. All Rights Reserved Periscope – How it works • Configures/monitors alarms in Ambari • Setup alarm, cooldown periods • Manages cluster sizes • Allow to associate SLA scaling policies to alarms • Orchestrates Cloudbreak to up/downscale the cluster
  • 22.
    Page22 © HortonworksInc. 2011 – 2015. All Rights Reserved Demo and Q&A

Editor's Notes

  • #2 Two days ago I was working for SequenceIQ, as the CTO.
  • #3  ----- Meeting Notes (10/04/15 20:35) ----- SequenceIQ been acquired. Started February, quickly gain trackion around June.
  • #4  ----- Meeting Notes (10/04/15 20:38) ----- We were doing this over and over again. Scripted, Ansible, tried everything and all existing tools.
  • #5  ----- Meeting Notes (10/04/15 20:38) ----- Architecturally most important components
  • #7  ----- Meeting Notes (10/04/15 20:56) ----- Under the hood is built on: 1. cgroup and namespacing capabilities of the Linux kernel 2. Docker image specification - filesystem composed of layers, presented as one cohesive filesystem Recommended 3.8, works from 2.6.2 3. Libcontainer specification - namespacing, filesystem, resources (cgroups)
  • #8  ----- Meeting Notes (10/04/15 20:56) ----- Docker simplifies things - on one host. We span up containers remotely on many hosts- how? Swarm pulls together many Docker engines - presents as one virtual Docker Engine.
  • #9  ----- Meeting Notes (10/04/15 20:56) ----- Steps: Can span us Docker containers remotely on hosts considering: 1. Resource management - aware of the cluster resources (e.g. can schedule it with bin packing - anywhere where 1GB memory is available) or randomly 2. Constraints using labels (label one node and stsrt the container based on labels) 3. Affinity - containers can be co-scheduled (link, vollumes-from, net=container on the same host)
  • #10  ----- Meeting Notes (10/04/15 21:05) ----- We have a dynamic scaling cluster where nodes are coming/leaving but also failing. Register services in consul, like Ambari services Zookeeper, doozerd, etcd – same as Consul, requires a quorom, offer strong consistency, but not datacenter aware Zookeeper: no service discovery, offers primitive K/V, no DNS, does not go through DC Zookeeper provides ephemeral nodes – but stil clients need to habe keep-alive connections
  • #11 Agent – long running daemon, serves DNS and HTTP interface, every node Client – an agent that forwards all RPC to server. Takes part in LAN gossip Server - participates in RAFT quorum, responds to RPC, WAN gossip Datacenter – low latency, high bandwith private network Gossip – TCP and UDP UNICAST. Usually Broadcast/Multicast does not work in cloud Strong consistency: Service catalog stores all the nodes, service instances, health check data, ACLs, and Key/Value information. It is strongly consistent, and replicated using the consensus protocol. Gossip – eventual consistency, updates to catalog comes through gossip, thus state can lag behind until is reconciled.
  • #12 Most likely you’ve seen an Ambari session Its extensible : Stacks – set of services, multiple versions (e.g. HDP 2.1, HDP 2.2, Bigtop) Services – e.g HDFS, Kafka, Zeppelin Views – capability to add visualization, management and monitoring capabilities of a new “application”
  • #13 Pre-install the server and agents.
  • #14 Combining all these – welcome Cloudbreak. Zero configuration way to provision HDP cluters – anywhere by the push of a button, CLI or API. One consistent infrastructure agnostic API.
  • #15  ----- Meeting Notes (10/04/15 21:47) ----- Expand on points No configuration, need to have a running infrastructure. Any size - 200 nodes in 8 min. OAuth2, gateway (Knox will come), TLS Since YARN - Different services - different instance types: e.g. Spark - high memory, Kafka - high disk thorughput but memory as well to buffer active read/writes Scale based on load
  • #16 View from 10000 meter high Only thing we need is a Docker daemon. All cloud providers are going towards Docker
  • #17 Kerberos – we take the pain (Dockerized a Kerberos server) Recipes – built on Consul events, read results from the K/V store Anybody can push his own plugin: we use plugn – instal lyour plugin, and use it from Cloudbreak We did different projects, fixed quite a few interesting problems.
  • #21 Zero config, does not require pre-installation Can set alarms – based on alarms SLA policies. ----- Meeting Notes (10/04/15 22:04) ----- New features in hadoop 2.6 Our contribution, plus lots of others (move applications between queues), admission control - reserve capacity over time Most likely Vinod explained all these.
  • #22 Mention Baywatch ELK ElasticSearch, Logstash, Kibana – aggregate logs and metrics.
  • #23 Will be a Webex