IBM Platform Computing Elastic Storage
Gord Sissons, Platform Symphony Product Marketing
Scott Campbell, Platform Symphony Product Manager
Rohit Valia, Director, Product Marketing
© 2014 IBM Corporation
Traditional Storage
(diagram: two independent filers)
Traditional Storage
(diagram: filer sprawl has grown to eight independent filers)
Solution: global workload sharing and resource-balanced storage
(diagram: four filers consolidated under a global namespace with automated storage tiering)
Elastic Storage provides massively parallel, scale-out storage: a global namespace with automated storage tiering.
Elastic Storage – Key Features

Extreme Scalability (see the arithmetic sketch below)
• Maximum file system size: 1 million yottabytes
• 2^64 files per file system
• Maximum file size equals file system size
• Customers with 18 PB file systems
• IPv6
• Future-proof
• Commodity hardware

Proven Reliability
• Snapshots, replication
• Built-in heartbeat, automatic failover/failback
• Add/remove on the fly
• Rolling upgrades
• Administer from any node
• Commodity hardware

High Performance
• Parallel file access
• Distributed, scalable, high-performance metadata
• Flash acceleration
• Automatic tiering
• Over 400 GB/s
• Commodity hardware
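To put the scalability figures in perspective, here is a small back-of-the-envelope check in Python. The numbers are taken at face value from the slide (architectural limits, not tested configurations).

```python
# Back-of-the-envelope check of the scalability figures quoted above.
YB = 10**24                     # one yottabyte, in bytes
max_fs_bytes = 10**6 * YB       # "1 million yottabytes", as stated on the slide
max_files = 2**64               # files per file system

print(f"Max file system size:  {max_fs_bytes:.3e} bytes")  # 1.000e+30
print(f"Max files/file system: {max_files:.3e}")           # 1.845e+19

# The largest customer deployment cited (18 PB) is a tiny fraction of that:
pb18 = 18 * 10**15
print(f"18 PB / architectural limit = {pb18 / max_fs_bytes:.1e}")  # 1.8e-14
```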
Supported storage hardware
In addition to IBM storage, IBM General Parallel File System (GPFS™) supports storage hardware from these vendors:
• EMC
• Hitachi
• Hewlett-Packard
• DDN
GPFS supports many storage systems, and the IBM support team can help customers using storage hardware solutions not on this list of tested devices.
Supported server hardware

GPFS for x86 architecture is supported on both Linux® and Windows Server 2008, on multiple x86- and AMD-compatible systems:
• IBM Intelligent Cluster
• IBM iDataPlex®
• IBM System x® rack-optimized servers
• IBM BladeCenter® servers
• Non-IBM x86- and AMD-compatible servers

GPFS for IBM POWER Systems™ is supported on both IBM AIX® and Linux, on multiple IBM POWER platforms:
• IBM System p®
• BladeCenter servers
• IBM Blue Gene®
Sharing Data Across an Organization
• 1993: GPFS introduced concurrent file system access from multiple nodes
• 2005: Multi-cluster expands the global namespace by connecting multiple sites
• 2011: AFM takes the global namespace truly global by automatically managing asynchronous replication of data
Global Namespace
Clients in every cluster access the same paths: /global/data1, /global/data2, /global/data3, /global/data4, /global/data5, /global/data6.
• File system store1: local filesets /data3, /data4; cache filesets /data1, /data2, /data5, /data6
• File system store2: local filesets /data1, /data2; cache filesets /data3, /data4, /data5, /data6
• File system store3: local filesets /data5, /data6; cache filesets /data1, /data2, /data3, /data4
See all data from any cluster; cache as much data as required, or fetch data on demand (modeled in the sketch below).
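A toy Python model may make the local/cache split concrete: every cluster resolves the same /global paths, but each fileset is "local" to exactly one file system and an AFM cache everywhere else. The mapping mirrors the slide; nothing here is a real GPFS API.

```python
# Which file system owns the local copy of each fileset (from the slide).
HOME = {
    "/data1": "store2", "/data2": "store2",
    "/data3": "store1", "/data4": "store1",
    "/data5": "store3", "/data6": "store3",
}

def role(fileset: str, file_system: str) -> str:
    """How a given file system sees a fileset: owner or AFM cache."""
    return "local" if HOME[fileset] == file_system else "cache (AFM)"

for fs in ("store1", "store2", "store3"):
    print(fs, {d: role(d, fs) for d in sorted(HOME)})
```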
Elastic Storage Data Life Cycle Management
(diagram: a single namespace spanning SSD, SAS, and SATA pools plus a CIFS file system, with external TSM, LTFS, and HPSS tiers)
Use Elastic Storage filesets and ILM policies to control data placement, deletion, and movement across storage tiers (pools), as in the sketch below.
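Real GPFS ILM policies are written in the product's SQL-like policy language and applied with mmapplypolicy; as a language-neutral illustration, the Python sketch below captures the kind of age- and size-based decision such a policy encodes. The pool names and thresholds are hypothetical.

```python
def choose_pool(days_since_access: float, size_bytes: int) -> str:
    """Pick a target pool the way a simple age/size ILM rule might.
    Pools and thresholds are illustrative, not product defaults."""
    if days_since_access < 7 and size_bytes < (1 << 30):
        return "ssd"      # small, hot files stay on flash
    if days_since_access < 30:
        return "sas"
    if days_since_access < 365:
        return "sata"
    return "tape"         # cold data migrates to the external (HSM) pool

# A 5 GiB file untouched for 90 days would be migrated to the SATA pool;
# its path in the namespace does not change, only its physical placement.
print(choose_pool(days_since_access=90, size_bytes=5 * (1 << 30)))  # sata
```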
A Typical Hadoop HDFS Environment
(diagram: users submit MapReduce jobs to a MapReduce cluster backed by HDFS, alongside NFS filers)
• Uses disk local to each server
• Aggregates the local disk space into a single, redundant shared file system
• The open-source standard file system used with Hadoop MapReduce
Hadoop MapReduce Environment Using Elastic Storage FPO
(diagram: the same MapReduce cluster and NFS filers, now backed by Elastic Storage FPO)
• Uses disk local to each server
• Aggregates the local disk space into a single, redundant shared file system
• Designed for MapReduce workloads
• Unlike HDFS, GPFS-FPO is POSIX compliant, so data maintenance is easy (see the sketch below)
• Intended as a drop-in replacement for open-source HDFS (the IBM BigInsights product may be required)
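Because FPO presents a POSIX file system, routine data maintenance is ordinary file I/O; no HDFS-specific client or copy-in step is required. A minimal sketch, assuming a hypothetical mount point /gpfs/fpo/ingest:

```python
import os

DATA_DIR = "/gpfs/fpo/ingest"   # hypothetical FPO mount point

# Ingest is a plain POSIX append; any standard tool or script works.
os.makedirs(DATA_DIR, exist_ok=True)
with open(os.path.join(DATA_DIR, "events.csv"), "a") as f:
    f.write("2014-06-01,host01,login\n")

# The same files are immediately visible to MapReduce jobs reading through
# the Hadoop connector; inspecting them is just a directory listing.
print(os.listdir(DATA_DIR))
```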
The Vision
(diagram: Elastic Storage presents POSIX, NFS, MapReduce, and object interfaces for analytics, file storage, media, and data-ingest workloads; underneath sit solid state, spinning disk, tape, ESS, and a cloud tier (ICStore) spanning IBM Public Cloud, Amazon S3, MS Azure, and private clouds, with world-wide data distribution)
• A single namespace no matter where data resides
• Data in the best location, on the best tier (performance and cost), at the right time
• Multi-tenancy
• All in software
Architecture
Elastic Storage Cluster Models
(diagram: application nodes reach storage either directly over a TCP/IP or InfiniBand network, or as clients of NSD servers across a TCP/IP or InfiniBand RDMA network, with the NSD servers attached to a storage network)
Features
Elastic Storage Key Features (more detail)

Basics
• Distributed, journaled file system with scalable, high-performance metadata
• AIX, Linux, and Windows
• Single namespace
• Parallel file access
• Built-in heartbeat, automatic failover/failback, quorum
• Administer from any node
• Add/remove servers or disks on the fly
• Rolling upgrades
• SNMP (running on a Linux node)

Standard
• Snapshots, backup, replication
• Filesets, quotas
• Active/active dual site with synchronous replication
• Multi-cluster
• Server internal disks (FPO)
• Flash acceleration (LROC, Linux)
• File clones
• Automatic tiering (ILM), even to tape with HSM software
• Geographic asynchronous caching (AFM)
• Clustered NFS servers (cNFS, Linux) to give access beyond the Elastic Storage cluster

Advanced
• Native encryption
• Secure deletion
Elastic Storage Manages the Full Data Lifecycle Cost-Effectively
• Policy-driven automation and tiered storage management: match the cost of storage to the value of data
• Storage pools create tiers of storage: high-performance SSD, high-speed SAS drives, and high-capacity NL-SAS drives, with auto-tiering and migration from application servers through the Elastic Storage Server (or commodity hardware) down to a tape library
• Tape migration, integrated with IBM Tivoli Storage Manager (TSM) and IBM LTFS Enterprise Edition (EE):
  ‒ Elastic Storage handles all metadata processing, then hands the data to TSM and LTFS EE for storage on tape
  ‒ Data is retrieved from the external storage pool on demand, for example when an application opens a file
  ‒ Policies move data from one pool to another without changing the file's location in the directory structure
• Bottom line: cuts storage costs up to 90% (right data, right place, right time, right performance, right cost)
Flash Local Read-Only Cache (LROC)
(diagram: clients with local flash LROC SSDs in front of Elastic Storage)
• Inexpensive SSDs placed directly in client nodes
• Accelerates I/O performance up to 6x by reducing the amount of time CPUs wait for data
• Also decreases the overall load on the network, benefiting performance across the board
• Improves application performance while maintaining all the manageability benefits of shared storage
• Cache consistency ensured by standard tokens
• Data is protected by checksum and verified on read (see the sketch below)
• Elastic Storage handles the flash cache automatically, so data is transparently available to your application with very low latency and no code changes
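The sketch below models the two properties the slide highlights: token-driven consistency (cached copies are invalidated when another node writes) and checksum verification on every read. It is a conceptual toy, not the GPFS implementation.

```python
import hashlib
from typing import Optional

class LocalReadOnlyCache:
    """Toy LROC: blocks cached on local flash, verified on each read."""

    def __init__(self) -> None:
        self._blocks = {}   # block_id -> (data, sha256 digest)

    def put(self, block_id: str, data: bytes) -> None:
        self._blocks[block_id] = (data, hashlib.sha256(data).digest())

    def get(self, block_id: str) -> Optional[bytes]:
        entry = self._blocks.get(block_id)
        if entry is None:
            return None                   # miss: caller fetches from NSD servers
        data, digest = entry
        if hashlib.sha256(data).digest() != digest:
            del self._blocks[block_id]    # corrupt copy: drop and re-fetch
            return None
        return data

    def invalidate(self, block_id: str) -> None:
        """Models a token revoke: another node wrote this block."""
        self._blocks.pop(block_id, None)

cache = LocalReadOnlyCache()
cache.put("fileA:0", b"hot block served from local flash")
print(cache.get("fileA:0"))
```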
Elastic Storage: Tiering to Tape with LTFS EE
• Automatic migration to tape
• The file user does not see where a file is stored
• Scales by adding tape drives or nodes
• Load is balanced across nodes and drives
• Tapes can be exported and imported
• Redbook: IBM Linear Tape File System Enterprise Edition V1.1 Installation and Configuration Guide, SG24-8143
(diagram: users and applications access a global namespace on GPFS nodes 1 and 2, each running LTFS EE; user data flows to a TSxxxx tape library, with GPFS file systems holding user data and metadata)
File Placement Optimizer (GPFS-FPO)
Elastic Storage FPO
• Uses disk local to each server
• All nodes are both NSD servers and NSD clients
• Designed for MapReduce workloads
Elastic Storage: Advanced Storage for Hadoop

Hadoop HDFS | IBM GPFS-FPO advantages
NameNode is a single point of failure | No single point of failure; distributed metadata
Large block sizes: poor support for small files | Variable block sizes, suited to multiple types of data and data-access patterns
Non-POSIX file system with obscure commands | POSIX file system: easy to use and manage
Difficult to ingest data; special tools required | Policy-based data ingest
Single-purpose: Hadoop MapReduce only | Versatile and multi-purpose
Not recommended for critical data | Enterprise-class advanced storage features
OpenStack
OpenStack Delivers a Massively Scalable Cloud Operating System
OpenStack mission: to produce the ubiquitous open-source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable.
OpenStack Key Components
Horizon, Nova, Cinder, Swift, Neutron, Glance, Keystone
OpenStack GPFS Cinder Driver
• The OpenStack Havana release includes a GPFS Cinder driver, giving architects access to the features and capabilities of the industry's leading enterprise scale-out software-defined storage
• With OpenStack on GPFS, all nodes see all data: copying data between services (for example, from Glance to Cinder) is minimized or eliminated, speeding instance creation and conserving storage space
• Rich set of data management and information lifecycle features:
  ‒ Volume placement: on GPFS storage pools, or FPO-based placement
  ‒ Resilience: per-volume replication level; DIO volumes
  ‒ Storage migration: transparent or user-directed migration of volumes between GPFS storage pools, GPFS nodes, or other Cinder back ends
  ‒ Glance integration: convert a volume to an image, or an image to a volume, through a copy-on-write (COW) mechanism, giving fast instance provisioning and capture (see the configuration sketch below)
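For illustration, the snippet below writes the handful of cinder.conf options the Havana-era GPFS driver used. The paths and values are placeholders, and the option names should be verified against your OpenStack release.

```python
import configparser

cfg = configparser.ConfigParser()
cfg["DEFAULT"] = {
    # Driver path as shipped in the Havana timeframe; later releases
    # moved it under cinder.volume.drivers.ibm.
    "volume_driver": "cinder.volume.drivers.gpfs.GPFSDriver",
    "gpfs_mount_point_base": "/gpfs/openstack/cinder/volumes",  # placeholder path
    # Pointing the driver at the Glance image store on the same GPFS
    # file system enables copy-on-write image-to-volume conversion:
    "gpfs_images_dir": "/gpfs/openstack/glance/images",         # placeholder path
    "gpfs_images_share_mode": "copy_on_write",
    "gpfs_sparse_volumes": "True",
}

with open("cinder.conf.sample", "w") as f:
    cfg.write(f)
```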
Competition
IBM GPFS vs. Competitors

Feature | IBM GPFS | Lustre | EMC Isilon | IBRIX Fusion | HDFS | MapR
POSIX interface | Yes | Yes | Yes | Yes | No | Yes
Multi-OS support | Yes | Linux only | N/A | No | No | No
Hadoop FS API or location-aware connector | Yes | No | Yes | Yes | Yes |
Lifecycle management, tape archival | Yes | No | No | No | No |
Global namespace | Yes | No | Yes | No | |
Distributed metadata | Yes | No | Yes | Yes | |
Expand capacity online | Yes | No | | | |
WAN caching / replication | Yes | No | No | No | |
File system snapshots | Yes | | | | |
Quotas | Yes | | | | |
Open source | No | Yes | No | | Yes | No
Commercial support | Yes | Yes (Oracle, Cray, Bull, SGI and others) | Yes | Yes (HP) | Yes (Cloudera, IBM and others) | Yes
Elastic Storage – Editions
Elastic Storage: New Pricing Structure
• Socket-based licensing, with a Server or Client license for each socket: simpler, no more PVUs
• Express Edition: gpfs.base (no ILM, AFM, or cNFS), gpfs.docs, gpfs.gpl, gpfs.msg, gpfs.gskit
• Standard Edition: adds gpfs.ext
• Advanced Edition: adds gpfs.crypto
• New platforms: zLinux, Ubuntu

Feature | Express Edition | Standard Edition | Advanced Edition
Basic GPFS functionality | Yes | Yes | Yes
ILM: storage pools, policy, mmbackup | | Yes | Yes
Active File Management (AFM) | | Yes | Yes
Clustered NFS (cNFS) | | Yes | Yes
Encryption | | | Yes
Elastic Storage Cluster Models and Licensing
(diagram: the cluster models from earlier, annotated with license types over TCP/IP or InfiniBand networks: application nodes carry Client licenses; NSD servers carry Server licenses; shared-nothing nodes carry FPO or Server licenses)
Elastic Storage Server
Elastic Storage Server: replaces a specialized hardware controller with software
• Delivers extreme data integrity
  ‒ 2- and 3-fault-tolerant erasure codes
  ‒ End-to-end checksum
  ‒ Protection against lost writes
  ‒ Fastest rebuild times, using declustered RAID
• Breakthrough performance
  ‒ Declustered RAID reduces application load during rebuilds (see the arithmetic sketch below)
  ‒ Up to 3x lower overhead to applications
  ‒ Built-in SSDs and NVRAM for write performance
  ‒ Faster than alternatives today, and tomorrow!
• Lowers TCO
  ‒ 3 years maintenance and support
  ‒ General-purpose servers
  ‒ Off-the-shelf SBODs
  ‒ Standardized in-band SES management
  ‒ Standard Linux
  ‒ Modular upgrades
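Why declustered RAID rebuilds so much faster: a conventional array funnels the whole rebuild through one spare drive, while a declustered array spreads the work across every surviving drive. The idealized arithmetic below uses assumed but plausible drive figures; none of the numbers come from the slide.

```python
drive_tb = 4          # capacity to reconstruct after a single drive failure
drive_mb_s = 150      # sustained throughput of one NL-SAS drive (assumed)
array_drives = 58     # drives in one declustered array (illustrative)

# Conventional RAID: the lone spare drive's write rate is the bottleneck.
conventional_hours = drive_tb * 1e6 / drive_mb_s / 3600

# Declustered RAID: rebuild I/O is spread across the surviving drives,
# so each contributes only a small slice of its bandwidth.
declustered_hours = conventional_hours / (array_drives - 1)

print(f"conventional rebuild: ~{conventional_hours:.1f} h")   # ~7.4 h
print(f"declustered rebuild:  ~{declustered_hours:.2f} h")    # ~0.13 h
```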
Elastic Storage Server GL Models (machine type 5146)
• Model GL2, analytics focused: 2 enclosures, 12U; 116 NL-SAS + 2 SSD; 5+ GB/s
• Model GL4, analytics and cloud: 4 enclosures, 20U; 232 NL-SAS + 2 SSD; 10+ GB/s
• Model GL6, petascale storage: 6 enclosures, 28U; 348 NL-SAS + 2 SSD; 12+ GB/s

Common to all GL models:
• Power S822L servers, 20 cores each; 1818-80e expansion chassis
• Red Hat 7; graphical user interface; management server and HMC
• Elastic Storage software with Elastic Storage Native RAID
• xCAT or Platform Cluster Manager (optional)
• 10 Gb or 40 Gb Ethernet, or FDR InfiniBand
• From 116 to 348 spinning disks; 4U, 60-drive storage enclosures with 2 TB or 4 TB drives
• 3 years maintenance; building-block approach to growth
• High-capacity storage for analytics and cloud serving: a client-ready petabyte in a single rack!
Elastic Storage Server GS Models (machine type 5146)
• Model GS1: 24 SSDs; 6 GB/s
• Model GS2: 46 SAS + 2 SSD, or 48 SSDs; 2 GB/s (SAS) or 12 GB/s (SSD)
• Model GS4: 94 SAS + 2 SSD, or 96 SSDs; 5 GB/s (SAS) or 16 GB/s (SSD)
• Model GS6: 142 SAS + 2 SSD; 7 GB/s

• Smaller configurations for high-velocity ingest, or a lower-cost entry point
• Uses 2U, 24-drive storage enclosures (FC 5887) with 400 GB or 800 GB SSDs, or 1.2 TB SAS drives
• Highest "performance per U" delivered to clients
• Deployable alone, or as the "Platinum" tier of an ESS configuration

Common to all GS models:
• Power S822L servers, 20 cores each; Power expansion chassis
• Red Hat 7; graphical user interface; management server and HMC
• Elastic Storage software with Elastic Storage Native RAID
• xCAT or Platform Cluster Manager (optional)
• 10 Gb or 40 Gb Ethernet, or FDR InfiniBand
• 3 years maintenance; building-block approach to growth
Elastic Storage Ensures End-to-End Data Availability, Reliability, and Integrity
• GPFS Elastic Storage Native RAID (declustered RAID)
  ‒ Data and parity stripes are uniformly partitioned and distributed across the array
  ‒ Rebuilds that take days on other systems take minutes on Elastic Storage
• 2-fault and 3-fault tolerance
  ‒ Reed-Solomon parity encoding, 2-fault or 3-fault tolerant (see the overhead arithmetic below)
  ‒ 3- or 4-way mirroring
• End-to-end checksum and dropped-write detection
  ‒ From the disk surface to the Elastic Storage user/client
  ‒ Detects and corrects off-track and lost/dropped disk writes
• Asynchronous error diagnosis while affected I/Os continue
  ‒ If a media error: verify and restore if possible
  ‒ If a path problem: attempt alternate paths
• Supports live replacement of disks
  ‒ I/O operations continue for tracks whose disks are removed during service
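The capacity cost of these redundancy schemes is simple arithmetic. Treating the stripes as 8 data strips plus 2 or 3 parity strips (the widths commonly described for GPFS Native RAID; an assumption here, since the slide does not state them) shows why erasure coding is far cheaper than mirroring at equal fault tolerance:

```python
def overhead(data_strips: int, parity_strips: int) -> float:
    """Fraction of raw capacity consumed by redundancy."""
    return parity_strips / (data_strips + parity_strips)

# Stripe widths below are assumed (8+2p / 8+3p), not taken from the slide.
schemes = {
    "8+2p Reed-Solomon (2-fault tolerant)": overhead(8, 2),
    "8+3p Reed-Solomon (3-fault tolerant)": overhead(8, 3),
    "3-way mirroring   (2-fault tolerant)": overhead(1, 2),
    "4-way mirroring   (3-fault tolerant)": overhead(1, 3),
}

for name, frac in schemes.items():
    print(f"{name}: {frac:.0%} of raw capacity")
# Reed-Solomon: 20% / 27%; mirroring: 67% / 75%
```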
