DATA VIRTUALIZATION PACKED LUNCH WEBINAR SERIES Sessions Covering Key Data Integration Challenges Solved with Data Virtualization
Self-Service Analytics with Guardrails
Saptarshi Sengupta, Principal Product Marketing Manager, Denodo
Ed Robbins, Sales Engineer, Denodo
Agenda
1. Tackling Oil Price Slump @ Anadarko
2. Data Virtualization for Self-Service Analytics
3. Decision Support Initiative @ Indiana University
4. Product Demonstration
5. Q&A
6. Next Steps
Tackling Oil Price Slump @ Anadarko
Self-Service Data Delivery Environment
Scope
• Shared/managed environment for data producers and consumers
• Corporate and non-corporate data source mash-ups
• Responsive delivery of data products with real-time data access
• Bridged data environments, across technology and business domains
Implementation
• Branded data virtualization implementation using the Denodo Platform
• Included:
  • Governance (e.g. data request process)
  • Data Catalog (for end-users)
  • Drivers (e.g. for BI & analytics tool integration)
  • VDP Client (for data engineers and analysts)
  • VDP Server (with optimized data sources)
To create and use data services for analytics, reports, and apps
Data Architecture at Anadarko
Data Sources IoT/Edge Sensor Data Machine Data Internet Data Images and Video Enterprise Structured Data Sources Unstructured Content Cloud FTP Databases Web Services Processing Events (real-time) Virtualize (real-time) Streams (real-time) Change Data Capture (real-time) ETL (batch) Data Ingestion Streams (real-time) Change Data Capture (real-time) ETL (batch) Data Integration Data Lake Batch DW NoSQL Hadoop YARN/Workload Management HDFS Data Environment Data Compute CPU/GPU/TPU Data Cache In-Memory Data Warehouse EDW In-Memory Data Mart ODS Historian Data Virtualization Federation Abstraction Data Services Optimization Security Governance Analytics Predictive Analytics Statistical Analytics Text Analytics Data Mining Data Insights Data Access Data Discovery Self-Service Search Applications Real-time Decision Management Alerts Reporting Dashboards/Ad-hoc Canned Metadata Management, Data Governance, Data Security
Data Virtualization for Self-Service Analytics
IT and Business Dilemma
IT architecture is unmanageable and brittle because:
• IT focuses on data collection and storage
• Business focuses on data visualization and analysis
• No one focuses on data delivery, so teams create 100s to 1,000s of brittle direct connections and replicate large volumes of data
Diagram labels: Inventory System (MS SQL Server) Product Catalog (Web Service - SOAP) BI / Reporting JDBC, ODBC, ADO .NET Web / Mobile WS - REST JSON, XML, HTML, RSS MS Excel Denodo Excel Add-in Log files (.txt/.log files) CRM (MySQL) Billing System (Web Service - REST) Big Data, Cloud (Hadoop, Web) Product Data (CSV) ETL Portals JSR168 / 286, MS Web Parts SOA, Middleware, Enterprise Apps WS - SOAP Java API Customer Voice (Internet, Unstruc)
IT and Business Going in Different Directions
BI Benchmark Report
High Cost: IT spends ~1% of Revenue on ETL & Storage
▪ 75% of data stored is not used – large £ wasted
▪ 90% of all queries are for Current data
▪ not available from traditional EDW or data lakes
Long Time: Months to Build ETL Process & DataMarts
▪ 2+ Months to add new data source to an EDW
▪ 1 – 2 Months to build complex dashboard or report
IT Slowing Down
By 2020
▪ 500% growth in Data & Device Avalanche
▪ Due to lack of data accessibility today < 0.5% of all data is ever analyzed and used
Source:
Business Speeding Up
To remain competitive, by 2020, Business Decision Speed & Analysis Sophistication Requires 300% Increase
Source:
The Promise of Self-Service Initiatives
• Let business users access the data that they need and stop IT being a bottleneck
• That’s the vision as sold by many BI tool vendors
  • i.e. give me the tools and access to the data and stand back ☺
Self-Service Initiatives
• First wave of self-service initiatives was driven by ‘shadow IT’ and spreadsheets
• More recently using desktop analytics tools
  • Tableau, Qlik, Trifacta, …
• Do these initiatives really work in practice?
Self-Service Issues…
• Tools are designed for data analysts (or power users)
  • Users who are happy finding, wrangling, cleansing data
  • Creating calculations, aggregations within the data
• What about the other business users?
  • People who don’t want to spend hours fighting the spreadsheet…
• Spreadsheets and desktop tools are isolated
  • Sitting on one desktop or shared via email
• Ultimately, can you trust the numbers?
  • Where did the data come from? How has it been manipulated?
Gartner predicts that by 2018 most business users will have access to self-service tools, but that only one in 10 initiatives will be sufficiently well-governed to avoid data inconsistencies that negatively impact the business.
Rob van der Meulen, Gartner
Building a Platform for Self-Service Analytics
Self-Service with Guardrails
• Don’t build just for the ‘data cowboys’
• Create pre-integrated, pre-calculated data services
  • Saves the user having to do this themselves
  • Ensures consistency of calculations, etc.
• But allow the cowboys to ‘roam and wrangle’
  • Even the cowboys can only access ‘approved’ data sources
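To make the guardrail idea concrete, here is a minimal Python sketch of the two rules on this slide: every user is limited to approved sources, and only power users (the 'data cowboys') may query raw data rather than pre-calculated services. The source names, the `User` class, and `check_access` are hypothetical illustrations, not part of the Denodo Platform or its API.

```python
from dataclasses import dataclass

# Hypothetical list of approved sources; the names are illustrative only.
APPROVED_SOURCES = {"crm", "billing", "inventory"}


@dataclass(frozen=True)
class User:
    name: str
    is_power_user: bool  # a 'data cowboy' in the slide's terms


class GuardrailError(PermissionError):
    """Raised when a request falls outside the guardrails."""


def check_access(user: User, source: str, raw: bool) -> bool:
    """Return True if the request stays inside the guardrails."""
    # Rule 1: nobody, not even a power user, may touch unapproved sources.
    if source not in APPROVED_SOURCES:
        raise GuardrailError(f"{source!r} is not an approved data source")
    # Rule 2: regular users get pre-calculated services, not raw data.
    if raw and not user.is_power_user:
        raise GuardrailError(
            f"{user.name} may only use pre-calculated data services"
        )
    return True
```

In this sketch a power user can wrangle raw data from `crm`, a regular user can read the pre-calculated service built on it, and both are refused a source that is not on the approved list.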
A Few Simple Rules…
1. Users come in all shapes and sizes
  • Who are they? What data do they need? What flexibility do they want?
2. Connect to all of the data (but start with the most important)
  • What data is needed by the users? Open access or pre-aggregated and pre-calculated?
3. Use the language that the business understands
  • e.g. to Finance it’s an ‘account’, but to Customer Care it’s a ‘customer’. Don’t force people to change terminology…support multiple semantic mappings (to the language of the consumer)
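Rule 3 can be illustrated with a small Python sketch in which one shared row is presented under different column names depending on the consumer. The canonical field names, the sample row, and the `present` function are assumptions made for illustration; they do not reflect how the Denodo Platform implements semantic mappings.

```python
# One row from a shared, canonical view (field names are hypothetical).
CANONICAL_ROW = {"party_id": 1042, "party_name": "Acme Corp", "balance": 1250.0}

# Per-consumer vocabularies: canonical name -> the term that team uses.
SEMANTIC_MAPPINGS = {
    "finance": {"party_id": "account_id", "party_name": "account_name"},
    "customer_care": {"party_id": "customer_id", "party_name": "customer_name"},
}


def present(row: dict, consumer: str) -> dict:
    """Rename canonical columns into the consumer's own terminology."""
    mapping = SEMANTIC_MAPPINGS[consumer]
    # Columns without a consumer-specific term keep their canonical name.
    return {mapping.get(col, col): value for col, value in row.items()}
```

Calling `present(CANONICAL_ROW, "finance")` yields `account_id` and `account_name`, while `present(CANONICAL_ROW, "customer_care")` yields `customer_id` and `customer_name`. The underlying data is the same in both cases, so neither team has to change its terminology.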
BI and Analytics Reference Architecture
IT: Flexible Source Architecture
• IT can now move at slower speed w/o affecting business
Business: Flexible Tool Choice
• Business can now make faster & more sophisticated decisions as all data accessible by any tool of choice
Decision Support Initiative @ Indiana University
Decision Support Initiative at IU: Indiana University Self-Service Portal for DSI
Decision Support Initiative at IU: Architecture Diagram
Product Demonstration: Accelerate Self-Service Analytics with a Universal Semantic Model
Edwin Robbins, Sales Engineer, Denodo
The true potential of Self-Service Analytics
• Companies have always been challenged to deliver data to their end-users faster
  • Business users are waiting on BI Developers to deliver dashboards
  • BI Developers are waiting on ETL to load data in a warehouse
  • Data Scientists need access to all data and they want it in the (raw) detail format
• The typical approach to this challenge is to build a Data Lake
  • Often this results in a vast data store with no overriding metadata
  • Cryptic column names, no defined relationships between different Data Sets
• Solution: Build a Virtual Data Lake with Denodo
  • Faster and cheaper to deploy along with enterprise level metadata defining data relationships
  • Allow end users true self-service analytics…but with guard rails
Demo
Summary: Key Takeaways
• Data Virtualization provides a common and consistent view of data across the organization
  • No more arguments about data sources and veracity ☺
• Data Virtualization provides a platform for self-service with guardrails
  • Supports both ‘data cowboys’ (with limits) and regular business users
  • Accelerates self-service initiatives (no more analysis silos) while retaining control and governance
Q&A
Next steps
Download Denodo Express: www.denodoexpress.com
Access Denodo Platform in the Cloud! 30 day FREE trial available!
Denodo for Azure: www.denodo.com/TrialAzure/PackedLunch
Denodo for AWS: www.denodo.com/TrialAWS/PackedLunch
Next session
Data Virtualization – An Introduction
Thursday, July 19, 2017 | 11:00am PT | 2:00pm ET
Paul Moxon, VP Data Architectures & Chief Evangelist, Denodo
Thank you!
© Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
