Analytics with R in SQL Server 2016 • Chennai SQL Server User Group
Hariharan • Lead Consultant, Your SQL Man (I) Pvt Ltd • Microsoft Certified Trainer • Microsoft Certified Solution Expert: Business Intelligence & Data Management and Analytics • SAP HANA & Business Objects • Active Speaker in DAGEOP DATA DAY • Blog: http://dataap.org/author/hariharanr/ • Twitter: @imhariharanr • LinkedIn: hariharan-r-12635640
Topics • Analytics • Introduction to R • Challenges • R in SQL Server 2016 • Advanced Analytics in R • Visualization with R
Analytics • Descriptive: What happened? • Diagnostic: Why did it happen? • Predictive: What will happen? • Prescriptive: What should I do?
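To make the difference between two of the analytics types on this slide concrete, here is a minimal sketch. It is written in Python rather than R purely for illustration, and the monthly sales figures and the helper name `fit_line` are invented for this example; they do not come from the talk. The descriptive step summarizes what has already been observed, while the predictive step fits a least-squares line and extrapolates one month ahead.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented purely for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

# Descriptive analytics: what happened? Summarize the observed data.
average_sales = mean(sales)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analytics: what will happen? Extrapolate the fitted trend.
slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

print(f"Average sales so far: {average_sales}")
print(f"Forecast for month 7: {forecast_month_7}")
```

Diagnostic and prescriptive analytics build on the same data but ask why the trend exists and what action to take, which usually needs domain context beyond a short sketch.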
Introduction to R • It is open source • A statistical programming language • A data visualization tool • It has a very large community with 2.5+ million users • Scalable to big data • Rich application & platform integration
Introduction to R • [Chart: R Packages in CRAN, 2011 to 2016; vertical axis from 0 to 4000]
Demo
Challenges in R • Data Movement: moving data from the database to R • Operationalization: how do I call an R script from my application? • Scale and Performance: R runs single-threaded and only accommodates datasets that fit into available memory
SQL Server with R • Data Movement: reduce or eliminate data movement with in-database analytics • Operationalization: T-SQL stored procedures • Scale and Performance: in-memory and columnstore indexes; leverage RevoScaleR to support large datasets and parallel algorithms
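One general way to handle a dataset that does not fit in memory is to read it in chunks, compute a small partial result per chunk, and combine the partial results at the end. The sketch below is not RevoScaleR code and does not claim to reproduce its internals; it is a plain-Python illustration of the chunk-and-combine idea, and the names `read_in_chunks`, `partial_stats`, and `chunked_mean` are hypothetical.

```python
from itertools import islice


def read_in_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def partial_stats(chunk):
    """Return (count, sum) for one chunk; cheap to combine across chunks."""
    return len(chunk), sum(chunk)


def chunked_mean(rows, chunk_size=1000):
    """Compute the mean while holding only one chunk in memory at a time."""
    total_count = 0
    total_sum = 0
    for chunk in read_in_chunks(rows, chunk_size):
        count, subtotal = partial_stats(chunk)
        total_count += count
        total_sum += subtotal
    if total_count == 0:
        raise ValueError("cannot compute the mean of an empty dataset")
    return total_sum / total_count


# range() is lazy, so the full sequence is never materialized at once.
result = chunked_mean(range(1, 100_001), chunk_size=10_000)
print(result)
```

Because each chunk's partial result is independent of the others, the same structure also lends itself to computing chunks in parallel and merging the partial results afterwards.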
SQL Server with R • It is a new feature in SQL Server 2016 • New workload • Integrates the highly popular R language for enterprise customers • Intelligent & predictive applications
R Components
R Components
DeployR • RESTful APIs for easy integration from Java, JavaScript, .NET • Enterprise authentication & security • Horizontal scaling
ConnectR • High-speed & direct connectors • Available for: high-performance XDF; SAS, SPSS, delimited & fixed-format text data files; Hadoop HDFS (text & XDF); Teradata Database & Aster; EDWs and ADWs; ODBC
DevelopR • Develop R using familiar tools • RTVS • RStudio • R Client
ScaleR • Ready-to-use high-performance big data analytics • Fully parallelized analytics • Data prep & data distillation • Descriptive statistics & statistical tests • Range of predictive functions • Wide data sets supported: thousands of variables
DistributedR • Distributed computing framework • Delivers cross-platform portability
CRAN • Open source R interpreter • Freely available, huge range of R algorithms (packages) • Huge community of users
Microsoft R Open • Based on open source R • High-performance math library to speed up linear algebra functions • Checkpoint package to easily share R code and replicate results using specific R package versions
Tools • Microsoft R Client • Microsoft R Open • R Tools for Visual Studio (RTVS) • RStudio • R GUI
End to End Scenario • Roles: Developer, DBA, Data Engineer, Data Scientist • Activities: Data exploration and predictive modeling; Operationalizing the R code; Managing my server; Authoring workflows
SQL Server with R
Demo
Advanced Analytics - R & Data Optimization • SQL Server • R Client • ScaleR Library • Compute Context • Parallel Processing • Algorithm Parameters
Flow
Visualization http://www.r-graph-gallery.com/all-graphs/
Demo
Thank You
