PERFORMANCE UPDATE: WHEN APACHE ORC MET APACHE SPARK Dongjoon Hyun - dhyun@hortonworks.com Owen O'Malley - owen@hortonworks.com @owen_omalley
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda Performance Update Spark – Apache ORC 1.4 Integration Benchmark Overview Results Roadmap
ORC History
 Originally released as part of Hive
– Released in Hive 0.11.0 (2013-05-16)
– Included in every Hive release before 2.3.0 (2017-07-17)
 Factored out of Hive
– Improve integration with other tools
– Shrink the size of the dependencies
– Release faster than Hive
– Added a C++ reader and a new C++ writer
Integration History
 Before Apache ORC
– Hive 1.2.1 (2015-06-27)
 SPARK-2883 Since Spark 1.4, Hive ORC is used.
 After Apache ORC
– v1.0.0 (2016-01-25)
– v1.1.0 (2016-06-10)
– v1.2.0 (2016-08-25)
– v1.3.0 (2017-01-23)
– v1.4.0 (2017-05-08)
 SPARK-21422 For Spark 2.3, the Apache ORC dependency is added and used.
Integration Design (in HDP 2.6.3 with Spark 2.2)
 Switch between the old and new formats by configuration
 [Class-hierarchy diagram] FileFormat implementations:
– o.a.s.sql.execution.datasources: TextBasedFileFormat (CSVFileFormat, JsonFileFormat, TextFileFormat), ParquetFileFormat, OrcFileFormat, HiveFileFormat
– o.a.s.ml.source.libsvm: LibSVMFileFormat
– o.a.s.sql.hive.orc: OrcFileFormat
 Old OrcFileFormat from Hive 1.2.1 (SPARK-2883)
 New OrcFileFormat with ORC 1.4.0 (SPARK-21422, SPARK-20682, SPARK-20728)
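The configuration switch above can be sketched in Spark SQL; a minimal illustration only, using the HDP 2.6.3 property names listed later in this deck (`spark.sql.hive.convertMetastoreOrc`, `spark.sql.orc.enabled`), with a hypothetical table name and location:

```sql
-- Illustrative sketch: choose the ORC implementation by configuration.
SET spark.sql.hive.convertMetastoreOrc=true;  -- convert Hive metastore ORC tables to the data source path
SET spark.sql.orc.enabled=true;               -- use the new ORC 1.4-based OrcFileFormat
-- `sales` and its location are hypothetical.
CREATE TABLE sales USING ORC LOCATION '/data/sales';
```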
Benefit in Apache Spark
 Speed
– Use Spark's ColumnarBatch and ORC's RowBatch together more seamlessly
 Stability
– Apache ORC 1.4.0 has many fixes, and we can rely on the ORC community more.
 Usability
– Users can use ORC data sources without the hive module, i.e., -Phive.
 Maintainability
– Reduce the Hive dependency; old legacy code can be removed later.
Faster and More Scalable
 [Chart] Single Column Scan from Wide Tables (1M rows with all BIGINT columns): time (ms) vs. number of columns (100–300); NEW is faster than OLD.
 [Chart] Vectorized Read (15M rows in a single-column table): time (ms) per type (TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE); NEW is faster than OLD.
Support Matrix
 HDP 2.6.3 is going to add a new, faster, and stable ORC file format for ORC tables, with a subset of the limitations of Apache Spark 2.2.
 The following are not yet supported:
– Zero-byte ORC files
– Schema evolution
• Adding columns at the end
• Changing types and deleting columns
 See the full JIRA issue list on the next slides.
Done Tickets
 SPARK-20901 Feature parity for ORC with Parquet
– The parent ticket of all ORC issues
 Done
– SPARK-20566 ColumnVector should support `appendFloats` for array
– SPARK-21422 Depend on Apache ORC 1.4.0
– SPARK-21831 Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite
– SPARK-21839 Support SQL config for ORC compression
– SPARK-21884 Fix StackOverflowError on MetadataOnlyQuery
– SPARK-21912 ORC/Parquet table should not create invalid column names
On-Going Tickets
 SPARK-20682 Support a new faster ORC data source based on Apache ORC
 SPARK-20728 Make ORCFileFormat configurable between sql/hive and sql/core
 SPARK-16060 Vectorized Orc reader
 SPARK-21791 ORC should support column names with dot
 SPARK-21787 Support for pushing down filters for DATE types in ORC
 SPARK-19809 Zero byte ORC file support
 SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc
To-do Tickets
 Configuration
– SPARK-21783 Turn on ORC filter push-down by default
 Read
– SPARK-11412 Support merge schema for ORC
– SPARK-16628 OrcConversions should not convert an ORC table
 Alter
– SPARK-21929 Support `ALTER TABLE ADD COLUMNS(..)` for ORC data source
– SPARK-18355 Spark SQL fails to read from an ORC table with a new column
 Write
– SPARK-12417 Orc bloom filter options are not propagated during file write
Benchmark Overview
• Objective: compare the performance of new ORC vs. old ORC in Spark
• Using TPC-DS at 1 TB and 10 TB scales
Enabling New ORC in Spark
 spark.sql.hive.convertMetastoreOrc=true
 spark.sql.orc.enabled=true
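These two properties can also be set cluster-wide; a minimal spark-defaults.conf sketch, using only the values on this slide:

```properties
# Enable the new Apache ORC 1.4-based reader (HDP 2.6.3 / Spark 2.2)
spark.sql.hive.convertMetastoreOrc  true
spark.sql.orc.enabled               true
```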
Software/Hardware/Cluster Details
 Software
– Internal branch of Spark 2.2 (in HDP)
 Node configuration
– 32 CPUs, Intel E5-2640, 2.00 GHz
– 256 GB RAM
– 6 SATA disks (~4 TB, 7200 rpm)
– 10 Gbps network card
 Cluster
– 10 nodes used for 1 TB
– 15 nodes used for 10 TB
– DOUBLE type used in the data
– TPC-DS (1.4) queries in Spark used for benchmarking
Spark Tuning

Spark Parameter Name                   | Default Value             | Tuned Value
spark.sql.hive.convertMetastoreOrc     | false                     | true
spark.sql.orc.enabled                  | false                     | true
spark.sql.orc.filterPushdown           | false                     | true
spark.sql.statistics.fallBackToHdfs    | false                     | true
spark.sql.autoBroadcastJoinThreshold   | 10L * 1024 * 1024 (10 MB) | 26214400 (25 MB)
spark.shuffle.io.numConnectionsPerPeer | 1                         | 10
spark.io.compression.lz4.blockSize     | 32k                       | 128kb
spark.sql.shuffle.partitions           | 200                       | 300
spark.network.timeout                  | 120s                      | 600s
spark.locality.wait                    | 3s                        | 0s

*Hive Parameter Name (hive-site.xml)   | Default Value             | Tuned Value
hive.exec.max.dynamic.partitions       | 1000                      | 10000
hive.exec.dynamic.partition.mode       | strict                    | nonstrict

*mainly for data ingestion
*all tables analyzed with `noscan` option for getting basic stats
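The `noscan` note above refers to Spark SQL's ANALYZE command: with NOSCAN it records basic statistics (number of files, total size) without reading the data, so size-based planning (e.g. broadcast joins, spark.sql.statistics.fallBackToHdfs) has table sizes available. A sketch, using `store_sales` (a TPC-DS table) purely as an example:

```sql
-- Collect basic table statistics without scanning the data.
ANALYZE TABLE store_sales COMPUTE STATISTICS NOSCAN;
```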
Results: 10 TB Scale (15 executors, 170 GB, 25 cores)
* Total runtime of 74 queries in TPC-DS
– New ORC is > 2x better than old ORC
Results: 10 TB Scale (15 executors, 170 GB, 25 cores), New ORC vs. Old ORC
– New ORC consistently and significantly outperforms old ORC across all queries.
Results: 1 TB Scale (10 executors, 170 GB, 25 cores)
* Total runtime of 97 queries in TPC-DS @ 1 TB scale
• New ORC is ~2x faster than old ORC
* Profiling shows high CPU usage in libzip.so during ORC reads; ORC-175 (Intel ISA-L, the Intelligent Storage Acceleration Library, for inflate) could improve this further.
Comparison of New ORC vs Old ORC vs Parquet (10 TB Scale)
* Total runtime of 74 queries in TPC-DS
* 10 TB scale (15 executors, 170 GB, 25 cores)
RoadMap
 Tune default Ambari configs for Spark based on these benchmark results, for a better out-of-the-box experience for end users
 Additional enhancements, such as parallel reading of ORC footers
 Include ORC-175 (Intel ISA-L) when complete
 Contribute the fixes back to the community
Thanks!
 Questions
– ORC: dev@orc.apache.org
– Spark: dev@spark.apache.org
– Benchmarks: dhyun@hortonworks.com
Appendix: Results: 10 TB Scale (15 executors, 170 GB, 25 cores), Comparison of New ORC vs Parquet
