Big problems with big data – Hadoop interfaces security Jakub Kaluzny ZeroNights, Moscow, 2015 Bad!
whoami Sr. IT Security Consultant at SecuRing • Consulting all phases of development • penetration tests • high-risk applications and systems Researcher • Hadoop, FOREX, MFP printers, proprietary network protocols
Agenda • Big data nonsense • Crash course on hacking Hadoop installations • Ways to protect big data environments • Expect some CVEs
Results summary no account standard user admin user admin privileges data access
WHAT IS HADOOP? Know your target
Normal database Users Roles Data Model
Normal database architecture http://hackaday.com/2014/04/04/sql-injection-fools-speed-traps-and-clears-your-record/
Still normal database scenario CWE-xxx: SQL Injection through license plate http://hackaday.com/2014/04/04/sql-injection-fools-speed-traps-and-clears-your-record/ http://hococonnect.blogspot.com/2015/06/red-light-cameras-in-columbia.html http://8z4.net/images/ocr-technology
Normal database injection points
Normal database Users Roles Data Model Clear rules Clear target
Anecdote • User db, a lot of clients • Critical banking data, one supplier • Only one common table Q: Why don’t you split it into 2 dbs with a db link? A: Too much effort and we want to have fast statistics from all data.
What is Hadoop? https://www.flickr.com/photos/photonquantique/2596581870/ http://fiveprime.org/blackmagic.cgi?id=7007203773
Hadoop architecture schema
More on Hadoop
Hadoop injection points
Hadoop scenario https://en.wikipedia.org/wiki/Moneygami https://www.flickr.com/photos/mattimattila/8349565473 http://bigdataanalyticsnews.com/tag/hortonworks/
21 PB of storage in a single HDFS cluster 2000 machines 12 TB per machine (a few machines have 24 TB each) 1200 machines with 8 cores each + 800 machines with 16 cores each 32 GB of RAM per machine 15 map-reduce tasks per machine What is a lot of data?
Our latest assessment: • 32 machines, 8 cores each • 24TB per machine • 64 GB of RAM per machine • Almost 1 PB disk space and 2TB of RAM What is a lot of data? http://mrrobot.wikia.com/wiki/E_Corp
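The totals on this slide follow from simple multiplication. A minimal sketch of that arithmetic, using only the per-machine figures quoted above (the function name is illustrative):

```python
# Per-machine figures from the "Our latest assessment" slide.
MACHINES = 32
DISK_TB_PER_MACHINE = 24
RAM_GB_PER_MACHINE = 64


def cluster_totals(machines, disk_tb, ram_gb):
    """Return (total disk in TB, total RAM in GB) for a homogeneous cluster."""
    return machines * disk_tb, machines * ram_gb


total_disk_tb, total_ram_gb = cluster_totals(
    MACHINES, DISK_TB_PER_MACHINE, RAM_GB_PER_MACHINE
)
print(f"disk: {total_disk_tb} TB, RAM: {total_ram_gb} GB")
```

That gives 768 TB of raw disk, which the slide rounds to "almost 1 PB", and 2048 GB of RAM, i.e. 2 TB.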
Attacker perspective https://plus.google.com/+Magiccardtrickszonetips
RISK ANALYSIS Know your threats
Who How What Risk analysis
Who? Business perspective: competitor, script-kiddies, APT Technical perspective: External attacker • Anonymous • Ex-employee Insider • Employee (with some rights in Hadoop): user, admin • Infected machine, APT
Who How What Risk analysis
Full compromise
Data safety vs. data security
Q: What will be stored? A: „We do not know what data will be stored!” Typical bank scenario Bigdata analytic says: „People who bought a dashcam are more likely to take a loan for a new car in the next month” For what? All transaction data All sales data All client data http://thewondrous.com/julia-gunthel-worlds-most-flexible-secretary/ https://www.reddit.com/r/gifs/comments/37aara/calculations_intensify/
For what? Data theft
Other Privilege escalation • Authentication bypass Abuse • DoS • Data tampering
Who How What Risk analysis
How? https://en.wikipedia.org/wiki/Dowsing#Rods
WHAT HADOOP REALLY IS under sales-magic-cloud-big-data cover
Typical architecture http://thebigdatablog.weebly.com/blog/the-hadoop-ecosystem-overview
Apache Hue http://techbusinessintelligence.blogspot.com/2014/11/tableau-software-cloudera-hadoop.html
Hadoop injection points Differs much amongst distros
INTERFACES
Hadoop Distros specifics Admin ifaces external ifaces User ifaces Interfaces
OUR STORY WITH BIG DATA ASSESSMENT a.k.a. crash course on hacking big data environments
Hadoop Distros specifics Admin ifaces external ifaces User ifaces Interfaces
USER INTERFACES for employees and applications
User interfaces Hadoop Distros specifics Admin ifaces external ifaces User ifaces
User interfaces Apache Hue • Pig, Hive, Impala, HBase, Zookeeper, Mahout, Oozie Other • Tez, Solr, Slider, Spark, Phoenix, Accumulo, Storm H D A E U
Is Hue an internal interface? H D A E U http://9gag.com/gag/awrwVL1/hue-hue-hue
Apache Hue overview H D A E U http://gethue.com/
Apache Hue DOM XSS var _anchor = $("a[name='" + decodeURIComponent(window.location.hash.substring(1)) + "']").last(); Payload: URL/help/#<img src="x" onerror="alert(1)"> H D A E U
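The problem on this slide is that the decoded URL fragment is concatenated straight into a jQuery selector, and older jQuery versions may treat a string containing markup as HTML to create. A minimal Python sketch of the decoding step and of one possible mitigation, an allowlist on the anchor name; the `SAFE_ANCHOR` pattern, the function names and the `hue.example` host are assumptions for illustration, not Hue's actual fix:

```python
import re
from urllib.parse import unquote

# Hypothetical allowlist: anchor names made only of letters, digits, '_' and '-'.
SAFE_ANCHOR = re.compile(r"[A-Za-z0-9_-]+")


def decoded_fragment(url):
    """Mimic decodeURIComponent(window.location.hash.substring(1))."""
    _, _, fragment = url.partition("#")
    return unquote(fragment)


def is_safe_anchor(name):
    """Accept the anchor name only if it matches the allowlist completely."""
    return SAFE_ANCHOR.fullmatch(name) is not None


payload_url = 'http://hue.example/help/#<img src="x" onerror="alert(1)">'
print(is_safe_anchor(decoded_fragment(payload_url)))
print(is_safe_anchor(decoded_fragment("http://hue.example/help/#install-guide")))
```

Rejecting anything outside the allowlist before it reaches the selector keeps attacker-controlled markup out of the DOM regardless of how the selector library parses it.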
Target old Hadoop installation (with Hue 2.6.1, Django 1.2.3) Target a user with access to Hue Send him XSS Get access to all Hadoop data designated for the user Apache Hue attack scenario
Default configurations suck X-Frame-Options:ALLOWALL
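`ALLOWALL` is not a restrictive X-Frame-Options value, so a response carrying it can still be framed by any site. A minimal sketch of a check a tester could run over captured response headers; the function name and the decision to also accept a CSP `frame-ancestors` directive are assumptions of this example:

```python
def clickjacking_protected(headers):
    """Return True if the response headers restrict framing by other origins."""
    normalized = {key.lower(): value for key, value in headers.items()}

    # Only DENY and SAMEORIGIN are treated as protective here.
    xfo = normalized.get("x-frame-options", "").strip().upper()
    if xfo in ("DENY", "SAMEORIGIN"):
        return True

    # Fall back to a CSP frame-ancestors directive, unless it is a bare wildcard.
    csp = normalized.get("content-security-policy", "").lower()
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0] == "frame-ancestors":
            return "*" not in parts[1:]
    return False


print(clickjacking_protected({"X-Frame-Options": "ALLOWALL"}))
print(clickjacking_protected({"X-Frame-Options": "SAMEORIGIN"}))
```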
ADMIN INTERFACES for admins and maintenance
Hadoop Distros specifics Admin ifaces external ifaces User ifaces Admin interfaces
Admin interfaces Apache Ambari • Provisioning, monitoring Apache Ranger • Security: authorization, authentication, auditing, data encryption, administration Other • Knox, Cloudbreak, Zookeeper, Falcon, Atlas, Sqoop, Flume, Kafka H D A E U
Apache Ambari A bit about Ambari http://www.slideshare.net/hortonworks/ambari-using-a-local-repository?next_slideshow=1 H D A E U
Apache Ambari http://www.slideshare.net/hortonworks/ambari-using-a-local-repository?next_slideshow=1 H D A E U
Is Ambari an internal interface? H D A E U http://knowyourmeme.com/memes/facepalm
Apache Ambari • Standard users can sign into Ambari (WHY?) • Low-hanging fruit: directory listing by default, no cookie flags, no CSRF protection • Interesting proxy script -> H D A E U
Apache Ambari REST API proxy Standard request: /proxy?url=http://XXXXXXXXX:8188/ws/v1/timeline/HIVE_QUERY_ID?limit=1&secondaryFilter=tez:true&_=1424180016625 Tampered request (logs accessible only from DMZ): /proxy?url=http://google.com /proxy?url=http://XXXXXXX:8088/logs /proxy?url=http://XXXXXXX:8088/logs/yarn-yarn-resourcemanager-XXXXXXX.log H D A E U
Apache Ambari Server Side Request Forgery H D A E U CVE-2015-1775
Apache Ambari attack scenario Target old Hadoop installation with Ambari 1.5.0 to 2.0.2 Hijack standard account (or use Hue XSS to perform CSRF) Log into Ambari, use CVE-2015-1775 Get access to local network (DMZ) – HTTP only Download logs, exploit other Hadoop servers in DMZ H D A E U
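The scenario above works because the proxy fetches whatever URL it is given. One common countermeasure is to validate the `url` parameter against an allowlist before making the request. A minimal sketch of that idea; the host name, port and path prefix are hypothetical, and this is not a description of how Ambari was actually patched:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only backend the proxy is meant to reach.
ALLOWED_TARGETS = {("timeline.hadoop.example", 8188)}
ALLOWED_PATH_PREFIXES = ("/ws/v1/timeline/",)


def proxy_target_allowed(url):
    """Return True only for URLs pointing at an allowlisted host, port and path."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return False
    try:
        port = parts.port or 80
    except ValueError:  # malformed or out-of-range port
        return False
    if (parts.hostname, port) not in ALLOWED_TARGETS:
        return False
    # Refuse parent-directory segments so the prefix check cannot be bypassed.
    if ".." in parts.path.split("/"):
        return False
    return parts.path.startswith(ALLOWED_PATH_PREFIXES)


print(proxy_target_allowed("http://timeline.hadoop.example:8188/ws/v1/timeline/HIVE_QUERY_ID?limit=1"))
print(proxy_target_allowed("http://timeline.hadoop.example:8088/logs"))
```

With a check like this, the tampered requests for `google.com` and for the `:8088/logs` endpoint would be refused before any outbound connection is made.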
Hadoop Distros specifics Admin ifaces external ifaces User ifaces Admin interfaces
Apache Ranger overview Previously: Apache Argus, XA-Secure Provides central administration for policies, users/groups, analytics and audit data. H D A E U http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_Sys_Admin_Guides/content/ref-746ce51a-9bdc-4fef-85a6-69564089a8a6.1.html
Apache Ranger overview H D A E U http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
Apache Ranger • Low-hanging fruit: no HTTP hardening, SlowHTTP DoS • Standard users can log into Ranger but have no permissions • Interesting function level access control ->
Apache Ranger vulnerabilities H D A E U
Missing function level access control H D A E U CVE-2015-0266
Apache Ranger attack scenario Target an old Hadoop installation (Apache Ranger 0.4 or XA-Secure v. 3.5.001) Hijack standard Hadoop account Log into Ranger (with low permissions) Use CVE-2015-0266 to escalate privileges Edit accounts, authorization rules, access policies H D A E U
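Missing function level access control means the server runs a privileged action without checking the caller's role, typically because only the UI hides the button. A minimal sketch of a server-side check, written as a decorator; the `user` dictionary shape, the role names and `update_policy` are hypothetical and do not mirror Ranger's code:

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when the caller lacks the role required by a handler."""


def require_role(role):
    """Decorator that enforces a role check before the handler runs."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            if role not in user.get("roles", ()):
                raise Forbidden(f"{handler.__name__} requires role {role!r}")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def update_policy(user, policy_id, changes):
    """Hypothetical privileged action: change an access policy."""
    return {"policy": policy_id, "updated_by": user["name"], **changes}


admin = {"name": "alice", "roles": ["admin"]}
standard = {"name": "bob", "roles": ["user"]}
print(update_policy(admin, 7, {"enabled": False}))
```

Calling `update_policy(standard, 7, {})` raises `Forbidden`, because the check lives on the function itself and not in the interface that calls it.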
Apache Ranger vulnerabilities H D A E U https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5+-+User+Guide
Apache Ranger XSS through UserAgent User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) <script>alert(1);</script> H D A E U CVE-2015-0265
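The User-Agent header is attacker-controlled input, so it needs output encoding wherever it is rendered, for example in a session list viewed by an admin. A minimal sketch using Python's standard `html.escape`; the row layout and function name are made up for illustration and are not Ranger's template:

```python
import html


def render_session_row(login, user_agent):
    """Build one HTML table row, escaping both untrusted values."""
    return "<tr><td>{}</td><td>{}</td></tr>".format(
        html.escape(login), html.escape(user_agent)
    )


malicious_agent = (
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) "
    "<script>alert(1);</script>"
)
print(render_session_row("anyuser", malicious_agent))
```

After escaping, the script tag reaches the browser as inert text (`&lt;script&gt;...`) and is displayed, not executed.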
Apache Ranger attack scenario Target an old Hadoop installation (Apache Ranger 0.4 or XA-Secure v. 3.5.001) Network access to Apache Ranger is necessary (either from the internet or local network) Attempt to log in with any user and password, sending the XSS payload in the User-Agent header You don’t need to escalate privileges, you’re already an admin (after an admin opens the session tab) Deploy BeEF or any other CSRF script to create users and change policies H D A E U
• Affected version: Apache Ranger v 0.4.0, XA Secure v. 3.5.001 • Both vulnerabilities patched in Ranger v 0.5.0 • For a while developers did a self-full-disclosure -> Apache Ranger patched
RANGER-284 in public Jira now H D A E U
RANGER-284 shortly after vendor contact H D A E U
DISTRIBUTIONS SPECIFICS not in every environment
Hadoop Distros specifics Admin ifaces external ifaces User ifaces Distribution specifics
Distros H D A E U http://blog.cloudera.com/blog/2012/07/the-hadoop-ecosystem-visualized-in-datameer/
cloud based hosted locally Basic distinction
Distros How long does it take to create a new distro version? How many components are outdated at that time? How long does it take to deploy a new distro at a company? How many components are outdated at that time? H D A E U Most cases: • MAJOR – ca. 1 year • MINOR – ca. 3 months • PATCH – ca. 1-2 months (differs much)
Hortonworks HDP components by version http://hortonworks.com/hdp/whats-new/
Distros Old components with known issues • Old OS components (java, php, ruby, etc.) • Old OS components (e.g. old tomcat used by Oozie and HDFS) • Old Hadoop components (e.g. old Hue, Ambari, Ranger) Default passwords Default configuration H D A E U
Vulnerability timeline • vuln found (e.g. Ambari) -> Hadoop patched -> distro update -> deployment • vuln found (e.g. jQuery) -> jQuery patched -> Django patched -> Hue update -> distro update -> deployment Responsible disclosure? Full disclosure? H D A E U
Distros Old components with known issues Default passwords • SSH keys configured but default passwords still work • Default mysql passwords, NO mysql passwords Default configuration H D A E U
Distros Old components with known issues Default passwords Default configuration • No network level hardening • No HTTP hardening (clickjacking, session mgmt, errors) • Hue uses Django with DEBUG turned on by default • „Hacking virtual appliances” by Jeremy Brown H D A E U
Default configurations suck X-Frame-Options:ALLOWALL H D A E U
EXTERNAL INTERFACES for clients or whatever
Hadoop Distros specifics Admin ifaces external ifaces User ifaces External interfaces
External • More than 25 internal Apache apps/modules • Vendor/distro specific apps/interfaces • Popular monitoring: Ganglia, Splunk • Auth providers: LDAP, Kerberos, OAuth • Many apps, many targets H D A E U
Hadoop Hadoop Distros specifics Admin ifaces external ifaces User ifaces
SUMMARY ways to protect your big data environment
Ways to protect your Hadoop environment Excessive network access • Keep it super tight! Excessive user permissions Typical web vulnerabilities Obsolete software Distros dependent vulnerabilities External system connections
Ways to protect your Hadoop environment Excessive network access Excessive user permissions • Map business roles to permissions Typical web vulnerabilities Obsolete software Distros dependent vulnerabilities External system connections
Ways to protect your Hadoop environment Excessive network access Excessive user permissions Typical web vulnerabilities • Pentest it! Introduce application independent security countermeasures Obsolete software Distros dependent vulnerabilities External system connections
Ways to protect your Hadoop environment Excessive network access Excessive user permissions Typical web vulnerabilities Obsolete software • Make a list of all components. Monitor bugtracks and CVEs. Distros dependent vulnerabilities External system connections
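Keeping a component list only helps if it is compared against known advisories. A minimal sketch of such a comparison, seeded with the affected versions named in this talk (Ambari 1.5.0 to 2.0.2 for CVE-2015-1775, Ranger 0.4.0 for CVE-2015-0265 and CVE-2015-0266); the inventory format and function names are assumptions, and a real setup would pull advisories from a maintained feed:

```python
def parse_version(text):
    """Turn a dotted version such as '2.0.2' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# (component, lowest affected, highest affected, CVE), bounds inclusive,
# as stated on the slides of this talk.
KNOWN_AFFECTED = [
    ("ambari", "1.5.0", "2.0.2", "CVE-2015-1775"),
    ("ranger", "0.4.0", "0.4.0", "CVE-2015-0265"),
    ("ranger", "0.4.0", "0.4.0", "CVE-2015-0266"),
]


def affected_cves(inventory):
    """Return sorted (component, version, CVE) hits for a {name: version} inventory."""
    hits = []
    for name, version in inventory.items():
        current = parse_version(version)
        for component, low, high, cve in KNOWN_AFFECTED:
            if name == component and parse_version(low) <= current <= parse_version(high):
                hits.append((name, version, cve))
    return sorted(hits)


# Hypothetical inventory of one cluster.
print(affected_cves({"ambari": "2.0.1", "ranger": "0.5.0", "hue": "2.6.1"}))
```

Components absent from the advisory table, such as Hue here, produce no hit, which is a reminder that the table has to be kept current for the check to mean anything.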
Ways to protect your Hadoop environment Excessive network access Excessive user permissions Typical web vulnerabilities Obsolete software Distros dependent vulnerabilities • A pentest after integration is a must. Demand security from software suppliers. External system connections
Ways to protect your Hadoop environment Excessive network access Excessive user permissions Typical web vulnerabilities Obsolete software Distros dependent vulnerabilities External system connections • Make a list of all external system connections. Do threat modeling and pentest the corresponding systems.
Thank you jakub.kaluzny@securing.pl MORE THAN SECURITY TESTING Contact me for additional materials @j_kaluzny
