Sandboxing JS and HTML. A Lesson Learned

Sandboxing HTML pages with API wrappers as a countermeasure to malicious 3rd party code attacks. A lesson learned. Stefano Di Paola, CTO & Chief Scientist Minded Security

$ Whoami
- Security since 1999
- OWASP Italy Director of Research
- Server side (HPP, Expression Language Injection...)
- Client side (UXSS, SWFIntruder, DOMinator)
- Work @ Minded Security: CTO & Chief Scientist

Agenda
- Intro
- The Problem
- Attempts
- Drawbacks
- Conclusion

Browser Features History
- Same Origin Policy: 1995
- XMLHttpRequest: 2002

Subverting SOP – Traditional way
The solution is easy: encode ALL dangerous inputs as HTML entities, right? Not quite.
taintedInput=<script>evilJs</script>
<html>.. <script>evilJs</script> ..</html>
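A minimal sketch of the "encode everything" idea, and of one reason it is "not quite" enough. The `escapeHtml` helper and the `javascript:` payload are illustrative assumptions, not taken from the slides. Entity encoding neutralizes markup injected into an HTML text context, but a payload that contains no markup characters passes through unchanged and is still dangerous if it lands in, say, an `href` attribute.

```javascript
// Hypothetical helper: encode the five characters that matter in HTML text
// and quoted-attribute contexts.
function escapeHtml(input) {
  return String(input)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The tainted input from the slide is neutralized in an HTML text context.
const taintedInput = '<script>evilJs</script>';
const encoded = escapeHtml(taintedInput);

// A URL payload contains none of those characters, so encoding changes nothing.
const urlPayload = 'javascript:evilJs()';
const encodedUrl = escapeHtml(urlPayload);

console.log(encoded);                   // &lt;script&gt;evilJs&lt;/script&gt;
console.log(encodedUrl === urlPayload); // true
```

The encoding has to match the context in which the value is finally parsed, which is the theme the rest of the deck develops.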

The Problem
- Browser SOP
- Industry needs to:
  - Safely allow external 3rd party code (advertising et al.)
  - Let users customize pages (Facebook et al.)

Industry Needs - Server Side Filtering
- Social portals want their users to be free to customize their home page.
- SOP is too loose.
- Solution?
  - Create server side filters that allow only an HTML subset
It's 2005: MySpace is the most used social site, with 40 million unique visitors.

Server Side HTML Filtering - MySpace
- MySpace approach:
  - Whitelisted tags/attributes
    - only img, embed, div
    - (blocked <script>, on* etc.)
  - Style allowed? Yes
  - Word blacklisting:
    - E.g.: Javascript

MySpace Worm – Samy is my hero
- Samy bypasses:
  - JavaScript: stripped from EVERYWHERE, replaced with '..'
http://namb.la/popular/tech.html
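A minimal sketch of why word blacklisting is fragile. The `stripJavascriptWord` function is a hypothetical stand-in for the MySpace filter, not its real code. According to Samy's write-up linked above, some browsers of the time interpreted "java", a newline, then "script" as "javascript", so a payload that splits the word is not matched by the filter yet is still executed by the browser.

```javascript
// Hypothetical filter modeled on the slide: every occurrence of the word
// "javascript" (any case) is replaced with "..".
function stripJavascriptWord(input) {
  return input.replace(/javascript/gi, '..');
}

const directPayload = "background:url('javascript:alert(1)')";
// Same payload with a newline splitting the blacklisted word.
const splitPayload = "background:url('java\nscript:alert(1)')";

const filteredDirect = stripJavascriptWord(directPayload);
const filteredSplit = stripJavascriptWord(splitPayload);

console.log(filteredDirect);                 // background:url('..:alert(1)')
console.log(filteredSplit === splitPayload); // true: the filter did not match
```

The filter and the browser disagree about what counts as the word "javascript". That disagreement is the bypass.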

Server Side HTML Filtering - OWASP Anti Samy
- Attempts to fix the MySpace approach on the server side.
- Author: Arshan Dabirsiaghi
- Translates HTML to well-formed XML
- Whitelist of tags and attributes
- Everything else is encoded
- One bypass so far (usual problem) – fixed:
  <![CDATA[ .. ]]> ← AntiSamy expects this, but ...
  <![CDATA[ .. ]> ← works on every browser.
https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project

Server Side JS/HTML Filter - Caja
- ~2008
- Author: Google
- The 3rd party code problem, aka sandboxed advertising
[...The Caja Compiler is a tool for making third party HTML, CSS and JavaScript safe to embed in your website. ..]
https://developers.google.com/caja/docs/about/

Google Attempt – Server Side Filter
Takes JS/ES5 strict mode, HTML, and CSS input and rewrites it into:
* a safe subset of HTML & CSS
* a JavaScript function with no free variables (code rewrite)

Google Caja – Some Bypasses
- 2009 issues:
  - Arbitrary code execution via DOM wrappers.
  - Flaw in JSON parsing.
  - JavaScript URLs in style attributes not sanitized (Samy, anyone?).
- 2013:
  - JavaScript parsers differ on whether they interpret escaped sequences of letters spelling a reserved word, such as "de\u006Cete", as an identifier or a reserved word.
https://github.com/google/caja/wiki/SecurityAdvisories
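A small, related sketch of why Unicode escapes are a problem for anything that inspects source text. It is not the Caja bug itself, which concerned reserved words. The `passesTextualCheck` function and the choice of "constructor" as the forbidden word are assumptions made for illustration. The check looks at raw characters, while the JavaScript engine decodes the escape when it tokenizes the name.

```javascript
// Hypothetical textual check: reject any source that mentions "constructor".
function passesTextualCheck(source) {
  return !/constructor/.test(source);
}

const plainSource = '({}).constructor';
// The doubled backslash puts a literal \u006F escape into the source text.
const escapedSource = '({}).c\\u006Fnstructor';

const plainAllowed = passesTextualCheck(plainSource);     // false: rejected
const escapedAllowed = passesTextualCheck(escapedSource); // true: slips through

// The engine decodes the escape while tokenizing, so the escaped form names
// the same property. Indirect eval is used here only to run the source text.
const resolved = (0, eval)(escapedSource);

console.log(plainAllowed, escapedAllowed, resolved === Object); // false true true
```

A rewriter has to normalize escapes exactly as the target engine does before making any decision. That is hard when engines disagree.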

Server Side Filtering – Lesson Learned
- Negative security models are error prone
- Different browsers behave differently:
  - Hard to make general assumptions
  - Browsers do not always strictly implement RFCs and siblings
- Filtering something that is going to be parsed by a different parser is hard (models impedance mismatch)

Client side HTML Filtering - HTML Data Bindings
- 2007 HTML Data Binding – Author: Stefano Di Paola
- Tries to overcome the server side problem
- HTML sanitizer using JS + the SQL prepared statements approach
- and uses the browser's native parser.
http://www.wisec.it/sectou.php?id=46c5843ea4900
// URL Native Parsing
var a = document.createElement('A');
return a.protocol
var el=document.getElementById('someid')
var val = document.createTextNode(binding['id'])
el.appendChild( val );
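A minimal sketch of the "let the native parser decide" idea for URLs. The slide's fragment reads `a.protocol` from an anchor element. In this sketch the standard `URL` class stands in for the anchor so that it also runs outside a browser. The allow-list, the base URL and the `isAllowedUrl` name are assumptions for illustration.

```javascript
// Hypothetical allow-list of URL schemes.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical base used to resolve relative URLs.
const BASE_URL = 'https://example.com/';

function isAllowedUrl(untrusted) {
  try {
    // Let the platform URL parser decide what the scheme really is,
    // instead of pattern-matching on the raw string.
    const parsed = new URL(untrusted, BASE_URL);
    return ALLOWED_PROTOCOLS.has(parsed.protocol);
  } catch (err) {
    // Input the parser cannot handle is rejected.
    return false;
  }
}

console.log(isAllowedUrl('/profile/42'));            // true (resolves to https:)
console.log(isAllowedUrl('JaVaScRiPt:alert(1)'));    // false
console.log(isAllowedUrl(' java\tscript:alert(1)')); // false
```

The parser normalizes case and strips whitespace before the scheme is compared. A string-based filter would have to replicate those steps by hand.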

HTML Data Bindings - Bypass
- Variable-width charsets could mess up every tag (half-fixed)
- The <plaintext> tag! Leads to bypassing the binding area (fixed... sort of...)
- Had to face browser complexity + some wrong assumptions!
- Anyway, it was just a proof of concept.. :)
http://www.wisec.it/ph/test.php

Client Side JS Filtering - JSReg.*
- ~2010
- Author: Gareth Heyes
- JS sandbox. Uses a JS tokenizer based on complex RegExps.
- Code: https://code.google.com/p/jsreg/source/browse/trunk/JSReg/JSReg.js

Client Side Filtering - JSReg.*
- Code rewriting approach + sandboxed checks
http://www.businessinfo.co.uk/labs/jsreg/jsreg.html
https://code.google.com/p/jsreg/

JSReg.* – Bypasses
- 2010-2011
(/[/]/)[/(/))/+alert(top)+"/"/i]
".. first is the failure to strip the single line comment which then fools the regex rule into thinking that the code is a regex object and not function calls .."
http://marc.info/?l=websecurity&m=126855547523766
http://www.thespanner.co.uk/2010/10/31/jsreg-bypasses/

Client Side JS Filtering - MentalJS
- 2011
- Author: Gareth Heyes
- ES5 JS sandbox, parser based. No RegExp this time
- Rewrites the JS code.
https://github.com/hackvertor/MentalJS

MentalJS - Bypasses
- 2014-2015
- Whitelisted attribute innerHTML for script elements
- insertBefore with null/undefined: bypass via unexpected browser behavior.
- Other very interesting bypasses:
x=document.createElement('script');
x.innerHTML='alert(location)';
document.body.appendChild(x);
s=document.createElement('script');
s.insertBefore(document.createTextNode('alert(location)'),null);
document.body.appendChild(s);
http://www.thespanner.co.uk/2015/05/03/how-i-smashed-mentaljs/

Client Side JS Filtering – Evel (Secure Eval)
- 2013
- Author: Nathan Vander Wilt
- JS sandbox using the browser parser and environment redefinition (runtime sandbox).
https://github.com/natevw/evel/

Evel - Bypasses
- 2013
- All of them defeated it by reaching the window object.
- If one reaches Function(), the this object is always the window.
- Function is the constructor of all functions.
http://perfectionkills.com/global-eval-what-are-the-options/#indirect_eval_call_theory
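A toy illustration of the escape route described above. The `runShadowed` sandbox below is hypothetical and far simpler than Evel: it only hides well-known global names behind parameters. The point it sketches is that any function value leads to `Function` through `.constructor`, and a function built by `Function()` runs non-strict in the global scope, so its `this` is the real global object (`window` in a browser, `globalThis` in general).

```javascript
// Toy runtime sandbox (hypothetical, not Evel's implementation): well-known
// global names are shadowed by parameters that are all left undefined.
function runShadowed(expression) {
  const shadowed = ['window', 'self', 'global', 'globalThis'];
  const body = '"use strict"; return (' + expression + ');';
  const runner = new Function(...shadowed, body);
  return runner();
}

// Direct access to a shadowed name yields nothing useful.
const directAccess = runShadowed('typeof globalThis');

// Any function literal leads to Function via .constructor. Functions built by
// Function() are non-strict, so calling one without a receiver makes `this`
// the real global object.
const escapedGlobal = runShadowed('(function(){}).constructor("return this")()');

console.log(directAccess);                 // undefined
console.log(escapedGlobal === globalThis); // true
```

Shadowing names is not enough. Every path to `Function` (including `constructor` on any reachable function) has to be closed as well.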

Client Side HTML Filtering - DOMPurify
- 2014
- Author: Mario Heiderich
- Uses the browser's internal parser to create a DOM model and then sanitizes untrusted HTML:
https://github.com/cure53/DOMPurify

Client Side HTML Filtering - DOMPurify
- Sanitizes against a whitelist of tags and attributes:
- Uses JS to access the DOM:
- As seen in Data Binding, there are several browser subtleties.
https://github.com/cure53/DOMPurify

DOMPurify – Bypasses
- DOM Clobbering checks bypass:
- Attack: oroush.secproject.com/blog/2014/04/how-did-i-bypass-everything-in-modsecurity-evasion-c
http://www.thespanner.co.uk/2013/05/16/dom-clobbering/

AngularJS - Sandbox
- 2013 -
- Author: Google

AngularJS – Bypasses
- Bypass of the blacklisted function call prevention:
https://code.google.com/p/mustache-security/wiki/AngularJS

JS Sandbox Approach - Attacks
- Fool the parser into a wrong state (impedance mismatch, different parsers) – code rewrite
- Fool the sandbox via unexpected browser behavior. E.g. DOM Clobbering et al.
- Access unsanitized members (constructor, prototype etc.)

HTML Sandbox Approach - Attacks
- Fool the parser into a wrong state (impedance mismatch) – different parser
- Fool the native parser (bad assumptions), as browsers have 5+ parsers.
  - E.g.: is createHTMLDocument implemented as expected?
- Fool the sandbox via unexpected browser DOM behavior.
  - E.g.: are attribute names and values correctly normalized?
- Access unsanitized members (constructor, prototype etc.)

Client Side – Lesson Learned
- It may really be the right direction, but:
  - JS: Code rewriting is hard in the JS context, as it requires a separate parser.
  - HTML: Using the browser's parsers makes it possible to identify automatically, with no particular effort, the right context, at the right time, with the right charset.
  - HTML: Browsers still have 5+ (!!) parsers (HTML, URL, CSS, JS, SVG, ...). They need to be applied at the right time!
  - HTML + JS: Intricate ecosystem (DOM Clobbering)

Disclaimer
- The bypasses of the sandboxes were all fixed.
- Not all sandboxes are still maintained, but most of them are.
- It's still unproven that the proposed solutions are completely safe...
- ...but without breakers' feedback, no bypass would have been found. (The world needs good brains to break things, and then fix them!)
- The authors are brave and smart people trying to solve a complex problem with passion and reasoning.

Conclusions
- The filtering approach to sanitizing/sandboxing untrusted code is a hard problem.
- Several attempts have been made over the years.
- Using a different layer to sanitize code that will be interpreted in a complex environment is usually a bad idea (e.g. server side → client side).
- A fully functional, unbreakable solution is yet to be released.
- Browser vendors and sandbox builders should join together to solve the problem.

Questions?
Stefano Di Paola
Mail: Stefano.dipaola@mindedsecurity.com
Twitter: @WisecWisec
Blog: blog.mindedsecurity.com
Site: www.mindedsecurity.com
Thanks!
