APACHE SLING & FRIENDS TECH MEETUP
BERLIN, 25-27 SEPTEMBER 2017

Building an Apache Sling Rendering Farm
Bertrand Delacretaz (@bdelacretaz)
Sling committer and PMC member
Principal Scientist, Adobe AEM team
Slides revision 2017-09-25

Building an Apache Sling Rendering Farm - Bertrand Delacretaz, adaptTo 2017

2. What are we building? Setting the stage

3. How is Sling used today?

[Diagram: The Web, Load Balancing, four Publishing Instances (each with Rendering + Caching and a Content Repository), Content Distribution, Authoring (with its own Content Repository)]

Sling instances are dedicated to single tenants or to “friendly” tenants.

4. A Massive Sling Rendering/Processing Farm?

[Diagram: Content Repository, Load Balancing, Resource Resolution (x4), Load Balancing, Scripting + Rendering (x4)]

  • Elastic scaling at each stage.
  • Multiple developers (“tenants”) see their own world only.

5. Federated Services: this is 2017 after all

6. Microservices! Nice and trendy, but will that perform?

[Diagram: HTTP front-end, Sling Engine, Resource Resolver, Content Repository, Script Resolver, Scripting and Rendering, all linked over HTTP]

Each component is an independent HTTP-based service, aka “religious microservices”.

7. The Sling Pipeline: faithfully serving requests since 2007!

8. Sling Request Processing Pipeline

[Diagram: Request, Content Repository, Resource Resolver, Resource, Script Resolver, Script, Scripting and Rendering, Output; steps numbered 1 to 4, with sling:include adding steps 5..N for content aggregation]

  • Conceptually, the request hits the repository first, to get the Resource.
  • Scripts and Servlets are equivalent; only scripts are considered here.
  • All in-memory and in-process!
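
The pipeline described above can be modelled in a few lines of plain Java. This is only an illustrative sketch, not Sling's API: the `PipelineSketch` class, its two maps and the `render` method are hypothetical stand-ins for the repository, the script resolver and script execution.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative model of the pipeline; none of these names are Sling APIs. */
final class PipelineSketch {

    /** Stand-in for the content repository: path -> resource type. */
    private final Map<String, String> repository = new HashMap<>();

    /** Stand-in for the script resolver: resource type -> rendering script. */
    private final Map<String, Function<String, String>> scripts = new HashMap<>();

    void addResource(String path, String resourceType) {
        repository.put(path, resourceType);
    }

    void addScript(String resourceType, Function<String, String> script) {
        scripts.put(resourceType, script);
    }

    /** Request path -> resource -> script -> output, all in one process. */
    String render(String path) {
        // Resource resolution: the request hits the repository first.
        String resourceType = repository.get(path);
        if (resourceType == null) {
            return "404 " + path;
        }
        // Script resolution: pick a script based on the resource type.
        Function<String, String> script = scripts.get(resourceType);
        if (script == null) {
            return "no script for " + resourceType;
        }
        // Rendering: execute the script to produce the output.
        return script.apply(path);
    }

    public static void main(String[] args) {
        PipelineSketch pipeline = new PipelineSketch();
        pipeline.addResource("/content/page", "myapp/page");
        pipeline.addScript("myapp/page", path -> "<h1>" + path + "</h1>");
        System.out.println(pipeline.render("/content/page"));
    }
}
```

Every step here is a plain in-memory method call, which is the property the federated designs on the following slides have to trade against isolation.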

9. Federated Services Friendly?

[Diagram: Request, Resource Resolver, Content Repository, Content Aggregator, Aggregated Content, Script Resolver, Scripts, Scripting and Rendering, Output; process boundaries marked between the stages]

10. Reasonably Federated? Can we get isolation AND performance?

11. Reasonably Federated Sling Rendering Farm?

[Diagram: HTTP front-end, Content Provider Service, Content Repository, Resource Resolver, Content Aggregator, Aggregated Content, Content Rendering Service, Script Resolver, Scripts Repository, Scripting and Rendering, Output; annotations: Sandboxed Execution (twice), Isolated Content]

  • It’s still mostly Sling, with the addition of a (scripted?) content aggregation step.
  • Federated services provide more deployment and scaling options.

12. Sandboxing & Isolation: how?

13. Sandboxing & Isolation?

Isolated Content (Content Repository, Content Aggregator):
  • Repository Access Control can work but requires a dynamic search path in Sling, see our experiments. It impacts caching, and a mapping of incoming paths to resource paths is needed. Tried and tested.
  • Repository jails look possible, with probable impact on Sling internals. Same with multiple SlingRepository services. New, and more like a blacklist.

Sandboxed Execution (Scripting and Rendering, Content Aggregator):
  • Custom, restricted languages are the safest? HTL (Use-API?), Handlebars?
  • Sandboxing Nashorn (JavaScript) looks possible but not ideal, see our experiments.
  • Sandboxing Java is not realistic: IBM canceled its multi-tenant JVM project, for example.

14. But it’s a VM, right?

[Diagram: three Java Virtual Machines, each with its own content, Oak Libraries, Sling Engine, Java classes memory space and Application memory space]

Perfect isolation! But suboptimal use of resources (and containers wouldn’t help).

15. Sandboxing scripting languages?

    <%
    var length = 0;
    if (request.getRequestParameter("file") != null) {
        var file = null;
        // store file
        var reqPara = request.getRequestParameter("file");
        var is = reqPara.getInputStream();
        file = Packages.java.io.File.createTempFile("posttest", ".txt");
        var fout = new Packages.java.io.FileOutputStream(file);
        var c;
        while ((c = is.read()) != -1) {
            fout.write(c);
        }
        fout.close();
        // read length
        length = file.length();
    }
    %>

Callouts on the script: OS Resources, Infinite Loops, Java classes & services, Memory Usage?

  • Many things need to be limited.
  • Whitelist approach is much safer -> custom languages?
  • HTL is inherently sandboxed, except for its Use-objects.
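
Why a whitelist is safer than a blacklist can be shown with a small sketch. The `ClassAccessPolicy` class below is hypothetical and not part of Sling or Nashorn; it only decides, by name, which Java classes a script would be allowed to use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: deciding which Java classes a script may use. */
final class ClassAccessPolicy {

    private final Set<String> names;
    private final boolean whitelist;

    private ClassAccessPolicy(Set<String> names, boolean whitelist) {
        this.names = names;
        this.whitelist = whitelist;
    }

    /** Only the listed classes are allowed; everything else is denied. */
    static ClassAccessPolicy whitelist(String... allowed) {
        return new ClassAccessPolicy(new HashSet<>(Arrays.asList(allowed)), true);
    }

    /** Only the listed classes are denied; everything else is allowed. */
    static ClassAccessPolicy blacklist(String... denied) {
        return new ClassAccessPolicy(new HashSet<>(Arrays.asList(denied)), false);
    }

    boolean allows(String className) {
        // Whitelist: listed names pass. Blacklist: listed names are blocked.
        return whitelist == names.contains(className);
    }

    public static void main(String[] args) {
        ClassAccessPolicy blacklist = ClassAccessPolicy.blacklist("java.io.File");
        ClassAccessPolicy whitelist = ClassAccessPolicy.whitelist("java.lang.String");
        // The blacklist forgot FileOutputStream, so the script above could still write files.
        System.out.println(blacklist.allows("java.io.FileOutputStream"));
        // The whitelist denies anything that was not explicitly listed.
        System.out.println(whitelist.allows("java.io.FileOutputStream"));
    }
}
```

A blacklist has to anticipate every dangerous class, while a whitelist fails closed for anything its author did not think of.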

16. Containers?

[Diagram: three OS containers, each with content, Oak Libraries, Sling Engine, a Java classes memory space and an Application memory space marked “SMALL!”, plus a shared memory area holding Shared Memory Pools, Caches etc.]

  • Same problem as multiple JVMs.
  • Sharing caches, compiled scripts etc. can be a pragmatic solution.

17. What do we do?

18. Hybrid Sling Rendering Farm

[Diagram: HTTP front-end; Shared Services (Resource Resolver, Content Aggregator, Oak Libraries) with Isolated Content and Sandboxed Execution; Annotated Aggregated Content; HTTP routing (new component, content-driven routing); Shared Services (Script Resolver, scripts, Scripting + Rendering) with Dynamic Search Path; Tenant-Specific Services (Script Resolver, scripts, servlets, content, Custom Code)]

  • Provides the flexibility of Sling via tenant-specific services and dynamic routing.
  • Uses shared services for the common parts.
  • Allows for billable options depending on the actual routing.
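
Content-driven routing could look roughly like the sketch below. All names (`ContentRouter`, `AnnotatedContent`, the `needsCustomCode` annotation) are assumptions made for illustration; the slides do not define the annotation format or the routing component's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of content-driven routing; not part of Sling. */
final class ContentRouter {

    /** Aggregated content plus the annotations that drive routing. */
    static final class AnnotatedContent {
        final String tenant;
        final boolean needsCustomCode;
        final String body;

        AnnotatedContent(String tenant, boolean needsCustomCode, String body) {
            this.tenant = tenant;
            this.needsCustomCode = needsCustomCode;
            this.body = body;
        }
    }

    private final Function<AnnotatedContent, String> sharedService;
    private final Map<String, Function<AnnotatedContent, String>> tenantServices = new HashMap<>();

    ContentRouter(Function<AnnotatedContent, String> sharedService) {
        this.sharedService = sharedService;
    }

    void registerTenantService(String tenant, Function<AnnotatedContent, String> service) {
        tenantServices.put(tenant, service);
    }

    /** Picks the rendering service from the content's annotations. */
    String route(AnnotatedContent content) {
        if (content.needsCustomCode) {
            Function<AnnotatedContent, String> service = tenantServices.get(content.tenant);
            if (service != null) {
                return service.apply(content);
            }
        }
        // Assumption: anything without a matching tenant service uses the shared one.
        return sharedService.apply(content);
    }

    public static void main(String[] args) {
        ContentRouter router = new ContentRouter(c -> "shared:" + c.body);
        router.registerTenantService("tenantA", c -> "tenantA:" + c.body);
        System.out.println(router.route(new AnnotatedContent("tenantA", true, "page")));
        System.out.println(router.route(new AnnotatedContent("tenantB", false, "page")));
    }
}
```

Because the decision is taken per request from the content itself, the route actually taken is known and could feed the billable options mentioned above.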

19. Experiments: building blocks that might be reusable

20. Experiment: Resolving new types of scripts

[Diagram: Client, GET Request, Script Resolver, Wrapped AGG Request, Content Repository (/apps/myapp/AGG.js), AGG.js script text]

Wrap the request to make it appear as an AGG (*) request and pass that to the Sling ServletResolver. Adapt the returned SlingScript to an InputStream to read its text.

(*) or any other non-existent HTTP verb.

Code at https://github.com/bdelacretaz/sling-adaptto-2017 (ContentBVP.java)
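
21. Experiment: Resolving a SLING-CONTENT script

Code at https://github.com/bdelacretaz/sling-adaptto-2017 (ContentBVP.java)

    import java.io.IOException;
    import java.io.InputStream;
    import javax.servlet.Servlet;
    import org.apache.commons.io.IOUtils;
    import org.apache.sling.api.SlingHttpServletRequest;
    import org.apache.sling.api.scripting.SlingScript;

    // servletResolver and ChangeMethodRequestWrapper come from the
    // surrounding class and repository linked above.
    String getAggregatorScript(SlingHttpServletRequest r) throws IOException {
        String result = null;
        // Resolve as if the request used the SLING-CONTENT method
        Servlet s = servletResolver.resolveServlet(
            new ChangeMethodRequestWrapper(r, "SLING-CONTENT"));
        if (s instanceof SlingScript) {
            InputStream is = ((SlingScript) s).getScriptResource()
                .adaptTo(InputStream.class);
            if (is != null) {
                // Read the script text instead of executing it
                result = IOUtils.toString(is, "UTF-8");
            }
        }
        return result;
    }

adaptTo() Bonus Points!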
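
The method-changing wrapper idea can be sketched without the servlet API. The `Request` interface and the `scriptPath` helper below are simplified stand-ins invented for this example; the actual `ChangeMethodRequestWrapper` lives in the repository linked above and is not reproduced here.

```java
/** Hypothetical sketch of a method-changing request wrapper; not the code from the linked repository. */
final class MethodWrapperSketch {

    /** Minimal stand-in for an HTTP request. */
    interface Request {
        String getMethod();
        String getPath();
    }

    static final class SimpleRequest implements Request {
        private final String method;
        private final String path;

        SimpleRequest(String method, String path) {
            this.method = method;
            this.path = path;
        }

        @Override public String getMethod() { return method; }
        @Override public String getPath() { return path; }
    }

    /** Delegates everything to the wrapped request except getMethod(). */
    static final class ChangeMethodWrapper implements Request {
        private final Request wrapped;
        private final String method;

        ChangeMethodWrapper(Request wrapped, String method) {
            this.wrapped = wrapped;
            this.method = method;
        }

        @Override public String getMethod() { return method; }
        @Override public String getPath() { return wrapped.getPath(); }
    }

    /** Stand-in for script resolution: scripts are named after the request method. */
    static String scriptPath(String appPath, Request request) {
        return appPath + "/" + request.getMethod() + ".js";
    }

    public static void main(String[] args) {
        Request get = new SimpleRequest("GET", "/content/page");
        Request agg = new ChangeMethodWrapper(get, "AGG");
        System.out.println(scriptPath("/apps/myapp", get));
        System.out.println(scriptPath("/apps/myapp", agg));
    }
}
```

The point of the trick is that an existing method-based resolution mechanism can be reused to look up a new kind of script, simply by asking it about a verb that no client would send.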

22. Experiment: Content Aggregation with Sling Query

    var $ = Packages.org.apache.sling.query.SlingQuery.$
    var SearchStrategy = Packages.org.apache.sling.query.api.SearchStrategy
    var resourceResolver = resource.getResourceResolver()
    var result = {
        siblings : $(resource).siblings(),
        rootChildren : $(resource).parents().last().children(),
        queryResult : $(resourceResolver)
            .searchStrategy(SearchStrategy.QUERY)
            .find("nt:base[title=foo]")
    }

  • Used in a BindingsValuesProvider? Or in a custom JSON renderer servlet which runs this script.
  • Inherently sandboxed due to custom language.

https://sling.apache.org/documentation/bundles/sling-query.html

23. Experiment: Dynamic scripts/servlet search path

    if (dynamicServletResolver.canResolve(resource)) {
        servlet = dynamicServletResolver.resolveServlet(request);
    } else {
        … existing resolver code
    }

  • A fairly simple change to the SlingServletResolver. It should evolve into a real extension point if desired, and probably get the request as well.
  • Tested in SLING-4386, another multitenant experiment which provides tenant-specific scripts but no real isolation.
  • Currently requires disabling the servlet resolution cache.
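
The "dynamic resolver first, existing code otherwise" pattern can be sketched in isolation. The `Resolver` interface, the two implementations and the paths below are hypothetical simplifications; the real change operates on Sling resources and requests inside the SlingServletResolver.

```java
/** Hypothetical sketch of a dynamic resolver consulted before the existing one. */
final class DynamicResolverSketch {

    interface Resolver {
        boolean canResolve(String resourcePath);
        String resolveServlet(String resourcePath);
    }

    /** Resolves scripts from a tenant-specific search path. */
    static final class TenantResolver implements Resolver {
        private final String contentRoot;
        private final String scriptRoot;

        TenantResolver(String contentRoot, String scriptRoot) {
            this.contentRoot = contentRoot;
            this.scriptRoot = scriptRoot;
        }

        @Override
        public boolean canResolve(String resourcePath) {
            return resourcePath.startsWith(contentRoot + "/");
        }

        @Override
        public String resolveServlet(String resourcePath) {
            return scriptRoot + "/GET.js";
        }
    }

    /** Stand-in for the existing resolver code with its fixed search path. */
    static final class DefaultResolver implements Resolver {
        @Override
        public boolean canResolve(String resourcePath) {
            return true;
        }

        @Override
        public String resolveServlet(String resourcePath) {
            return "/apps/default/GET.js";
        }
    }

    static String resolve(Resolver dynamic, Resolver existing, String resourcePath) {
        if (dynamic.canResolve(resourcePath)) {
            return dynamic.resolveServlet(resourcePath);
        }
        return existing.resolveServlet(resourcePath);
    }

    public static void main(String[] args) {
        Resolver dynamic = new TenantResolver("/content/tenantA", "/apps/tenantA");
        Resolver existing = new DefaultResolver();
        System.out.println(resolve(dynamic, existing, "/content/tenantA/page"));
        System.out.println(resolve(dynamic, existing, "/content/other/page"));
    }
}
```

In this sketch the result depends on more than the resource type, so a cache keyed on resource type alone could return the wrong script. That is one plausible reason a resolution cache would get in the way of a dynamic search path.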

24. Experiment: Nashorn (JavaScript) sandboxing (Java Delight)

    NashornSandbox {
        allow(final Class<?> clazz);
        injectGlobalVariable(String variableName, Object object);
        setMaxCPUTime(long limitMsec);
        Object eval(final String javaScriptCode);
        allowPrintFunctions(boolean v);
        allowReadFunctions(boolean v);
        ...more allow functions
        // $ARG, $ENV, $EXEC...
        allowGlobalsObjects(final boolean v);
    }

  • Uses Nashorn’s ClassFilter to block Java classes.
  • Sandboxing rewrites standard methods + user code -> blacklisting, not ideal.

https://github.com/javadelight/delight-nashorn-sandbox (Java Delight Suite)
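
Limiting how long untrusted code may run, in the spirit of `setMaxCPUTime`, can be approximated with standard Java concurrency utilities. The sketch below is not how the Java Delight sandbox works; the `TimeLimitSketch` class is hypothetical, and it limits wall-clock time rather than CPU time.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: run a task with a wall-clock time limit. */
final class TimeLimitSketch {

    /** Returns the task's result, or empty if it failed or exceeded the limit. */
    static <T> Optional<T> runWithLimit(Callable<T> task, long limitMsec) {
        // Daemon thread, so a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "sandboxed-task");
            thread.setDaemon(true);
            return thread;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(limitMsec, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            // cancel(true) only interrupts the thread; it cannot force it to stop.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithLimit(() -> "fast", 2000).isPresent());
        System.out.println(runWithLimit(() -> {
            Thread.sleep(10_000);
            return "slow";
        }, 100).isPresent());
    }
}
```

Interruption is cooperative in Java: a busy loop that never checks for it keeps consuming CPU after the caller has given up, which hints at why a sandbox may need to rewrite user code to enforce such limits.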

25. CODA: where to now?

26. CODA

Thank you for attending! I’m Bertrand Delacretaz (@bdelacretaz)

  • In-memory nature of Sling is an important differentiator, in good and bad ways!
  • Hybrid Rendering Farm promising - do you need it?
  • Sandboxing is difficult, whitelisting much preferred, custom languages?
  • Reusable experiments?
