Recipes For the Perfect PI - Simple Ingredients for Complex Requirements
Sascha Wenninger - Australia Post
Bottom-Up Approach
Photo by Gidzy
You’ll Hear About:
  • Basis
  • Adapter Framework
  • Interface Design
  • Questions
  • Key Points to Take Home
30,000 Foot View
[Diagram labels: SAP POS DM, JMS Queue, SAP PI (Routing, Mapping), SAP ERP, SAP SCEM, …; XML messages between systems]
Some Figures…
Retail transactions from 8,000+ terminals in 3,400 stores:
  • 500,000 – 750,000 messages per day
  • Peaks of 45/second into PI, then split by receiver
  • 6 receiving systems, 1-3 receivers per message
  • Message size <10kB
  • Roll-out: January – June 2011
Objectives
  • Near-real time: <15 minutes end to end
  • Scalable to 2,000,000 messages/day and 14 receiver systems
  • Future peaks of 90/second into PI
  • No impact on other critical PI interfaces, e.g. the Parcel Tracking interface moving 2 million+ events per day
  • As simple as possible
  • Future-proof (no ccBPM, no ABAP mapping, etc.)
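To put the daily volumes and the per-second peaks on the same scale, the figures above can be converted into average rates. This is only a rough sketch: it assumes the load is spread evenly over 24 hours, which retail traffic is not, so the real peak-to-average gap is likely larger. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: int) -> float:
    """Average messages per second, assuming an even spread over 24 hours."""
    return messages_per_day / SECONDS_PER_DAY


# Figures taken from the slides.
current_low = average_rate(500_000)
current_high = average_rate(750_000)
future = average_rate(2_000_000)

print(f"current average: {current_low:.1f} to {current_high:.1f} msg/s (stated peak: 45)")
print(f"future average:  {future:.1f} msg/s (stated peak: 90)")
```

Even under this flattering assumption the stated peaks are several times the average rate, which is why the sizing later in the deck is done against peaks rather than daily totals.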
The Solution?
  • Many small steps enabled us to get there.
Key Topic: Basis
Basis
  • System Memory
  • Database Stuff
  • Java Server Nodes
  • The JVM
  • Java Heap Space
  • Wily Introscope
System Memory
  • One of the main foundations of the system
  • Needs to accommodate all aspects of the system: Java + ABAP + ICM + DBMS + OS + caching/buffers + other apps
  • Avoid swapping to disk at all times: it causes huge GC delays!
The Database
  • DB Statistics
    • Without DB stats, our throughput dropped to 1/6th of normal!
    • Run daily in production, but be careful in non-prod systems
  • DB Reorgs
    • PI DB tables churn very frequently and fragment after a few weeks
    • Test your system/DBMS before and after a reorg to evaluate its value
Java Server Nodes
  • More Java Server Nodes (Java processes) provide:
    • Better system memory utilisation
    • Increased throughput via parallelisation
  • …but also incur costs in cluster synchronisation. This can be significant for high volumes of EOIO interfaces with more than a few Java Server Nodes.
The JVM
  • PI 7.1 and later use SAP’s own JVM
  • SAP JVM 4.1 for PI 7.0 coming soon (Notes 1495160, 1522198)
  • Bug fixes and performance improvements delivered continually
  • Several JVM upgrades over the past 12 months significantly reduced OOM crashes and garbage collection times for our systems. This enables the use of larger heap spaces!
  • Patching via JSPM takes 30 minutes and 1 restart, plus some testing
Java Heap Space
  • More heap = process larger messages, fewer full GCs, fewer OOMs
  • SAP’s recommendation has historically been 2GB per Java process; it recently changed to 4GB (for SAP JVM)
  • AusPost has been using 4GB for 9 months now
  • Main criterion: full GCs should take less than 10 seconds
Java Memory Analysis
  • This is in SolMan!
Wily Introscope
  • Monitoring for the Java Stack
  • Basic version is free with NetWeaver Java
  • Provides functionality normally accessed via ABAP transactions
Wily Dashboards
More Dashboards
And More…
Tons of Dashboards!
  • …and this is only the free version!
Our PI System
  • PI 7.11 SP5
  • JVM 5.1 Patch level 59 (64 in test systems)
  • 3 Java server nodes, 4GB heap space each
  • AIX 6.1 on IBM POWER6, 60GB RAM
Basis
  • System Memory
  • Database Stuff
  • Java Server Nodes
  • The JVM
  • Java Heap Space
  • Wily Introscope
Many aspects which require special PI-Basis skills!
Key Topic: Adapter Framework
Adapter Framework
  • Thread Behaviour
  • Messaging System
Threads
  • Messages are processed using a number of adapter-specific queues
  • Each queue has its own thread pool, e.g. for JMS: [Diagram: JMS queues and thread pools]
Threads
  • Threads are assigned to process messages from their respective pools
  • Use Wily Introscope to monitor utilisation and backlogs
Threads
  • In ‘traditional’ scenarios, both Sender and Receiver queues and thread pools are used
  • For Integrated Scenarios in the AAE, only the Sender-side queues and thread pools are used!
  • Threads from the Sender pools do all the work!
Caveat
  • Some Adapters process messages serially!
  • e.g. JMS: 1 Adapter Thread per Communication Channel reads messages from the JMS queue
[Diagram: 1 JMS Comm Channel, 3 Java Server Nodes]
Threads in the AAE
Adapter Thread picks up a message from the JMS queue, then:
  • Executes Adapter Modules
  • Performs Receiver Determination
  • Persists 1 message per Receiver in Messaging System queue
  • Confirms message and removes from the JMS queue
The longer this takes, the lower throughput will be.
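The serial sequence described above can be pictured as a single loop. This is a simplified sketch, not PI's actual implementation: the receiver names, the path conditions, the in-memory queue, and the list standing in for the Messaging System are all hypothetical. It only illustrates that one thread does every step for a message before it can pick up the next one.

```python
import queue
import xml.etree.ElementTree as ET

# Hypothetical receiver conditions: receiver name -> ElementTree path expression.
# A receiver gets a copy when the expression matches at least one element.
RECEIVER_CONDITIONS = {
    "ERP": ".//Sale",
    "SCEM": ".//Parcel",
}


def run_modules(payload: str) -> str:
    """Placeholder for adapter modules; each one adds time per message."""
    return payload


def determine_receivers(payload: str) -> list:
    """Evaluate every receiver condition against the parsed message."""
    root = ET.fromstring(payload)
    return [name for name, path in RECEIVER_CONDITIONS.items()
            if root.find(path) is not None]


def process_serially(jms_queue: queue.Queue, messaging_system: list) -> int:
    """One adapter thread: take a message, do all the work, then confirm it."""
    processed = 0
    while True:
        try:
            payload = jms_queue.get_nowait()
        except queue.Empty:
            return processed
        payload = run_modules(payload)
        for receiver in determine_receivers(payload):
            messaging_system.append((receiver, payload))  # one copy per receiver
        jms_queue.task_done()  # confirm only after every copy is stored
        processed += 1


inbound = queue.Queue()
sale = "<Transaction><Sale/></Transaction>"
sale_and_parcel = "<Transaction><Sale/><Parcel/></Transaction>"
inbound.put(sale)
inbound.put(sale_and_parcel)

persisted = []
count = process_serially(inbound, persisted)
print(count, [receiver for receiver, _ in persisted])
```

Anything added inside the loop body (a module, a more expensive condition, an extra receiver copy) lengthens the time per message and so lowers what a single thread can deliver.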
What does this mean?
Sender-Side Performance is Really Important
Implications
  • Throughput depends on Sender Adapter performance
  • Try to avoid adapter modules in Sender adapters. We achieved a 15% improvement by moving an XSLT mapping from the JMS adapter to the Operation Mapping.
Implications
  • Throughput depends on Receiver Determination
  • The initial “Adapter” thread evaluates conditions and copies the message for each Receiver
    • Optimise XPath conditions
    • DB performance: more Receivers = more I/O
Implications
  • Throughput depends on parallelisation
  • Increase the Sender-side Thread Pools and Application Threads. We use 15 JMS Sender Threads and 350 Application Threads (also see Note 937159).
2 Options
  • In our system, each thread can process 10-15 messages/sec
  • In order to consistently process 45 messages/sec from JMS, we need 4-5 concurrent JMS Sender Adapter threads
    • Option 1: 5 Java server nodes
    • Option 2: Multiple JMS Communication Channels
Multiple Comm. Channels
  • We cloned the Integrated Config Object and Sender Comm Channel
  • 2 JMS Sender Adapter Threads per Java Server node
    • Better thread pool utilisation
    • No increased memory requirements
    • More scalable
  • This is not needed for SOAP since it’s truly parallel.
Image Source: http://t.co/ZOBdlH0
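The sizing behind the two options can be worked through as simple arithmetic. This is a sketch under one assumption read from the slides rather than stated as a formula there: each JMS Communication Channel contributes one adapter thread per Java server node. The function names are illustrative.

```python
import math


def threads_needed(target_rate: float, per_thread_rate: float) -> int:
    """Concurrent adapter threads required to sustain target_rate (messages/sec)."""
    return math.ceil(target_rate / per_thread_rate)


def channels_needed(threads: int, server_nodes: int) -> int:
    """Channels required, assuming one adapter thread per channel per server node."""
    return math.ceil(threads / server_nodes)


SERVER_NODES = 3  # from the "Our PI System" slide

# Pessimistic (10 msg/s) and optimistic (15 msg/s) per-thread rates from the slides.
for per_thread in (10, 15):
    threads = threads_needed(45, per_thread)
    channels = channels_needed(threads, SERVER_NODES)
    print(f"{per_thread} msg/s per thread: {threads} threads, {channels} channel(s)")

# Future objective of 90 msg/s at the pessimistic per-thread rate.
future_threads = threads_needed(90, 10)
future_channels = channels_needed(future_threads, SERVER_NODES)
print(f"90 msg/s: {future_threads} threads, {future_channels} channels")
```

At the pessimistic rate the result is 5 threads, the upper end of the 4-5 quoted on the slide, and two channels on three nodes give 6 adapter threads without adding server nodes. Scaling to the future peak then becomes a matter of cloning another channel.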
Adapter Framework
  • Thread Behaviour
  • Messaging System
Increased importance for Integrated Configuration scenarios!
Key Topic: Interface Design
Integrated Configuration
  • Receiver Determination + Interface Determination + Receiver Agreements = Integrated Configuration Object
  • Causes an interface to be executed in the Advanced Adapter Engine
Integrated Configuration
  • Message processing entirely in Java (AAE)
  • Introduced in PI 7.1; more features in 7.11 and 7.3
  • Improved performance by:
    • Reducing database I/O
    • Eliminating ‘stack jumping’ between ABAP and Java
  • 7-10 times throughput possible
  • Not available for all adapters and scenarios
Example
Example
Constraints
There are a few things you can’t do with Integrated Configuration:
  • IDocs*
  • Multi-Mappings*
  • ccBPMs
But this is an Opportunity to be Creative!
* features added in PI 7.3
Mapping Programs
  • XSLT
    • Easier to debug and tune
    • Better performance from PI 7.1 onwards
    • Easier to support
  • Consolidate mapping steps
    • Same performance, but less GC!
    • Reuse still possible via XSLT imports
Communication Channels
  • ABAP Proxies/Enterprise Services: use SOAP Adapter in XI 3.0 mode
  • JMS Senders: clone comm. channels for greater parallelisation
  • Non-SAP Systems: ensure inbound queuing is available to avoid backlog in PI
Design-Time Governance
  • Don’t Overdo It!
  • Find a balance which facilitates reuse but doesn’t impose too much up-front work.
Image Source: http://geekandpoke.typepad.com

Lessons Learnt Implementing High-Performance Integration using SAP PI

Key Points to Take Home
  • Wily Introscope
    • Essential for understanding performance of the Java stack
  • Use Integrated Configuration Objects
    • 20 minutes to migrate an existing interface
  • Keep your JVM Current
    • Continuous improvements without much fuss
  • Most of the other points can be quick wins too if you don’t do them all at once.
Further Reading
  • PI Performance Check
  • SOA Middleware articles on SDN, incl. the Performance section