Java EE 7 Batch Processing in the Real World Roberto Cortez & Ivan St. Ivanov JavaOne 2014 #CON2818
Who are we? Roberto Cortez @radcortez h/p://www.radcortez.com Freelancer, Speaker, RebelLabs Author, Blogger Ivan St. Ivanov @ivan_stefanov h/p://nosoGskills.com Architect, SAP Labs Bulgaria, JBoss Forge contributor
QuesHons? As soon as you have them!
What is Batch? “A group of records processed as a single unit, usually without input from a user.” (from Google)
Why Batch? (Computers) • Make use of idle resources • ShiT the Hme of processing • Manage large repeated work easily • Shared by mulHple users
Batch is in our lives • By paying bills • On the grocery • In traffic • Cooking food
Why Batch? (General) • Efficiency • Focus on a single problem • Avoid context switch • Reduce costs
State of the Art • Spring Batch • IBM Websphere • Hadoop • Java SE, Scheduler, in-­‐house frameworks
The JSR-­‐352 • Batch ApplicaHons for the Java pla]orm • Heavily inspired by Spring Batch • Available since Java EE 7 • Also designed for Java SE
The JSR-­‐352 Features • Task and Chunk oriented processing • SequenHal and/or Parallel execuHon • CheckpoinHng • Workflow • Stop and restart • ExcepHon handling
Batch Domain Language Job Operator Job Step Job Repository Item Reader Item Processor Item Writer 1 *
Job composiYon * Job Step * JobInstance * * * JobExecuYon StepExecuYon
Java EE Batch API
Batchlet DefiniYon @Named public class MyBatchlet extends AbstractBatchlet { @Override public String process() { System.out.println("Running inside a batchlet"); return BatchStatus.COMPLETED.toString(); } }
Job SpecificaYon Language <job id="myFirstBatch"> <step id="myStep" > <batchlet ref="myBatchlet"/> </step> </job>
Start the Job JobOperator jobOperator = BatchRunYme.getJobOperator(); Long execuYonId = jobOperator.start("myJob", new ProperYes()); JobExecuYon jobExecuYon = jobOperator.getJobExecuYon(execuYonId);
Batchlet • Task oriented Batch Step • Suitable for • Short execuHons • Handle system resources • IniHalizaHon and Cleanup
Chunk • ETL style (Reader, Processor, Writer) • Suitable for • Handle large amounts of data • Failover safety (checkpoint) • Long running applicaHons
Chunk Processing Step ItemReader ItemProcessor ItemWriter read() read() process(item) process(item) write(items) item item item item execute() ExitStatus
Defining Chunk Step <step id="myChunk"> <chunk checkpoint-­‐policy="item" item-­‐count="3"> <reader ref=“myItemReader"/> <processor ref=“myItemProcessor"/> <writer ref=“myItemWriter"/> </chunk> </step>
Item Reader @Named public class MyItemReader extends AbstractItemReader { public Object readItem() throws ExcepYon {…} }
Item Processor @Named public class MyItemProcessor implements ItemProcessor { public Object processItem(Object item) throws ExcepYon {…} }
Item Writer @Named public class MyItemWriter extends AbstractItemWriter { public void writeItems(List<Object> items) throws ExcepYon {…} }
ExcepYon Handling • Job ExecuHon Fails on ExcepHon • Possible to Skip or Retry ExcepHons • Applied to the Chunk • Include or Exclude ExcepHons
ExcepYon Handling DefiniYon <chunk checkpoint-­‐policy="item” item-­‐count="3" skip-­‐limit="3" retry-­‐limit="3"> <reader ref="myItemReader"/> <processor ref="myItemProcessor"/> <writer ref="myItemWriter"/> <skippable-­‐excepYon-­‐classes> <include class="java.lang.RunYmeExcepYon"/> </skippable-­‐excepYon-­‐classes> <retryable-­‐excepYon-­‐classes> <exclude class="java.lang.IllegalArgumentExcepYon"/> </retryable-­‐excepYon-­‐classes> </chunk>
ParYYon • Steps run on mulHple Threads • One ParHHon per Thread • Define ParHHon Plan • CheckpoinHng is independent per Thread
ParYYon DefiniYon (In Step) <parYYon> <plan parYYons="2"> <properYes parYYon="0"> <property name="start" value="1"/> </properYes> <properYes parYYon="1"> <property name="start" value="11"/> </properYes> </plan> </parYYon>
ParYYon DefiniYon (In Step) <chunk item-­‐count="3"> <reader ref="myItemReader"> <properYes> <property name="start” value="#{parYYonPlan['start']}" /> </properYes> </reader> <processor ref="myItemProcessor"/> <writer ref="myItemWriter"/> </chunk>
Flow • Sequence of ExecuHon elements • Execute together as a unit • Flows, Steps, Decision and Split • Used to aggregate logical business.
Split • Concurrent ExecuHon of Flows • Each Flow run on a separate Thread • Is not the same as a ParHHon • Can only use Flows
Split and Flow DefiniYon <split id="mySplit"> <flow id="flow1"> <step id="myChunk" next="myBatchlet”>…</step> <step id="myBatchlet">...</step> </flow> <flow id="flow2"> <step id=”otherChunk" next=”otherBatchlet”>…</step> <step id=”otherBatchlet">...</step> </flow> </split>
ParYYon vs Split • ParHHon is used at the data level • Split is used at the business level
Decision • Decide the next TransiHon • Applied to Steps, Flows and Splits • It’s almost like an if / else if • Requires an implementaHon
Decision DefiniYon <step id=”myStep" next="myDecider"> <batchlet ref="myBatchlet”/> </step> <decision id="myDecider" ref="MyDecider"> <next on="foo" to="myFooStep"/> <next on="bar" to="myBarStep"/> </decision>
Decision @Named public class MyDecider implements Decider { public String decide(StepExecuYon[] execuYons) throws ExcepYon {…} }
TransiYons • TransiHon Elements define the Workflow • Applied to Steps, Flows and Decision • Next, Fail, End, Stop • Fail, End and Stop terminate a Job
Batch Exit Status • A Job and Steps has an Exit Status • Used to control the Workflow • STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED
Schedule a Job (By @Schedule) @Singleton public class BatchJobRunner { @Schedule(dayOfWeek = "Sun") public void scheduleJob() { JobOperator jobOperator = BatchRunYme.getJobOperator(); jobOperator.start("myJob", new ProperYes()); } }
Schedule a Job (By ManagedScheduledExecutorService) @Resource(lookup="java:comp/DefaultManagedScheduledExecutorService") private ManagedScheduledExecutorService executor; public void scheduleJob() { JobOperator jobOperator = BatchRunYme.getJobOperator(); executor.schedule( () -­‐> jobOperator.start("myJob", new ProperYes()), 7, TimeUnit.DAYS); }
ImplementaYons • JBatch IBM (Glassfish, JEUS) • JBeret (Wildfly) • Spring Batch
A Few Cons • Lacks standard Readers / Writers • No Generics • ParHHon only supports a single Step • No Sync mode
Demo Time
WoW AucYon House • World of WacraT is MMORPG • More than 500 servers in US and EU • Each server has 2 AucHon Houses • Trades around 70k items / server / hour
WoW AucYon House • Data available to download • Let’s process the data • Extract metrics • Share the knowledge
Live Code
Resources • JSR-­‐352 SpecificaHon hips://jcp.org/en/jsr/detail?id=352 • Java EE Samples hips://github.com/javaee-­‐samples • Wow AucHon House hips://github.com/radcortez/wow-­‐aucHons
Thank you for A/ending! Roberto Cortez @radcortez h/p://www.radcortez.com Ivan St. Ivanov @ivan_stefanov h/p://nosoGskills.com

Java EE 7 Batch processing in the Real World

  • 1.
    Java EE 7 Batch Processing in the Real World Roberto Cortez & Ivan St. Ivanov JavaOne 2014 #CON2818
  • 2.
    Who are we? Roberto Cortez @radcortez h/p://www.radcortez.com Freelancer, Speaker, RebelLabs Author, Blogger Ivan St. Ivanov @ivan_stefanov h/p://nosoGskills.com Architect, SAP Labs Bulgaria, JBoss Forge contributor
  • 3.
    QuesHons? As soon as you have them!
  • 4.
    What is Batch? “A group of records processed as a single unit, usually without input from a user.” (from Google)
  • 5.
    Why Batch? (Computers) • Make use of idle resources • ShiT the Hme of processing • Manage large repeated work easily • Shared by mulHple users
  • 6.
    Batch is in our lives • By paying bills • On the grocery • In traffic • Cooking food
  • 7.
    Why Batch? (General) • Efficiency • Focus on a single problem • Avoid context switch • Reduce costs
  • 8.
    State of the Art • Spring Batch • IBM Websphere • Hadoop • Java SE, Scheduler, in-­‐house frameworks
  • 9.
    The JSR-­‐352 •Batch ApplicaHons for the Java pla]orm • Heavily inspired by Spring Batch • Available since Java EE 7 • Also designed for Java SE
  • 10.
    The JSR-­‐352 Features • Task and Chunk oriented processing • SequenHal and/or Parallel execuHon • CheckpoinHng • Workflow • Stop and restart • ExcepHon handling
  • 11.
    Batch Domain Language Job Operator Job Step Job Repository Item Reader Item Processor Item Writer 1 *
  • 12.
    Job composiYon * Job Step * JobInstance * * * JobExecuYon StepExecuYon
  • 13.
  • 14.
    Batchlet DefiniYon @Named public class MyBatchlet extends AbstractBatchlet { @Override public String process() { System.out.println("Running inside a batchlet"); return BatchStatus.COMPLETED.toString(); } }
  • 15.
    Job SpecificaYon Language <job id="myFirstBatch"> <step id="myStep" > <batchlet ref="myBatchlet"/> </step> </job>
  • 16.
    Start the Job JobOperator jobOperator = BatchRunYme.getJobOperator(); Long execuYonId = jobOperator.start("myJob", new ProperYes()); JobExecuYon jobExecuYon = jobOperator.getJobExecuYon(execuYonId);
  • 17.
    Batchlet • Task oriented Batch Step • Suitable for • Short execuHons • Handle system resources • IniHalizaHon and Cleanup
  • 18.
    Chunk • ETL style (Reader, Processor, Writer) • Suitable for • Handle large amounts of data • Failover safety (checkpoint) • Long running applicaHons
  • 19.
    Chunk Processing Step ItemReader ItemProcessor ItemWriter read() read() process(item) process(item) write(items) item item item item execute() ExitStatus
  • 20.
    Defining Chunk Step <step id="myChunk"> <chunk checkpoint-­‐policy="item" item-­‐count="3"> <reader ref=“myItemReader"/> <processor ref=“myItemProcessor"/> <writer ref=“myItemWriter"/> </chunk> </step>
  • 21.
    Item Reader @Named public class MyItemReader extends AbstractItemReader { public Object readItem() throws ExcepYon {…} }
  • 22.
    Item Processor @Named public class MyItemProcessor implements ItemProcessor { public Object processItem(Object item) throws ExcepYon {…} }
  • 23.
    Item Writer @Named public class MyItemWriter extends AbstractItemWriter { public void writeItems(List<Object> items) throws ExcepYon {…} }
  • 24.
    ExcepYon Handling •Job ExecuHon Fails on ExcepHon • Possible to Skip or Retry ExcepHons • Applied to the Chunk • Include or Exclude ExcepHons
  • 25.
    ExcepYon Handling DefiniYon <chunk checkpoint-­‐policy="item” item-­‐count="3" skip-­‐limit="3" retry-­‐limit="3"> <reader ref="myItemReader"/> <processor ref="myItemProcessor"/> <writer ref="myItemWriter"/> <skippable-­‐excepYon-­‐classes> <include class="java.lang.RunYmeExcepYon"/> </skippable-­‐excepYon-­‐classes> <retryable-­‐excepYon-­‐classes> <exclude class="java.lang.IllegalArgumentExcepYon"/> </retryable-­‐excepYon-­‐classes> </chunk>
  • 26.
    ParYYon • Steps run on mulHple Threads • One ParHHon per Thread • Define ParHHon Plan • CheckpoinHng is independent per Thread
  • 27.
    ParYYon DefiniYon (In Step) <parYYon> <plan parYYons="2"> <properYes parYYon="0"> <property name="start" value="1"/> </properYes> <properYes parYYon="1"> <property name="start" value="11"/> </properYes> </plan> </parYYon>
  • 28.
    ParYYon DefiniYon (In Step) <chunk item-­‐count="3"> <reader ref="myItemReader"> <properYes> <property name="start” value="#{parYYonPlan['start']}" /> </properYes> </reader> <processor ref="myItemProcessor"/> <writer ref="myItemWriter"/> </chunk>
  • 29.
    Flow • Sequence of ExecuHon elements • Execute together as a unit • Flows, Steps, Decision and Split • Used to aggregate logical business.
  • 30.
    Split • Concurrent ExecuHon of Flows • Each Flow run on a separate Thread • Is not the same as a ParHHon • Can only use Flows
  • 31.
    Split and Flow DefiniYon <split id="mySplit"> <flow id="flow1"> <step id="myChunk" next="myBatchlet”>…</step> <step id="myBatchlet">...</step> </flow> <flow id="flow2"> <step id=”otherChunk" next=”otherBatchlet”>…</step> <step id=”otherBatchlet">...</step> </flow> </split>
  • 32.
    ParYYon vs Split • ParHHon is used at the data level • Split is used at the business level
  • 33.
    Decision • Decide the next TransiHon • Applied to Steps, Flows and Splits • It’s almost like an if / else if • Requires an implementaHon
  • 34.
    Decision DefiniYon <step id=”myStep" next="myDecider"> <batchlet ref="myBatchlet”/> </step> <decision id="myDecider" ref="MyDecider"> <next on="foo" to="myFooStep"/> <next on="bar" to="myBarStep"/> </decision>
  • 35.
    Decision @Named public class MyDecider implements Decider { public String decide(StepExecuYon[] execuYons) throws ExcepYon {…} }
  • 36.
    TransiYons • TransiHon Elements define the Workflow • Applied to Steps, Flows and Decision • Next, Fail, End, Stop • Fail, End and Stop terminate a Job
  • 37.
    Batch Exit Status • A Job and Steps has an Exit Status • Used to control the Workflow • STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED
  • 38.
    Schedule a Job (By @Schedule) @Singleton public class BatchJobRunner { @Schedule(dayOfWeek = "Sun") public void scheduleJob() { JobOperator jobOperator = BatchRunYme.getJobOperator(); jobOperator.start("myJob", new ProperYes()); } }
  • 39.
    Schedule a Job (By ManagedScheduledExecutorService) @Resource(lookup="java:comp/DefaultManagedScheduledExecutorService") private ManagedScheduledExecutorService executor; public void scheduleJob() { JobOperator jobOperator = BatchRunYme.getJobOperator(); executor.schedule( () -­‐> jobOperator.start("myJob", new ProperYes()), 7, TimeUnit.DAYS); }
  • 40.
    ImplementaYons • JBatch IBM (Glassfish, JEUS) • JBeret (Wildfly) • Spring Batch
  • 41.
    A Few Cons • Lacks standard Readers / Writers • No Generics • ParHHon only supports a single Step • No Sync mode
  • 42.
  • 43.
    WoW AucYon House • World of WacraT is MMORPG • More than 500 servers in US and EU • Each server has 2 AucHon Houses • Trades around 70k items / server / hour
  • 44.
    WoW AucYon House • Data available to download • Let’s process the data • Extract metrics • Share the knowledge
  • 45.
  • 46.
    Resources • JSR-­‐352 SpecificaHon hips://jcp.org/en/jsr/detail?id=352 • Java EE Samples hips://github.com/javaee-­‐samples • Wow AucHon House hips://github.com/radcortez/wow-­‐aucHons
  • 47.
    Thank you for A/ending! Roberto Cortez @radcortez h/p://www.radcortez.com Ivan St. Ivanov @ivan_stefanov h/p://nosoGskills.com