SQL Pattern Recognition (boldly go beyond analytical) from 12.1 Lucas Jellema speaks at conferences and user group events, writes blog articles (on the AMIS Technology Blog) and has published two books with Oracle Press (on Oracle SOA Suite). His interests range from client side UI and JavaScript through integration middleware to Database development and platform design. Creative usages of SQL and PL/SQL are among his favorite pastimes. In his day time job, Lucas is CTO and architecture consultant at AMIS in The Netherlands and is affiliated with the Dutch Oracle User Group (OGh). Lucas Jellema @lucasjellema Oracle ACE Director
Patterns • Special, significant sequences of data
Pattern Matching • Discover special patterns in [potentially pretty big] data sets • Crucial requirement: records have to be ordered in some way – Frequently by time [of observation] – Could be by the location along some axis
Presenting: Modern Art • “Who’s afraid of red, yellow and blue” – Barnett Newman, Stedelijk Museum, Amsterdam
Find art in the car park • Find if we have cars parked according to the red-yellow-blue pattern of the painting Parking Space Car color
Analytical Functions • Solution with Lag or Lead • With LEAD it is easy to compare a row with its successor(s) – As long as the pattern is fixed, LEAD will suffice with look_ahead_cars as ( SELECT c.* -- for each car, find next and next after that , lead(car_color,1) over (order by parking_space) next_color , lead(car_color,2) over (order by parking_space) sec_nxt_color FROM parked_cars c ) select parking_space from look_ahead_cars where car_color ='red' –- for each red car and next_color ='yellow' -- check if next is yellow and sec_nxt_color='blue' –- and the one after that is blue
Match Recognize • New operator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching – Pretty fast and very versatile
Match Recognize • New operator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED YELLOW BLUE) –- the pattern to locate DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue‘–- match on blue car ) MR ORDER BY MR.red_space , MR.parking_space
Match Recognize • New operator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED YELLOW BLUE) –- the pattern to locate DEFINE –- row conditions to be used in pattern RED AS RED.car_color = 'red', –- identify row with red car YELLOW AS YELLOW.car_color = 'yellow', –- locate row with yellow car BLUE AS BLUE.car_color = 'blue' –- record with blue car ) MR ORDER BY MR.red_space , MR.parking_space
Up the ante – a little • Suppose we also want to find blocks of cars – For example: red-red-red-yellow-yellow-blue-blue • And we accept white cars interspersed between the colored ones – So red-red-white-yellow-white-yellow-blue also satisfies the pattern • Lag/Lead solution quickly becomes unwieldy
Extending the pattern match_recognize solution SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+) DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue'–- match on blue car WHITE AS WHITE.car_color ='white'–- match on white car ) MR ORDER BY MR.red_space , MR.parking_space
Extending the pattern match_recognize solution SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+) DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue'–- match on blue car WHITE AS WHITE.car_color ='white'–- match on white car ) MR ORDER BY MR.red_space , MR.parking_space
Use a regular expression to describe sought after pattern • Supported operators for the pattern clause include: – * for 0 or more iterations – + for 1 or more iterations – ? for 0 or 1 iterations – { n } for exactly n iterations (n > 0) – { n, } for n or more iterations (n >= 0) – { n, m } for between n and m (inclusive) iterations (0 <= n <= m, 0 < m) – { , m } for between 0 and m (inclusive) iterations (m > 0) – reluctant qualifiers - *?, +?, ??, {n}?, {n,}?, { n, m }?, {,m}? – | for alternation (OR) – grouping using () parentheses – exclusion using {- and -} – empty pattern using () – ^ and $ for start and end of a partition PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+)
Elements of Match_Recognize • FIRST, LAST, NEXT, PREV • MATCH_NUMBER() • CLASSIFIER() • COUNT, SUM, AVG, MAX, MIN • FINAL or RUNNING • PER MATCH – ALL ROWS or ONE ROW • AFTER MATCH – SKIP TO LAST, TO NEXT, FIRST, PAST LAST ROW
Did we ever hire three employees in a row in the same job? SELECT * FROM EMP MATCH_RECOGNIZE ( ORDER BY hiredate MEASURES SAME_JOB.hiredate AS hireday , MATCH_NUMBER() AS match_num ALL ROWS PER MATCH PATTERN (SAME_JOB{3}) DEFINE SAME_JOB AS SAME_JOB.job = FIRST(SAME_JOB.job) ) MR
Did we ever hire three employees in a row in the same job? SELECT * FROM EMP MATCH_RECOGNIZE ( ORDER BY hiredate MEASURES SAME_JOB.hiredate AS hireday , MATCH_NUMBER() AS match_num ALL ROWS PER MATCH PATTERN (SAME_JOB{3}) DEFINE SAME_JOB AS SAME_JOB.job = FIRST(SAME_JOB.job) ) MR
Conclusion • Cool stuff • Very fast • Nifty tool for the SQL toolbox • Useful for analysis of database activity • Takes us beyond Analytical Functions for advanced record interdependencies

Oracle Database 12c - Introducing SQL Pattern Recognition through MATCH_RECOGNIZE (Oracle OpenWorld 2016)

  • 1.
    SQL Pattern Recognition (boldlygo beyond analytical) from 12.1 Lucas Jellema speaks at conferences and user group events, writes blog articles (on the AMIS Technology Blog) and has published two books with Oracle Press (on Oracle SOA Suite). His interests range from client side UI and JavaScript through integration middleware to Database development and platform design. Creative usages of SQL and PL/SQL are among his favorite pastimes. In his day time job, Lucas is CTO and architecture consultant at AMIS in The Netherlands and is affiliated with the Dutch Oracle User Group (OGh). Lucas Jellema @lucasjellema Oracle ACE Director
  • 2.
  • 3.
    Pattern Matching • Discoverspecial patterns in [potentially pretty big] data sets • Crucial requirement: records have to be ordered in some way – Frequently by time [of observation] – Could be by the location along some axis
  • 4.
    Presenting: Modern Art •“Who’s afraid of red, yellow and blue” – Barnett Newman, Stedelijk Museum, Amsterdam
  • 5.
    Find art inthe car park • Find if we have cars parked according to the red-yellow-blue pattern of the painting Parking Space Car color
  • 6.
    Analytical Functions • Solutionwith Lag or Lead • With LEAD it is easy to compare a row with its successor(s) – As long as the pattern is fixed, LEAD will suffice with look_ahead_cars as ( SELECT c.* -- for each car, find next and next after that , lead(car_color,1) over (order by parking_space) next_color , lead(car_color,2) over (order by parking_space) sec_nxt_color FROM parked_cars c ) select parking_space from look_ahead_cars where car_color ='red' –- for each red car and next_color ='yellow' -- check if next is yellow and sec_nxt_color='blue' –- and the one after that is blue
  • 7.
    Match Recognize • Newoperator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching – Pretty fast and very versatile
  • 8.
    Match Recognize • Newoperator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED YELLOW BLUE) –- the pattern to locate DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue‘–- match on blue car ) MR ORDER BY MR.red_space , MR.parking_space
  • 9.
    Match Recognize • Newoperator in 12c: MATCH_RECOGNIZE – Specifically provided for pattern matching SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED YELLOW BLUE) –- the pattern to locate DEFINE –- row conditions to be used in pattern RED AS RED.car_color = 'red', –- identify row with red car YELLOW AS YELLOW.car_color = 'yellow', –- locate row with yellow car BLUE AS BLUE.car_color = 'blue' –- record with blue car ) MR ORDER BY MR.red_space , MR.parking_space
  • 10.
    Up the ante– a little • Suppose we also want to find blocks of cars – For example: red-red-red-yellow-yellow-blue-blue • And we accept white cars interspersed between the colored ones – So red-red-white-yellow-white-yellow-blue also satisfies the pattern • Lag/Lead solution quickly becomes unwieldy
  • 11.
    Extending the pattern match_recognizesolution SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+) DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue'–- match on blue car WHITE AS WHITE.car_color ='white'–- match on white car ) MR ORDER BY MR.red_space , MR.parking_space
  • 12.
    Extending the pattern match_recognizesolution SELECT * -- produces columns from parked_cars and from match_recognize FROM parked_cars -- record set to find pattern in MATCH_RECOGNIZE ( ORDER BY parking_space -- specify ordering of records for pattern MEASURES RED.parking_space AS red_space -- results to be produced , MATCH_NUMBER() AS match_num -- umptieth match ALL ROWS PER MATCH –- all records in pattern or only the first PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+) DEFINE –- row conditions to be used in pattern RED AS RED.car_color ='red', –- match on row with red car YELLOW AS YELLOW.car_color ='yellow', –- match on yellow car BLUE AS BLUE.car_color ='blue'–- match on blue car WHITE AS WHITE.car_color ='white'–- match on white car ) MR ORDER BY MR.red_space , MR.parking_space
  • 13.
    Use a regularexpression to describe sought after pattern • Supported operators for the pattern clause include: – * for 0 or more iterations – + for 1 or more iterations – ? for 0 or 1 iterations – { n } for exactly n iterations (n > 0) – { n, } for n or more iterations (n >= 0) – { n, m } for between n and m (inclusive) iterations (0 <= n <= m, 0 < m) – { , m } for between 0 and m (inclusive) iterations (m > 0) – reluctant qualifiers - *?, +?, ??, {n}?, {n,}?, { n, m }?, {,m}? – | for alternation (OR) – grouping using () parentheses – exclusion using {- and -} – empty pattern using () – ^ and $ for start and end of a partition PATTERN (RED+ WHITE* YELLOW+ WHITE* BLUE+)
  • 14.
    Elements of Match_Recognize •FIRST, LAST, NEXT, PREV • MATCH_NUMBER() • CLASSIFIER() • COUNT, SUM, AVG, MAX, MIN • FINAL or RUNNING • PER MATCH – ALL ROWS or ONE ROW • AFTER MATCH – SKIP TO LAST, TO NEXT, FIRST, PAST LAST ROW
  • 15.
    Did we everhire three employees in a row in the same job? SELECT * FROM EMP MATCH_RECOGNIZE ( ORDER BY hiredate MEASURES SAME_JOB.hiredate AS hireday , MATCH_NUMBER() AS match_num ALL ROWS PER MATCH PATTERN (SAME_JOB{3}) DEFINE SAME_JOB AS SAME_JOB.job = FIRST(SAME_JOB.job) ) MR
  • 16.
    Did we everhire three employees in a row in the same job? SELECT * FROM EMP MATCH_RECOGNIZE ( ORDER BY hiredate MEASURES SAME_JOB.hiredate AS hireday , MATCH_NUMBER() AS match_num ALL ROWS PER MATCH PATTERN (SAME_JOB{3}) DEFINE SAME_JOB AS SAME_JOB.job = FIRST(SAME_JOB.job) ) MR
  • 17.
    Conclusion • Cool stuff •Very fast • Nifty tool for the SQL toolbox • Useful for analysis of database activity • Takes us beyond Analytical Functions for advanced record interdependencies