SQLXML x Xquery using DB2 Viper amolpujari@gmail.com
RSS Generator • Uploading messages from news server to the Native XML database (DB2 Viper) • Use the uploaded data to generate RSS documents • Two approaches – Using relational database, SQLXML to construct RSS document – Using XML native data type , Xquery to construct RSS document
Relational Database Design Table1: msg1 Table2: msg_detail id News server News group grpid msgid subject author link rssdate description 0 news.persistent.co.in comp.lang.c 1 news.persistent.co.in comp.lang.java 145 news.software.ibm.com ibm.software.unicode around 4, 50 000 messages around 150 news groups Clustered index int data type Regular index varchar data type
XML Database Design Table1: msg Item (xml) xml index <msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link> <author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item> </msg> <msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><d1nl7v$4lug$5@news.boulder.ibm.com></link> <author>Nadine <Nadine.grantham@gmail.com></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item> </msg> around 4, 50 000 xml records
SQLXML x XQuery (1) (SQLXML) : Query to generate RSS document SQLXML_01.sql complex with v1 as (select msgid,subject, author, link, description, rssdate from msg1, msg_detail where msg1.id=msg_detail.grpid big one and description like '%%' and subject like '%%' and author like '%%' big where clause and newsserver='news.persistent.co.in' and newsgroup = 'comp.lang.c' more time to construct order by msgid fetch first 15 rows only) select XMLserialize( more xml function calls XMLELEMENT(NAME "rss", XMLATTRIBUTES ( '2.0' AS "version" ), XMLELEMENT(NAME "channel", includes join XMLELEMENT(NAME "title" , 'news.persistent.co.in:comp.lang.c'), XMLELEMENT(NAME "link" , 'http://news.persistent.co'), XMLELEMENT(NAME "description" , 'The latest content from news.persistent.co.in:comp.lang.c'), XMLELEMENT(NAME "copyright" , 'Persistent'), XMLELEMENT(NAME "language" , 'en_us'), XMLELEMENT(NAME "lastBuildDate" , 'Tue, 18 Apr 2006 11:41:52 +0530'), XMLAGG( XMLELEMENT(NAME "item", XMLELEMENT( NAME "title", subject), XMLELEMENT( NAME "author", XMLELEMENT( NAME "name", author) ), XMLELEMENT( NAME "link", link), XMLELEMENT( NAME "pubDate", rssDate), XMLELEMENT( NAME "description", description) ) ORDER BY msgid ) ) execution time: 1172 (7123) ms ) as varchar(8192)) as "doc" from v1
SQLXML x XQuery (1) Equivalent (XQuery) : Query to generate RSS document XQUERY_01.sql simple xquery for $a in ( 1 to 1 ) return small one <rss version="2.0"> <channel> <title>news.persistent.co.in:comp.lang.c</title> less time to construct <link>http://news.persistent.co.in</link> <description>The latest content from news.persistent.co.in:comp.lang.c</description> <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate> {let $e:=( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="news.persistent.co.in"][@newsgroup="comp.lang.c"] where $b/item[fn:contains(title,"")] and $b/item[fn:contains(author,"")] and $b/item[fn:contains(description,"")] order by fn:number($b/@id) descending return $b ) for $i in ( 1 to 15) return $e[$i]/item } </channel> execution time: 62034 ms </rss>
SQLXML x XQuery (2) (SQLXML) Query : list of messages where the subject contains ‘primary’ SQLXML_02.sql complex SELECT XML2CLOB( XMLELEMENT(NAME "msg", XMLATTRIBUTES(newsserver AS "NewsServer",newsgroup AS "NewsGroup",msgid AS "ID"), big one XMLELEMENT(NAME "item", XMLELEMENT(NAME "title", subject), XMLELEMENT(NAME "link", link), includes join XMLELEMENT(NAME "author", author), XMLELEMENT(NAME "PubDate", rssdate), more time to construct XMLELEMENT(NAME "description", description) ) ) more xml function calls ) FROM msg1, msg_detail WHERE subject like '%primary%' and msg1.id = msg_detail.grpid execution time: 4438(3421) ms Equivalent Xquery XQUERY_02.sql very simple xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/title,"primary") small one return $a less time to construct execution time: 176714(185368) ms
SQLXML x XQuery (3) (SQLXML) Query : list last 5 messages sent by an author to the news group SQLXML_03.sql complex WITH v2 as ( SELECT * big one FROM msg1, msg_detail WHERE author like '%Shridhar%' AND msg1.id=msg_detail.grpid includes join ORDER BY msgid DESC FETCH FIRST 5 ROWS ONLY ) SELECT XML2CLOB( more time to construct XMLELEMENT(NAME root , XMLAGG( XMLELEMENT(NAME "msg", more xml function calls XMLATTRIBUTES(newsserver AS "NewsServer",newsgroup AS "NewsGroup",msgid AS "ID"), XMLELEMENT(NAME "item", XMLELEMENT(NAME "title", subject), execution time: 13598(4687) ms XMLELEMENT(NAME "link", link), XMLELEMENT(NAME "author", author), XMLELEMENT(NAME "PubDate",rssdate), XMLELEMENT(NAME "description", description) ) ) ) ) ) FROM v2 Equivalent Xquery very simple XQUERY_03.sql xquery small one let $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($b/item/author,"Shridhar") return $b let $c := count($a) less time to construct let $d := $c - 5 return $a [position() > $d] execution time: 197292(177634) ms
SQLXML x XQuery (4) (SQLXML) Query : list of authors and the number of messages they have sent to the group SQLXML_04.sql complex WITH v2(author,messagecount) AS ( SELECT author,count(msgid) big one FROM msg1, msg_detail WHERE msg1.id = msg_detail.grpid GROUP BY author ) includes join SELECT XML2CLOB ( XMLELEMENT(NAME "msg", XMLAGG( more time to construct XMLELEMENT(NAME "authordetails", XMLATTRIBUTES(author AS "authorname",messagecount AS "messagecount") ) more xml function calls ) ) ) execution time: 1500(1530) ms FROM v2 Equivalent Xquery XQUERY_04.sql very simple xquery let $a := db2-fn:xmlcolumn('MSG.ITEM')/msg/item/author let $b := distinct-values($a) small one ( for $e in ($b) let $d := count(for $c in db2-fn:xmlcolumn('MSG.ITEM')/msg/item where $c/author = $e less time to construct return $c ) return <result> <author>{$e}</author> This query involves nested ‘for’ loops to count the number of <messagecount>{$d}</messagecount> messages sent by a particular author. So results in higher </result> execution time. This takes more than 20 minutes to execute
Summary As compared to Relational database design, there are certain cases where Native XML database (DB2 Viper) is very useful, like: – Simple Database Design – Simple yet powerful querying ability – Easy to maintain evolving schema (in case of relational database this could lead to creation of new tables and will need reconstruction of queries, indices etc..) But one grey area about this Native XML is that it lacks performance when dealing with huge amount of data

Sqlxml vs xquery

  • 1.
    SQLXML x Xquery using DB2 Viper amolpujari@gmail.com
  • 2.
    RSS Generator • Uploading messages from news server to the Native XML database (DB2 Viper) • Use the uploaded data to generate RSS documents • Two approaches – Using relational database, SQLXML to construct RSS document – Using XML native data type , Xquery to construct RSS document
  • 3.
    Relational Database Design Table1:msg1 Table2: msg_detail id News server News group grpid msgid subject author link rssdate description 0 news.persistent.co.in comp.lang.c 1 news.persistent.co.in comp.lang.java 145 news.software.ibm.com ibm.software.unicode around 4, 50 000 messages around 150 news groups Clustered index int data type Regular index varchar data type
  • 4.
    XML Database Design Table1: msg Item (xml) xml index <msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link> <author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item> </msg> <msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><d1nl7v$4lug$5@news.boulder.ibm.com></link> <author>Nadine <Nadine.grantham@gmail.com></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item> </msg> around 4, 50 000 xml records
  • 5.
    SQLXML x XQuery (1) (SQLXML) : Query to generate RSS document SQLXML_01.sql complex with v1 as (select msgid,subject, author, link, description, rssdate from msg1, msg_detail where msg1.id=msg_detail.grpid big one and description like '%%' and subject like '%%' and author like '%%' big where clause and newsserver='news.persistent.co.in' and newsgroup = 'comp.lang.c' more time to construct order by msgid fetch first 15 rows only) select XMLserialize( more xml function calls XMLELEMENT(NAME "rss", XMLATTRIBUTES ( '2.0' AS "version" ), XMLELEMENT(NAME "channel", includes join XMLELEMENT(NAME "title" , 'news.persistent.co.in:comp.lang.c'), XMLELEMENT(NAME "link" , 'http://news.persistent.co'), XMLELEMENT(NAME "description" , 'The latest content from news.persistent.co.in:comp.lang.c'), XMLELEMENT(NAME "copyright" , 'Persistent'), XMLELEMENT(NAME "language" , 'en_us'), XMLELEMENT(NAME "lastBuildDate" , 'Tue, 18 Apr 2006 11:41:52 +0530'), XMLAGG( XMLELEMENT(NAME "item", XMLELEMENT( NAME "title", subject), XMLELEMENT( NAME "author", XMLELEMENT( NAME "name", author) ), XMLELEMENT( NAME "link", link), XMLELEMENT( NAME "pubDate", rssDate), XMLELEMENT( NAME "description", description) ) ORDER BY msgid ) ) execution time: 1172 (7123) ms ) as varchar(8192)) as "doc" from v1
  • 6.
    SQLXML x XQuery (1) Equivalent (XQuery) : Query to generate RSS document XQUERY_01.sql simple xquery for $a in ( 1 to 1 ) return small one <rss version="2.0"> <channel> <title>news.persistent.co.in:comp.lang.c</title> less time to construct <link>http://news.persistent.co.in</link> <description>The latest content from news.persistent.co.in:comp.lang.c</description> <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate> {let $e:=( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="news.persistent.co.in"][@newsgroup="comp.lang.c"] where $b/item[fn:contains(title,"")] and $b/item[fn:contains(author,"")] and $b/item[fn:contains(description,"")] order by fn:number($b/@id) descending return $b ) for $i in ( 1 to 15) return $e[$i]/item } </channel> execution time: 62034 ms </rss>
  • 7.
    SQLXML x XQuery (2) (SQLXML) Query : list of messages where the subject contains ‘primary’ SQLXML_02.sql complex SELECT XML2CLOB( XMLELEMENT(NAME "msg", XMLATTRIBUTES(newsserver AS "NewsServer",newsgroup AS "NewsGroup",msgid AS "ID"), big one XMLELEMENT(NAME "item", XMLELEMENT(NAME "title", subject), XMLELEMENT(NAME "link", link), includes join XMLELEMENT(NAME "author", author), XMLELEMENT(NAME "PubDate", rssdate), more time to construct XMLELEMENT(NAME "description", description) ) ) more xml function calls ) FROM msg1, msg_detail WHERE subject like '%primary%' and msg1.id = msg_detail.grpid execution time: 4438(3421) ms Equivalent Xquery XQUERY_02.sql very simple xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/title,"primary") small one return $a less time to construct execution time: 176714(185368) ms
  • 8.
    SQLXML x XQuery (3) (SQLXML) Query : list last 5 messages sent by an author to the news group SQLXML_03.sql complex WITH v2 as ( SELECT * big one FROM msg1, msg_detail WHERE author like '%Shridhar%' AND msg1.id=msg_detail.grpid includes join ORDER BY msgid DESC FETCH FIRST 5 ROWS ONLY ) SELECT XML2CLOB( more time to construct XMLELEMENT(NAME root , XMLAGG( XMLELEMENT(NAME "msg", more xml function calls XMLATTRIBUTES(newsserver AS "NewsServer",newsgroup AS "NewsGroup",msgid AS "ID"), XMLELEMENT(NAME "item", XMLELEMENT(NAME "title", subject), execution time: 13598(4687) ms XMLELEMENT(NAME "link", link), XMLELEMENT(NAME "author", author), XMLELEMENT(NAME "PubDate",rssdate), XMLELEMENT(NAME "description", description) ) ) ) ) ) FROM v2 Equivalent Xquery very simple XQUERY_03.sql xquery small one let $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($b/item/author,"Shridhar") return $b let $c := count($a) less time to construct let $d := $c - 5 return $a [position() > $d] execution time: 197292(177634) ms
  • 9.
    SQLXML x XQuery (4) (SQLXML) Query : list of authors and the number of messages they have sent to the group SQLXML_04.sql complex WITH v2(author,messagecount) AS ( SELECT author,count(msgid) big one FROM msg1, msg_detail WHERE msg1.id = msg_detail.grpid GROUP BY author ) includes join SELECT XML2CLOB ( XMLELEMENT(NAME "msg", XMLAGG( more time to construct XMLELEMENT(NAME "authordetails", XMLATTRIBUTES(author AS "authorname",messagecount AS "messagecount") ) more xml function calls ) ) ) execution time: 1500(1530) ms FROM v2 Equivalent Xquery XQUERY_04.sql very simple xquery let $a := db2-fn:xmlcolumn('MSG.ITEM')/msg/item/author let $b := distinct-values($a) small one ( for $e in ($b) let $d := count(for $c in db2-fn:xmlcolumn('MSG.ITEM')/msg/item where $c/author = $e less time to construct return $c ) return <result> <author>{$e}</author> This query involves nested ‘for’ loops to count the number of <messagecount>{$d}</messagecount> messages sent by a particular author. So results in higher </result> execution time. This takes more than 20 minutes to execute
  • 10.
    Summary As compared toRelational database design, there are certain cases where Native XML database (DB2 Viper) is very useful, like: – Simple Database Design – Simple yet powerful querying ability – Easy to maintain evolving schema (in case of relational database this could lead to creation of new tables and will need reconstruction of queries, indices etc..) But one grey area about this Native XML is that it lacks performance when dealing with huge amount of data