Basic XML Processing In Scala Neelkanth Sachdeva Consultant / Software Engineer Knoldus Software LLP , New Delhi neelkanthsachdeva.wordpress.com neelkanth@knoldus.com
What is XML ? → XML is a form of semi-structured data. → It is more structured than plain strings, because it organizes the contents of the data into a tree. → There are many forms of semi-structured data, but XML is the most widely used.
XML overview → XML is built out of two basic elements: 1. Text 2. Tags Text: As usual, any sequence of characters. Tags: Consist of a less-than sign, an alphanumeric label, and a greater-than sign.
Writing XML Tags ● There is a shorthand notation for a start tag followed immediately by its matching end tag. ● Simply write one tag with a slash after the tag’s label. Such a tag is an empty element, e.g. <pod>Three <peas/> in the </pod> ● Start tags can have attributes attached to them, e.g. <pod peas="3" strings="true"/>
XML literals Scala lets you type in XML as a literal anywhere that an expression is valid. Simply type a start tag and then continue writing XML content. The compiler will go into an XML-input mode and will read content as XML until it sees the end tag matching the start tag you began with.
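As a minimal sketch of an XML literal used as an ordinary expression, the example below reuses the pod element from the earlier slide. The object name `XmlLiteralSketch` is made up for illustration, and the sketch assumes Scala 2 with the scala-xml module on the classpath.

```scala
import scala.xml.Elem

object XmlLiteralSketch {
  // An XML literal is an expression; here its static type is scala.xml.Elem.
  val pod: Elem = <pod peas="3" strings="true">Three <peas/> in the pod</pod>

  def main(args: Array[String]): Unit = {
    println(pod.label) // the tag name of the root element
    println(pod)       // the element serialized back to XML text
  }
}
```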
Important XML Classes Class Node is the abstract superclass of all XML node classes. Class Text is a node holding just text. For example, the “Here” part of <a>Here</a> is of class Text. Class NodeSeq holds a sequence of nodes.
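The relationship between these classes can be checked with a short sketch. The object and value names are hypothetical, and the sketch assumes Scala 2 with the scala-xml module available.

```scala
import scala.xml.{Elem, Node, NodeSeq, Text}

object XmlClassesSketch {
  val elem: Elem = <a>Here</a>

  // The only child of <a>Here</a> is the text node holding "Here".
  val firstChild: Node = elem.child.head
  val isText: Boolean = firstChild.isInstanceOf[Text]

  // Node extends NodeSeq, so a single node can be used as a one-element sequence.
  val asSeq: NodeSeq = elem

  def main(args: Array[String]): Unit = {
    println(isText)
    println(asSeq.length)
  }
}
```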
Evaluating Scala Code
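The original example for this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch of the general mechanism: curly braces inside an XML literal escape back to Scala code, and the result of the expression is spliced into the tree. The names `EmbeddedCodeSketch`, `year`, and `<old>` are assumptions chosen for illustration.

```scala
import scala.xml.NodeSeq

object EmbeddedCodeSketch {
  // The expression in braces is evaluated and its result becomes the content of <a>.
  val greeting = <a>{ "hello" + ", world" }</a>

  // Hypothetical value used to decide which XML to produce.
  val year = 1955

  // A brace escape may itself contain XML literals; an empty NodeSeq adds nothing.
  val status = <a>{ if (year < 2000) <old>{ year }</old> else NodeSeq.Empty }</a>

  def main(args: Array[String]): Unit = {
    println(greeting)
    println(status)
  }
}
```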
Example of XML
Taking XML apart Extracting text : By calling the text method on any XML node you retrieve all of the text within that node, minus any element tags.
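A small sketch of the `text` method follows. The object name is hypothetical, and Scala 2 with scala-xml is assumed. Note that the whitespace on both sides of the removed tag is kept.

```scala
object ExtractTextSketch {
  // The <tag/> element is dropped; the text on both sides of it is kept.
  val plain: String = <a>Sounds <tag/> good</a>.text

  // Text from nested elements is concatenated in document order.
  val nested: String = <a><b>Hello</b><c> world</c></a>.text

  def main(args: Array[String]): Unit = {
    println(plain)
    println(nested)
  }
}
```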
Extracting sub-elements : If you want to find a sub-element by tag name, simply call \ with the name of the tag. You can do a “deep search” and look through sub-sub-elements, etc., by using \\ instead of the \ operator.
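The difference between the shallow search `\` and the deep search `\\` can be sketched as follows. The document and the names used are hypothetical, and Scala 2 with scala-xml is assumed.

```scala
object SubElementSketch {
  val doc = <a><b><c>hello</c></b></a>

  // \ looks only at direct children of the node.
  val direct = doc \ "b"

  // <c> is a grandchild, so a shallow search for it finds nothing.
  val missing = doc \ "c"

  // \\ searches all descendants, so it finds the nested <c>.
  val deep = doc \\ "c"

  def main(args: Array[String]): Unit = {
    println(direct)
    println(missing.isEmpty)
    println(deep)
  }
}
```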
Extracting attributes: You can extract tag attributes using the same \ and \\ methods. Simply put an at sign (@) before the attribute name:
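A minimal sketch of attribute extraction with `\` and `\\` and an `@`-prefixed name is shown below. The employee element and all names are made up for illustration, and Scala 2 with scala-xml is assumed.

```scala
object AttributeSketch {
  // Hypothetical record used only for this illustration.
  val joe = <employee name="Joe" rank="code monkey" serial="123"/>

  // "@name" selects the attribute called name on this element.
  val name: String = (joe \ "@name").text

  // Asking for an attribute that does not exist yields an empty sequence.
  val missing = joe \ "@salary"

  // \\ also searches the attributes of nested elements.
  val deepRank: String = (<staff>{ joe }</staff> \\ "@rank").text

  def main(args: Array[String]): Unit = {
    println(name)
    println(missing.isEmpty)
    println(deepRank)
  }
}
```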
Runtime Representation XML data is represented as labeled trees. You can conveniently create such labeled nodes using standard XML syntax. Consider the following XML document:
<html> <head> <title>Hello XHTML world</title> </head> <body> <h1>Hello world</h1> <p><a href="http://scala-lang.org/">Scala</a> talks XHTML</p> </body> </html> This document can be created by the following Scala program:
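object XMLTest1 extends App {
  // Application was removed from the standard library; App is its replacement.
  // The XML literal below is an ordinary expression of type scala.xml.Elem.
  val page =
    <html>
      <head>
        <title>Hello XHTML world</title>
      </head>
      <body>
        <h1>Hello world</h1>
        <p><a href="http://scala-lang.org/">Scala</a> talks XHTML</p>
      </body>
    </html>
  println(page.toString())
}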
It is possible to mix Scala expressions and XML:
object XMLTest2 extends App {
  import scala.xml._
  val df = java.text.DateFormat.getDateInstance()
  val dateString = df.format(new java.util.Date)
  // Expressions in braces are evaluated and spliced into the XML.
  def theDate(name: String) =
    <dateMsg addressedTo={ name }>
      Hello, { name }! Today is { dateString }
    </dateMsg>
  println(theDate("Neelkanth Sachdeva").toString)
}
Pattern matching on XML Sometimes the data contains several kinds of records. In such scenarios we can use pattern matching on XML to tell them apart.
object XMLTest3 {
  def search(node: scala.xml.Node): String = node match {
    // { contents } binds the single child node of the matched element.
    case <a>{ contents }</a> => "It's an a Category Item & The Item Is : " + contents
    case <b>{ contents }</b> => "It's a b Category Item & The Item Is : " + contents
    case _ => "It's something else."
  }
  def main(args: Array[String]): Unit = {
    println(search(<a>Apple</a>))
    println(search(<b>Mango</b>))
  }
}
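One detail worth noting about patterns such as `<a>{ contents }</a>`: the braces bind exactly one child node, so an element with several children, or none, falls through to the default case. A hedged sketch of the usual alternative, binding any sequence of children with `_*`, is given below. The object and method names are hypothetical, and Scala 2 with scala-xml is assumed.

```scala
import scala.xml.Node

object SequencePatternSketch {
  def describe(node: Node): String = node match {
    // contents @ _* binds all child nodes of <a>, however many there are.
    case <a>{ contents @ _* }</a> => "a with " + contents.length + " child node(s)"
    case _ => "something else"
  }

  val single: String = describe(<a>Apple</a>)
  val mixed: String = describe(<a>An <em>expression</em> pattern</a>)
  val empty: String = describe(<a/>)
  val other: String = describe(<b>Mango</b>)

  def main(args: Array[String]): Unit = {
    Seq(single, mixed, empty, other).foreach(println)
  }
}
```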
Cheers
