Applied XML Programming for Microsoft .NET PART 1
The .NET XML Parsing Model
1. XML is a natural element of all forms of programming life.
2. XML in the .NET Framework
The .NET Framework XML core classes can be categorized according to their functions:
1. reading and writing documents
2. validating documents
3. navigating and selecting nodes
4. managing schema information
5. performing document transformations
The assembly in which the .NET Framework XML classes are implemented is System.Xml.dll. The most commonly used namespaces are listed here:
1. System.Xml
2. System.Xml.Schema
3. System.Xml.XPath
4. System.Xml.Xsl
The .NET Framework also provides for XML object serialization. The classes involved with this functionality are grouped in the System.Xml.Serialization namespace. XML serialization writes objects to, and reads them from, XML documents. This kind of serialization is particularly useful over the Web in combination with the Simple Object Access Protocol (SOAP) and within the boundaries of .NET Framework XML Web services.
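As a rough illustration of what "writes objects to, and reads them from, XML documents" means, here is a hand-written round trip in Python. It is only a sketch: the Book class, its fields, and the element names are hypothetical, and the .NET serialization classes automate this mapping rather than requiring it to be coded by hand.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Book:
    # Hypothetical type used only for this illustration.
    id: int
    title: str


def to_xml(book: Book) -> str:
    """Serialize a Book to an XML string: id as an attribute, title as a child element."""
    root = ET.Element("Book", {"id": str(book.id)})
    ET.SubElement(root, "Title").text = book.title
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text: str) -> Book:
    """Rebuild a Book from the XML produced by to_xml."""
    root = ET.fromstring(xml_text)
    return Book(id=int(root.get("id")), title=root.findtext("Title"))


original = Book(id=7, title="XML")
restored = from_xml(to_xml(original))
print(restored == original)
```

The round trip restores an object equal to the original, which is the property a serializer has to guarantee.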
Areas of the .NET Framework in Which XML Is Key
Category: Description
ADO.NET: Data container objects (for example, the DataSet object) are always transferred and remoted via XML.
Configuration: Application settings are stored in XML files, making use of predefined and user-defined section readers.
Remoting: Remote .NET Framework objects can be accessed by using SOAP packets to prepare and perform the call.
Web services: SOAP is a lightweight XML protocol that Web services use for the exchange of information in a decentralized, distributed environment.
XML parsing: The core classes provide XML parsing and manipulation through both the stream-based API and the XML Document Object Model (XMLDOM).
XML serialization: Supplies the ability to save and restore live instances of objects to and from XML documents.
Classes for Parsing
The available XML parsers fall into one of two main categories:
1. tree-based parsers
2. event-based parsers
XML and ADO.NET
The interaction between ADO.NET classes and XML documents takes one of two forms:
1. Serialization of ADO.NET objects (in particular, the DataSet object) to XML documents and corresponding deserialization. Data can be saved to XML in a variety of formats, with or without schema information, as a full snapshot of the in-memory data including pending changes and errors, or with just the current instance of the data.
2. A dual-access model that lets you access and update the same piece of data either through a hierarchical programming interface or using the ADO.NET relational API. Basically, you can transform a DataSet object into an XMLDOM object and view the XMLDOM's subtrees as tables merged with the DataSet object's tables.
The .NET Framework XML API
The essence of XML in the .NET Framework is found in two abstract classes: XmlReader and XmlWriter. These classes are at the core of all other .NET Framework XML classes, including the XMLDOM classes, and are used extensively by various subsystems to parse or generate XML text. For example, ADO.NET data adapters retrieve the data to store in a DataSet object using a database reader, and the DataSet object serializes its contents to the DiffGram format using an XmlTextWriter object, which derives from XmlWriter.
The XML API for the .NET Framework comprises the following set of functionalities:
1. XML readers
2. XML writers
3. XML document classes
Streams can be read and written using made-to-measure reader and writer classes. The base classes are TextReader, TextWriter, BinaryReader, BinaryWriter, and Stream. With the exception of the binary classes, all of these classes are marked as abstract (MustInherit, if you speak Visual Basic) and cannot be directly instantiated in code. You can, however, use a variable of an abstract type to reference a live instance of a derived class. In the .NET Framework, the base reader and writer classes have a number of concrete implementations, including StreamReader and StringReader and their writing counterparts.
XML Readers
An XML reader exposes a programming interface through which callers can connect and pull out all the data they need. This is no different from what happens when you connect to a database and fetch data. The database server returns a reference to an internal object, the cursor, which manages all the query results and makes them available on demand. This holds even though the database world might provide several flavors of cursors: client, scrollable, server-side, and so on.
Readers vs. XMLDOM
XML readers don't require you to keep more data in memory than you actually need. When you open the XML document, a simple logical pointer that corresponds to a node is returned. You can easily skip over nodes to locate the one you need. In doing so, you don't tax the application's memory with any data beyond what is required to buffer the currently selected node.
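The idea of skipping over nodes and materializing only the one you need can be sketched with a pull parser. This example uses xml.dom.pulldom from Python's standard library as a stand-in for an XML reader; it is not the .NET API, and the sample document and the find_title helper are made up for illustration.

```python
from xml.dom import pulldom

# Hypothetical sample document.
SAMPLE = """<catalog>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
  <book id="3"><title>Third</title></book>
</catalog>"""


def find_title(xml_text, book_id):
    """Walk the document node by node and expand only the matching <book>."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            if node.getAttribute("id") == book_id:
                # Build an in-memory subtree for this one node only.
                events.expandNode(node)
                title = node.getElementsByTagName("title")[0]
                return title.firstChild.data
    return None  # no <book> with that id


print(find_title(SAMPLE, "2"))
```

Books that do not match are passed over without ever being expanded into a subtree, and parsing stops as soon as the wanted node is found.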
Readers vs. SAX
A SAX parser directly controls the evolution of the parsing process and pushes data to the client application. A cursor parser (that is, an XML reader), on the other hand, plays a more passive role and leaves client applications to control the process.
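The push/pull contrast can be made concrete with two small functions. This is a sketch in Python, using xml.sax for the push model and xml.dom.pulldom as a stand-in for a cursor-style reader; the sample document, ItemHandler, push_items, and pull_first_item are hypothetical names chosen for the example.

```python
import xml.sax
from xml.dom import pulldom

# Hypothetical sample document.
SAMPLE = "<order><item>pen</item><item>ink</item><item>pad</item></order>"


class ItemHandler(xml.sax.ContentHandler):
    """Push model: the parser calls these methods for every node it meets."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._text = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._text = []

    def characters(self, content):
        if self._in_item:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._text))
            self._in_item = False


def push_items(xml_text):
    """The SAX parser drives the loop; the handler only reacts."""
    handler = ItemHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.items


def pull_first_item(xml_text):
    """Pull model: the caller asks for the next node and may stop at any time."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            events.expandNode(node)
            return node.firstChild.data
    return None


print(push_items(SAMPLE))
print(pull_first_item(SAMPLE))
```

In the push version the handler has to carry state between callbacks, while the pull version keeps its state in ordinary local variables and simply returns when it has what it needs.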
XML Writers
The .NET XML API separates parsing from editing and writing and offers a set of methods that provides effective results for performance as well as usability. When writing, you create new XML documents working at a considerably high level of abstraction and explicitly indicate the XML elements to create: nodes, attributes, comments, or processing instructions. The writer works on a stream, dumping content incrementally, one node after the next, without the random access capabilities of the XMLDOM but also without its memory footprint.
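A forward-only writer of this kind can be sketched with XMLGenerator from Python's xml.sax.saxutils, used here as an analogue and not as the .NET writer itself. The write_catalog function and the catalog/book element names are assumptions made for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_catalog(out, titles):
    """Emit a small document node by node; nothing is kept in memory as a tree."""
    writer = XMLGenerator(out, encoding="utf-8")
    writer.startDocument()
    writer.startElement("catalog", {})
    for number, title in enumerate(titles, start=1):
        writer.startElement("book", {"id": str(number)})
        writer.characters(title)  # markup characters are escaped by the writer
        writer.endElement("book")
    writer.endElement("catalog")
    writer.endDocument()


buffer = io.StringIO()
write_catalog(buffer, ["XML & .NET", "Readers"])
print(buffer.getvalue())
```

The caller names each element and attribute explicitly and never handles angle brackets or escaping, but once a node has been written there is no way to go back and change it.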
The XML Document Object API in .NET
As mentioned, along with XML readers and writers, the .NET Framework also provides classes that load and edit XML documents according to the W3C DOM Level 1 and Level 2 Core. The key XMLDOM class in the .NET Framework is XmlDocument, which is not much different from the DOMDocument class that you might recognize from working with MSXML.
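Loading and editing a document through a W3C DOM interface looks roughly like the following. The sketch uses Python's xml.dom.minidom, which follows the same W3C DOM naming, as a stand-in for XmlDocument; the sample document and the add_book helper are hypothetical.

```python
from xml.dom import minidom

# Hypothetical sample document.
SAMPLE = "<catalog><book id='1'><title>First</title></book></catalog>"


def add_book(xml_text, book_id, title_text):
    """Load the whole document into a tree, then append a new <book> node."""
    doc = minidom.parseString(xml_text)
    book = doc.createElement("book")
    book.setAttribute("id", book_id)
    title = doc.createElement("title")
    title.appendChild(doc.createTextNode(title_text))
    book.appendChild(title)
    doc.documentElement.appendChild(book)
    return doc


doc = add_book(SAMPLE, "2", "Second")
titles = [t.firstChild.data for t in doc.getElementsByTagName("title")]
print(titles)
```

Unlike a reader or writer, the tree allows random access and in-place edits, at the cost of holding the entire document in memory.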
XPath Expressions and XSLT
In the .NET Framework, XSLT and XPath expressions are fully supported but are implemented in classes distinct from those that parse and write XML text. This is a key feature of the overall .NET XML API. Each area of functionality is provided through its own small hierarchy of objects, and each hierarchy connects and interoperates well with the others.
The XMLDOM API is built on top of readers and writers, but both XSLT and XPath expressions need a complete, XMLDOM-based view of the entire XML document to process it. XML readers and writers are the primitive elements of the .NET XML API. Whenever XML text must be parsed or written, all classes, directly or indirectly, refer to them. A more complex primitive element is the XMLDOM tree. Transformations and advanced queries must rely on the document in its entirety being held in memory and accessible through a well-known interface, the XMLDOM.
The XSLT Processor
The key class for XSLT is XslTransform. The class works as an XSLT processor and complies with version 1.0 of the XSLT recommendation. The class has two key methods, Load and Transform, whose behavior is for the most part self-explanatory.
The XPath Query Engine
XPath is a language that allows you to navigate within XML documents. Think of XPath as a general-purpose query language for addressing, sorting, and filtering both the elements and the text of an XML document.
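A small query shows what addressing and filtering with a path expression looks like. This sketch uses Python's xml.etree.ElementTree, which supports only a limited subset of XPath and is not the .NET query engine; the sample document and the titles_in_language helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE = """<catalog>
  <book id="1" lang="en"><title>First</title></book>
  <book id="2" lang="it"><title>Secondo</title></book>
  <book id="3" lang="en"><title>Third</title></book>
</catalog>"""


def titles_in_language(xml_text, lang):
    """Select the <title> of every <book> whose lang attribute matches."""
    root = ET.fromstring(xml_text)  # the whole document is loaded first
    path = f"./book[@lang='{lang}']/title"
    return [title.text for title in root.findall(path)]


print(titles_in_language(SAMPLE, "en"))
```

The path addresses nodes by their position in the hierarchy (book, then title), and the bracketed predicate filters them by attribute value.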
Further Reading
1. The W3C organization is currently working on a draft of the DOM Level 3 Core to include support for an abstract modeling schema and I/O serialization. Check out the most recent draft at http://www.w3.org/TR/2002/WD-DOM-Level3-ASLS-20020409. The approved standard, DOM Level 2 Core, is available at http://www.w3.org/TR/DOMLevel-
2. Relevant information about XML standards is available from the W3C Web site, at http://www.w3.org. If you want to learn more about the SAX specification, look at the new Web site for the SAX project, at http://www.saxproject.org.
