Swipe Python - XML Processing
The Extensible Markup Language (XML) is a markup language much like HTML or SGML. This is recommended by the World Wide Web Consortium and available as an open standard. XML is extremely useful for keeping track of small to medium amounts of data without requiring a SQL-based backbone. Python - XML Processing
The Python standard library provides a minimal but useful set of interfaces to work with XML. The two most basic and broadly used APIs to XML data are the SAX and DOM interfaces. Simple API for XML (SAX) Document Object Model (DOM) API XML Parser Architectures and APIs
Simple API for XML (SAX) Here, you register callbacks for events of interest and then let the parser proceed through the document. This is useful when your documents are large or you have memory limitations, it parses the file as it reads it from disk and the entire file is never stored in memory. Document Object Model (DOM) API This is a World Wide Web Consortium recommendation wherein the entire file is read into memory and stored in a hierarchical (tree-based) form to represent all the features of an XML document.
SAX obviously cannot process information as fast as DOM can when working with large files. On the other hand, using DOM exclusively can really kill your resources, especially if used on a lot of small files. SAX is read-only, while DOM allows changes to the XML file. Since these two different APIs literally complement each other, there is no reason why you cannot use them both for large projects.
Parsing XML with SAX APIs SAX is a standard interface for event-driven XML parsing. Parsing XML with SAX generally requires you to create your own ContentHandler by subclassing xml.sax.ContentHandler. Your ContentHandler handles the particular tags and attributes of your flavor(s) of XML. A ContentHandler object provides methods to handle various parsing events. Its owning parser calls ContentHandler methods as it parses the XML file.
The make_parser Method Following method creates a new parser object and returns it. The parser object created will be of the first parser type the system finds. xml.sax.make_parser( [parser_list] ) parser_list − The optional argument consisting of a list of parsers to use which must all implement the make_parser method. Here is the detail of the parameters −
The parse Method Following method creates a SAX parser and uses it to parse a document. xml.sax.parse( xmlfile, contenthandler[, errorhandler]) xmlfile − This is the name of the XML file to read from. contenthandler − This must be a ContentHandler object. errorhandler − If specified, errorhandler must be a SAX ErrorHandler object. Here is the detail of the parameters −
The parseString Method There is one more method to create a SAX parser and to parse the specified XML string. xml.sax.parseString(xmlstring, contenthandler[, errorhandler]) xmlstring − This is the name of the XML string to read from. contenthandler − This must be a ContentHandler object. errorhandler − If specified, errorhandler must be a SAX ErrorHandler object. Here is the detail of the parameters −
Python - Multithreaded Programming Python - GUI Programming (Tkinter) Stay Tuned with Topics for next Post

Python xml processing

  • 1.
  • 2.
    The Extensible MarkupLanguage (XML) is a markup language much like HTML or SGML. This is recommended by the World Wide Web Consortium and available as an open standard. XML is extremely useful for keeping track of small to medium amounts of data without requiring a SQL-based backbone. Python - XML Processing
  • 3.
    The Python standardlibrary provides a minimal but useful set of interfaces to work with XML. The two most basic and broadly used APIs to XML data are the SAX and DOM interfaces. Simple API for XML (SAX) Document Object Model (DOM) API XML Parser Architectures and APIs
  • 4.
    Simple API forXML (SAX) Here, you register callbacks for events of interest and then let the parser proceed through the document. This is useful when your documents are large or you have memory limitations, it parses the file as it reads it from disk and the entire file is never stored in memory. Document Object Model (DOM) API This is a World Wide Web Consortium recommendation wherein the entire file is read into memory and stored in a hierarchical (tree-based) form to represent all the features of an XML document.
  • 5.
    SAX obviously cannotprocess information as fast as DOM can when working with large files. On the other hand, using DOM exclusively can really kill your resources, especially if used on a lot of small files. SAX is read-only, while DOM allows changes to the XML file. Since these two different APIs literally complement each other, there is no reason why you cannot use them both for large projects.
  • 6.
    Parsing XML withSAX APIs SAX is a standard interface for event-driven XML parsing. Parsing XML with SAX generally requires you to create your own ContentHandler by subclassing xml.sax.ContentHandler. Your ContentHandler handles the particular tags and attributes of your flavor(s) of XML. A ContentHandler object provides methods to handle various parsing events. Its owning parser calls ContentHandler methods as it parses the XML file.
  • 7.
    The make_parser Method Followingmethod creates a new parser object and returns it. The parser object created will be of the first parser type the system finds. xml.sax.make_parser( [parser_list] ) parser_list − The optional argument consisting of a list of parsers to use which must all implement the make_parser method. Here is the detail of the parameters −
  • 8.
    The parse Method Followingmethod creates a SAX parser and uses it to parse a document. xml.sax.parse( xmlfile, contenthandler[, errorhandler]) xmlfile − This is the name of the XML file to read from. contenthandler − This must be a ContentHandler object. errorhandler − If specified, errorhandler must be a SAX ErrorHandler object. Here is the detail of the parameters −
  • 9.
    The parseString Method Thereis one more method to create a SAX parser and to parse the specified XML string. xml.sax.parseString(xmlstring, contenthandler[, errorhandler]) xmlstring − This is the name of the XML string to read from. contenthandler − This must be a ContentHandler object. errorhandler − If specified, errorhandler must be a SAX ErrorHandler object. Here is the detail of the parameters −
  • 10.
    Python - Multithreaded Programming Python- GUI Programming (Tkinter) Stay Tuned with Topics for next Post