Parse XML using Minidom in Python

xml.dom.minidom is a minimal implementation of the Document Object Model interface, with an API similar to that in other languages. It can be used to parse and manipulate XML documents in Python.

Here's a basic guide on how to parse XML using minidom:

  • Parsing XML from a String:
from xml.dom import minidom

xml_string = """
<bookstore>
    <book>
        <title lang="en">Harry Potter</title>
        <author>J.K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book>
        <title lang="en">Learning XML</title>
        <author>John Smith</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>
"""

# Parse the XML string
doc = minidom.parseString(xml_string)

# Access the 'book' elements
books = doc.getElementsByTagName("book")
for book in books:
    title = book.getElementsByTagName("title")[0].firstChild.data
    author = book.getElementsByTagName("author")[0].firstChild.data
    print(f"Title: {title}, Author: {author}")
  • Parsing XML from a File:
from xml.dom import minidom

# Parse the XML file
doc = minidom.parse("path_to_xml_file.xml")

# Access and print the 'title' and 'author' elements
books = doc.getElementsByTagName("book")
for book in books:
    title = book.getElementsByTagName("title")[0].firstChild.data
    author = book.getElementsByTagName("author")[0].firstChild.data
    print(f"Title: {title}, Author: {author}")
  • Accessing Attributes: Using the getAttribute method, you can retrieve the value of an attribute.
for book in books:
    title_element = book.getElementsByTagName("title")[0]
    language = title_element.getAttribute("lang")
    print(f"Language attribute of the title: {language}")
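  • Accessing Text: An element's text lives in a child text node. When the first child is that text node, firstChild.data returns its content (for an empty element, firstChild is None and this raises AttributeError):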
title = book.getElementsByTagName("title")[0].firstChild.data 
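  • Adding, Modifying, and Removing Elements and Attributes: minidom can also change the tree, though it takes more steps than parsing. New nodes are created through the document object (createElement, createTextNode) and attached with appendChild; setAttribute, removeAttribute, and removeChild change or delete existing attributes and elements.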
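The add, modify, and remove operations can be sketched as follows. This is a minimal illustration, not a complete editing workflow: the shortened XML string, the "category" attribute, and the new title text are made up for the example.

```python
from xml.dom import minidom

# Illustrative input (shortened from the bookstore example above).
doc = minidom.parseString(
    "<bookstore><book><title lang='en'>Learning XML</title>"
    "<price>39.95</price></book></bookstore>"
)
book = doc.getElementsByTagName("book")[0]

# Add: create an element and its text node, then attach them to the tree.
author = doc.createElement("author")
author.appendChild(doc.createTextNode("John Smith"))
book.appendChild(author)

# Modify: set an attribute (hypothetical "category") and replace element text.
book.setAttribute("category", "web")
title = book.getElementsByTagName("title")[0]
title.firstChild.data = "Learning XML, 2nd Edition"

# Remove: detach a child element and drop an attribute.
price = book.getElementsByTagName("price")[0]
book.removeChild(price)
title.removeAttribute("lang")

# Serialize the modified tree back to a string.
result = doc.documentElement.toxml()
print(result)
```

Nodes must be created through the document that will own them (doc.createElement, doc.createTextNode) before they are attached. The same document's toxml() or toprettyxml() method turns the result back into text.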

While minidom is suitable for smaller XML documents and simpler tasks, if you need more advanced XML parsing capabilities or are dealing with larger XML documents, consider using the xml.etree.ElementTree module or the third-party library lxml.
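For comparison, here is a rough sketch of the same title and author lookup written with xml.etree.ElementTree. The XML is a shortened copy of the bookstore example, and the rows list is only a helper for this illustration.

```python
import xml.etree.ElementTree as ET

xml_string = """<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <author>John Smith</author>
  </book>
</bookstore>"""

# fromstring returns the root element directly.
root = ET.fromstring(xml_string)

rows = []
for book in root.findall("book"):
    title = book.find("title")
    # .text holds the element text, .get reads an attribute,
    # and findtext fetches the text of a child in one call.
    rows.append((title.text, title.get("lang"), book.findtext("author")))

for text, lang, author in rows:
    print(f"Title: {text} ({lang}), Author: {author}")
```

ElementTree exposes text and attributes directly on each element, so there is no need to step through firstChild as with minidom.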

