
Python XML

Editor: Python

What is XML?

XML, the Extensible Markup Language (eXtensible Markup Language), is a subset of the Standard Generalized Markup Language (SGML). It is a markup language used to mark up electronic documents so that they have structure. You can learn more in this site's XML tutorial.

XML is designed to transmit and store data.

XML is a set of rules for defining semantic tags. These tags divide a document into parts and identify those parts.

It is also a meta-markup language: it is used to define other domain-specific, semantic, structured markup languages.


Parsing XML with Python

The common XML programming interfaces are DOM and SAX. The two handle XML documents in different ways, so they suit different use cases.

Python offers three ways to parse XML: SAX, DOM, and ElementTree:

1. SAX (Simple API for XML)

The Python standard library includes a SAX parser. SAX uses an event-driven model: as it parses the XML, it fires events one by one and calls user-defined callback functions to process the XML document.

2. DOM (Document Object Model)

The XML data is parsed into a tree in memory, and the XML is manipulated by operating on that tree.

This chapter uses the XML example file movies.xml, whose contents are as follows:

Example

<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year>2003</year>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Talk about a US-Japan war</description>
</movie>
<movie title="Transformers">
   <type>Anime, Science Fiction</type>
   <format>DVD</format>
   <year>1989</year>
   <rating>R</rating>
   <stars>8</stars>
   <description>A schientific fiction</description>
</movie>
<movie title="Trigun">
   <type>Anime, Action</type>
   <format>DVD</format>
   <episodes>4</episodes>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Vash the Stampede!</description>
</movie>
<movie title="Ishtar">
   <type>Comedy</type>
   <format>VHS</format>
   <rating>PG</rating>
   <stars>2</stars>
   <description>Viewable boredom</description>
</movie>
</collection>
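
ElementTree, the third approach listed above, can read the same structure in a few lines. The following is a minimal sketch rather than a definitive recipe. It assumes an inline excerpt of movies.xml (the name MOVIES_XML is made up for this illustration) so that it runs without the file on disk.

```python
import xml.etree.ElementTree as ET

# Inline excerpt of movies.xml (assumption: shortened for illustration).
MOVIES_XML = """\
<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year>2003</year>
</movie>
<movie title="Ishtar">
   <type>Comedy</type>
   <format>VHS</format>
</movie>
</collection>
"""

root = ET.fromstring(MOVIES_XML)
shelf = root.get("shelf")

# findtext() returns the default when the child element is missing,
# which matters here because not every movie has a <year>.
summary = [
    (movie.get("title"),
     movie.findtext("format"),
     movie.findtext("year", default="unknown"))
    for movie in root.findall("movie")
]

print("Shelf:", shelf)
for title, fmt, year in summary:
    print(title, fmt, year)
```

To read the real file instead, ET.parse("movies.xml").getroot() would take the place of ET.fromstring(...).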


Parsing XML with SAX in Python

SAX is an event-driven API.

Parsing an XML document with SAX involves two parts: the parser and the event handler.

The parser reads the XML document and sends events to the event handler, such as element-start and element-end events.

The event handler responds to those events and processes the XML data passed to it.

SAX is well suited to the following situations:

  • 1. Processing large files;
  • 2. When you need only part of the file, or only specific information from it;
  • 3. When you want to build your own object model.
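
The second situation, pulling out only specific information, can be sketched as follows. This is an illustrative example under stated assumptions: TitleCollector and MOVIES_XML are hypothetical names, and the XML is a shortened inline excerpt of movies.xml.

```python
import xml.sax

# Inline excerpt of movies.xml (assumption: shortened for illustration).
MOVIES_XML = b"""\
<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
</movie>
<movie title="Trigun">
   <type>Anime, Action</type>
   <format>DVD</format>
</movie>
</collection>
"""


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler that keeps only the title of each <movie>."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def startElement(self, name, attrs):
        # All other events are ignored, so no document tree is built.
        if name == "movie":
            self.titles.append(attrs.get("title"))


collector = TitleCollector()
xml.sax.parseString(MOVIES_XML, collector)
print(collector.titles)
```

Because the handler stores nothing but the titles, memory use stays small even when the input is large, which is also why SAX fits the first situation.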

To process XML with SAX in Python, first import the parse function from xml.sax and the ContentHandler class from xml.sax.handler.

ContentHandler class methods

characters(content) method

When it is called:

From the start of a line up to the first tag, if there are characters, content holds those characters.

From one tag up to the next tag, if there are characters, content holds those characters.

From a tag up to the line terminator, if there are characters, content holds those characters.

The tag can be either a start tag or an end tag. A parser may also deliver one run of text in several calls, so a handler should not assume that a single call carries the complete text.
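
Since characters() can fire more than once for the text of a single element, a common pattern is to collect the pieces and join them when the element ends. The sketch below illustrates that pattern. TextCollector is a hypothetical class, and the input string is invented for the example.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers the full text of each <description>."""

    def __init__(self):
        super().__init__()
        self._chunks = None       # None means "not inside <description>"
        self.descriptions = []

    def startElement(self, name, attrs):
        if name == "description":
            self._chunks = []

    def characters(self, content):
        # Append rather than assign: the text may arrive in several pieces.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "description" and self._chunks is not None:
            self.descriptions.append("".join(self._chunks))
            self._chunks = None


handler = TextCollector()
xml.sax.parseString(
    b"<movie><description>Rock &amp; roll\nstory</description></movie>",
    handler,
)
print(handler.descriptions)
```

Joining the pieces gives the same result however the parser happens to split the text around the entity reference or the line break.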

startDocument() method

Called when the document starts.

endDocument() method

Called when the parser reaches the end of the document.

startElement(name, attrs) method

Called when an XML start tag is encountered. name is the tag name, and attrs is a dictionary-like object holding the tag's attributes.

endElement(name) method

Called when an XML end tag is encountered.
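
The order in which these callbacks fire can be made visible with a handler that simply records each event. This is a small sketch, and EventLogger is a hypothetical name used only here.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records every callback it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        # attrs behaves like a read-only mapping of attribute names to values.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))


logger = EventLogger()
xml.sax.parseString(b'<movie title="Ishtar"><stars>2</stars></movie>', logger)
for event in logger.events:
    print(event)
```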


make_parser method

The following method creates a new parser object and returns it.

xml.sax.make_parser( [parser_list] )

Parameters:

  • parser_list - optional; a list of parsers
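
A parser obtained from make_parser() is configured first and then asked to parse. The sketch below assumes the default parser list and feeds the parser from an in-memory byte stream instead of a file. TagCounter is a hypothetical handler written for this example.

```python
import io
import xml.sax
import xml.sax.handler


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts how often each tag name starts."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


parser = xml.sax.make_parser()  # no parser_list: use the default parser
parser.setFeature(xml.sax.handler.feature_namespaces, False)

counter = TagCounter()
parser.setContentHandler(counter)

# parse() accepts a file name, a file-like object, or an InputSource.
parser.parse(io.BytesIO(b"<collection><movie/><movie/><movie/></collection>"))
print(counter.counts)
```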

parse method

The following method creates a SAX parser and parses an XML document:

xml.sax.parse( xmlfile, contenthandler[, errorhandler])

Parameters:

  • xmlfile - the name of the XML file
  • contenthandler - must be a ContentHandler object
  • errorhandler - if specified, must be a SAX ErrorHandler object
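
The optional errorhandler argument can be illustrated with a handler that remembers a fatal error before re-raising it. This is a sketch under stated assumptions: MovieCounter, RecordingErrorHandler and parse_text_as_file are hypothetical names, and the XML is written to a temporary file so that xml.sax.parse() can be called with a file name.

```python
import os
import tempfile
import xml.sax
import xml.sax.handler


class MovieCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts <movie> start tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "movie":
            self.count += 1


class RecordingErrorHandler(xml.sax.handler.ErrorHandler):
    """Hypothetical error handler: remembers a fatal error, then re-raises it."""

    def __init__(self):
        self.fatal = None

    def fatalError(self, exception):
        self.fatal = exception
        raise exception


def parse_text_as_file(xml_text):
    """Write xml_text to a temporary file and parse it by file name."""
    counter = MovieCounter()
    errors = RecordingErrorHandler()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "movies.xml")
        with open(path, "w", encoding="utf-8") as f:
            f.write(xml_text)
        try:
            xml.sax.parse(path, counter, errors)
        except xml.sax.SAXParseException:
            pass  # already recorded by the error handler
    return counter.count, errors.fatal


print(parse_text_as_file("<collection><movie/><movie/></collection>"))
print(parse_text_as_file("<collection><movie></collection>"))
```

The first call reports two movies and no error. The second input is not well formed, so the error handler receives a SAXParseException.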

parseString method

The parseString method creates an XML parser and parses an XML string:

xml.sax.parseString(xmlstring, contenthandler[, errorhandler])

Parameters:

  • xmlstring - the XML string
  • contenthandler - must be a ContentHandler object
  • errorhandler - if specified, must be a SAX ErrorHandler object
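
parseString() is convenient when the XML is already in memory, for example after it has been received from another program. The sketch below encodes a str to bytes before parsing and maps each movie title to its star count. StarsHandler and xml_text are hypothetical names chosen for the example.

```python
import xml.sax


class StarsHandler(xml.sax.ContentHandler):
    """Hypothetical handler mapping each movie title to its <stars> value."""

    def __init__(self):
        super().__init__()
        self.stars = {}
        self._title = None
        self._buffer = None   # list of text pieces while inside <stars>

    def startElement(self, name, attrs):
        if name == "movie":
            self._title = attrs.get("title")
        elif name == "stars":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "stars" and self._buffer is not None:
            self.stars[self._title] = int("".join(self._buffer))
            self._buffer = None


xml_text = (
    '<collection>'
    '<movie title="Ishtar"><stars>2</stars></movie>'
    '<movie title="Trigun"><stars>10</stars></movie>'
    '</collection>'
)
handler = StarsHandler()
xml.sax.parseString(xml_text.encode("utf-8"), handler)
print(handler.stars)
```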

Python XML parsing example

example

#!/usr/bin/python3

import xml.sax
import xml.sax.handler


class MovieHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.CurrentData = ""
        self.type = ""
        self.format = ""
        self.year = ""
        self.rating = ""
        self.stars = ""
        self.description = ""

    # Called when an element starts
    def startElement(self, tag, attributes):
        self.CurrentData = tag
        if tag == "movie":
            print("*****Movie*****")
            title = attributes["title"]
            print("Title:", title)

    # Called when an element ends
    def endElement(self, tag):
        if self.CurrentData == "type":
            print("Type:", self.type)
        elif self.CurrentData == "format":
            print("Format:", self.format)
        elif self.CurrentData == "year":
            print("Year:", self.year)
        elif self.CurrentData == "rating":
            print("Rating:", self.rating)
        elif self.CurrentData == "stars":
            print("Stars:", self.stars)
        elif self.CurrentData == "description":
            print("Description:", self.description)
        # Reset so that whitespace between elements is ignored
        self.CurrentData = ""

    # Called when character data is read
    def characters(self, content):
        if self.CurrentData == "type":
            self.type = content
        elif self.CurrentData == "format":
            self.format = content
        elif self.CurrentData == "year":
            self.year = content
        elif self.CurrentData == "rating":
            self.rating = content
        elif self.CurrentData == "stars":
            self.stars = content
        elif self.CurrentData == "description":
            self.description = content


if __name__ == "__main__":
    # Create an XMLReader
    parser = xml.sax.make_parser()
    # Turn off namespace processing
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)

    # Override the default ContentHandler
    handler = MovieHandler()
    parser.setContentHandler(handler)

    parser.parse("movies.xml")

Running the above code produces the following output:

*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Year: 2003
Rating: PG
Stars: 10
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Year: 1989
Rating: R
Stars: 8
Description: A schientific fiction
*****Movie*****
Title: Trigun
Type: Anime, Action
Format: DVD
Rating: PG
Stars: 10
Description: Vash the Stampede!
*****Movie*****
Title: Ishtar
Type: Comedy
Format: VHS
Rating: PG
Stars: 2
Description: Viewable boredom

For the complete SAX API, see the Python SAX APIs documentation.


Parsing XML with xml.dom

The Document Object Model (DOM) is the standard programming interface recommended by the W3C for processing Extensible Markup Language documents.

When a DOM parser parses an XML document, it reads the whole document at once and stores all of its elements in a tree structure in memory. You can then use the functions that DOM provides to read or modify the document's content and structure, and you can also write the modified content back to an XML file.
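
The read, modify and write-back cycle can be sketched with xml.dom.minidom. The example is illustrative only: it parses a one-movie inline string instead of movies.xml, and the changed values are invented for the demonstration.

```python
import xml.dom.minidom

dom = xml.dom.minidom.parseString(
    '<collection shelf="New Arrivals">'
    '<movie title="Ishtar"><stars>2</stars></movie>'
    '</collection>'
)
movie = dom.getElementsByTagName("movie")[0]

# Modify content: change the text node inside <stars>.
stars = movie.getElementsByTagName("stars")[0]
stars.firstChild.data = "3"

# Modify structure: append a new <format> element with a text child.
format_node = dom.createElement("format")
format_node.appendChild(dom.createTextNode("VHS"))
movie.appendChild(format_node)

# Serialise the modified tree back to XML text.
result = dom.documentElement.toxml()
print(result)
```

To store the result in a file, the text returned by toxml() can be written out with an ordinary open(..., "w") call, or the node's writexml() method can be given an open file object.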

In Python, xml.dom.minidom can be used to parse XML files. For example:

example

#!/usr/bin/python3

import xml.dom.minidom

# Open the XML document with the minidom parser
DOMTree = xml.dom.minidom.parse("movies.xml")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
    print("Root element : %s" % collection.getAttribute("shelf"))

# Get all the movies in the collection
movies = collection.getElementsByTagName("movie")

# Print the details of each movie
for movie in movies:
    print("*****Movie*****")
    if movie.hasAttribute("title"):
        print("Title: %s" % movie.getAttribute("title"))

    # Names end in _node to avoid shadowing the built-ins type() and format()
    type_node = movie.getElementsByTagName('type')[0]
    print("Type: %s" % type_node.childNodes[0].data)
    format_node = movie.getElementsByTagName('format')[0]
    print("Format: %s" % format_node.childNodes[0].data)
    rating_node = movie.getElementsByTagName('rating')[0]
    print("Rating: %s" % rating_node.childNodes[0].data)
    description_node = movie.getElementsByTagName('description')[0]
    print("Description: %s" % description_node.childNodes[0].data)

Running the above program produces the following output:

Root element : New Arrivals
*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Rating: PG
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Rating: R
Description: A schientific fiction
*****Movie*****
Title: Trigun
Type: Anime, Action
Format: DVD
Rating: PG
Description: Vash the Stampede!
*****Movie*****
Title: Ishtar
Type: Comedy
Format: VHS
Rating: PG
Description: Viewable boredom
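
The output above omits Year because not every movie in movies.xml has a year element, and indexing [0] on an empty result would raise an IndexError. One way to read such optional elements is a small helper that returns a default instead. This is a sketch, and child_text is a hypothetical function, not part of xml.dom.minidom.

```python
import xml.dom.minidom


def child_text(element, tag, default=None):
    """Return the text of the first <tag> descendant, or default if absent or empty."""
    nodes = element.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.data


dom = xml.dom.minidom.parseString(
    '<collection>'
    '<movie title="Enemy Behind"><year>2003</year></movie>'
    '<movie title="Ishtar"></movie>'
    '</collection>'
)
years = {
    movie.getAttribute("title"): child_text(movie, "year", "n/a")
    for movie in dom.getElementsByTagName("movie")
}
print(years)
```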
