Parsing and Generating XML in Python: ElementTree vs. minidom

OS: Windows 7

Keywords: Python 3.4, XML, ElementTree, minidom

This article shows how to parse and generate the following XML with Python:

<Persons>

    <Person>

        <Name>LDL</Name>

        <Description Language='English'><![CDATA[cdata text]]></Description>

    </Person>

    <Person>

        <Name>China</Name>

        <Description Language='Chinese'><![CDATA[cdata text]]></Description>

    </Person>

</Persons>

1. Create an XML file named src.xml with the content above and put it in c:\temp.
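The file can also be created from Python instead of by hand. The following is a minimal sketch, not part of the original samples: write_sample and SAMPLE_XML are hypothetical names, a temporary directory stands in for c:\temp, and the second Description is marked Chinese to match the outputs shown in steps 4 and 5.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumption: same content as the sample document in this article.
SAMPLE_XML = """<Persons>
    <Person>
        <Name>LDL</Name>
        <Description Language='English'><![CDATA[cdata text]]></Description>
    </Person>
    <Person>
        <Name>China</Name>
        <Description Language='Chinese'><![CDATA[cdata text]]></Description>
    </Person>
</Persons>
"""


def write_sample(directory):
    """Write the sample document to src.xml in directory and return its path."""
    path = os.path.join(directory, 'src.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(SAMPLE_XML)
    return path


if __name__ == '__main__':
    # Assumption: a temporary directory is used here instead of c:/temp.
    with tempfile.TemporaryDirectory() as tmp:
        src = write_sample(tmp)
        # Parse the file back as a quick well-formedness check.
        persons = ET.parse(src).getroot().findall('Person')
        print(len(persons), 'Person elements written to', src)
```

To reproduce the article's setup exactly, pass 'C:/temp' to write_sample instead of the temporary directory.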

2. Use ElementTree to read src.xml and create an XML file with the same content, named target-tree.xml.

ElementTreeSample.py is as follows:

# -*- coding: utf-8 -*-

"""

Sample of xml.etree.ElementTree

@author: ldlchina

"""

import os

import sys

import logging

import traceback

import xml.etree.ElementTree as ET

import time

def copy_node(src_node, target_node):

    # Copy attributes

    for key in src_node.keys():

        target_node.set(key, src_node.get(key))

    if len(src_node) > 0:

        for child in src_node:

            target_child = ET.Element(child.tag)

            target_node.append(target_child)

            copy_node(child, target_child)

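    else:

        # Leaf element: copy its text. The .text of parent elements and every .tail are skipped, so indentation is lost.

        target_node.text = src_node.text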

def read_write_xml(src, target):

    tree = ET.parse(src)

    root = tree.getroot()

    target_root = ET.Element(root.tag)

    start_time = time.time() * 1000

    copy_node(root, target_root)

    end_time = time.time() * 1000

    print('copy_node:' + str(end_time - start_time))

    target_tree = ET.ElementTree(target_root)

    target_tree.write(target)

    logging.info(target)

def main():

    try:

        current_file = os.path.realpath(__file__)

        # Configure logger

        log_file = current_file.replace('.py', '.log')

        logging.basicConfig(filename=log_file, filemode='w', level=logging.INFO)

        # Create console handler

        ch = logging.StreamHandler()

        ch.setLevel(logging.INFO)

        logger = logging.getLogger('')

        logger.addHandler(ch)

        #src = sys.argv[1]

        #target = sys.argv[2]

        # For debugging

        src = 'C:/temp/src.xml'

        target = 'C:/temp/target-tree.xml'

        # Generate results

        start_time = time.time() * 1000

        read_write_xml(src, target)

        end_time = time.time() * 1000

        print('read_write_xml:' + str(end_time - start_time))

    except Exception:

        # logging.exception appends the current traceback to the message.

        logging.exception('Failed to copy the XML file')

    input('Press Enter to exit...')

if __name__ == '__main__':

    main()
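The copy_node function in ElementTreeSample.py copies text only for leaf elements and never copies tail, which is where ElementTree keeps the whitespace between tags. The sketch below is a variant under a hypothetical name, copy_node_keep_whitespace, that copies both attributes. It is an illustration rather than the author's code, and it still cannot bring back CDATA sections.

```python
import xml.etree.ElementTree as ET


def copy_node_keep_whitespace(src_node, target_node):
    """Recursively copy attributes, text and tail from src_node to target_node."""
    for key, value in src_node.items():
        target_node.set(key, value)
    # .text is the text before the first child (the indentation, for a parent).
    target_node.text = src_node.text
    # .tail is the text between this element's end tag and the next tag.
    target_node.tail = src_node.tail
    for child in src_node:
        target_child = ET.SubElement(target_node, child.tag)
        copy_node_keep_whitespace(child, target_child)


if __name__ == '__main__':
    source = ET.fromstring(
        "<Persons>\n    <Person>\n        <Name>LDL</Name>\n    </Person>\n</Persons>"
    )
    target = ET.Element(source.tag)
    copy_node_keep_whitespace(source, target)
    # The serialized copy keeps the line breaks and indentation of the input.
    print(ET.tostring(target, encoding='unicode'))
```

If no per-node processing is needed, copy.deepcopy(root) from the standard library gives the same result with less code.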

3. Use minidom to read src.xml and create an XML file with the same content, named target-dom.xml.

MinidomSample.py is as follows:

# -*- coding: utf-8 -*-

"""

Sample of xml.dom.minidom

@author: ldlchina

"""

import os

import sys

import logging

import traceback

import xml.dom.minidom as MD

import time

def get_text(n):

    nodelist = n.childNodes

    rc = ""

    for node in nodelist:

        if node.nodeType == node.TEXT_NODE or node.nodeType == node.CDATA_SECTION_NODE:

            rc = rc + node.data

    return rc

def copy_node(target_doc, src_node, target_node):

    if not isinstance(src_node, MD.Document) and src_node.hasAttributes():

        for item in src_node.attributes.items():

            target_node.setAttribute(item[0], item[1])

    for node in src_node.childNodes:

        if node.nodeType == node.TEXT_NODE:

            target_child = target_doc.createTextNode(node.nodeValue)

            target_node.appendChild(target_child)

        elif node.nodeType == node.CDATA_SECTION_NODE:

            target_child = target_doc.createCDATASection(node.nodeValue)

            target_node.appendChild(target_child)

        elif node.nodeType == node.ELEMENT_NODE:

            target_child = target_doc.createElement(node.nodeName)

            target_node.appendChild(target_child)

            copy_node(target_doc, node, target_child)

def read_write_xml(src, target):

    doc = MD.parse(src)

    target_doc = MD.Document()

    start_time = time.time() * 1000

    copy_node(target_doc, doc, target_doc)

    end_time = time.time() * 1000

    print('copy_node: ' + str(end_time - start_time))

    # Write to file

    with open(target, 'w', encoding='utf-8') as f:

        f.write(target_doc.documentElement.toxml())

    logging.info(target)

def main():

    try:

        current_file = os.path.realpath(__file__)

        # Configure logger

        log_file = current_file.replace('.py', '.log')

        logging.basicConfig(filename=log_file, filemode='w', level=logging.INFO)

        # Create console handler

        ch = logging.StreamHandler()

        ch.setLevel(logging.INFO)

        logger = logging.getLogger('')

        logger.addHandler(ch)

        #src = sys.argv[1]

        #target = sys.argv[2]

        # For debugging

        src = 'C:/temp/src.xml'

        target = 'C:/temp/target-dom.xml'

        # Generate results

        start_time = time.time() * 1000

        read_write_xml(src, target)

        end_time = time.time() * 1000

        print('read_write_xml: ' + str(end_time - start_time))

    except Exception:

        # logging.exception appends the current traceback to the message.

        logging.exception('Failed to copy the XML file')

    input('Press Enter to exit...')

if __name__ == '__main__':

    main()
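The hand-written copy_node in MinidomSample.py shows how each node type is handled. When only a plain copy is needed, minidom's Document.importNode with deep set to True does the same job. The sketch below assumes that shortcut is acceptable; clone_document is a hypothetical helper name, not part of the original sample.

```python
import xml.dom.minidom as MD


def clone_document(src_doc):
    """Return a new Document holding a deep copy of src_doc's root element."""
    target_doc = MD.Document()
    # importNode(node, True) clones the element with its attributes and all
    # descendants, including text and CDATA section nodes.
    root_copy = target_doc.importNode(src_doc.documentElement, True)
    target_doc.appendChild(root_copy)
    return target_doc


if __name__ == '__main__':
    doc = MD.parseString(
        "<Persons><Person><Name>LDL</Name>"
        "<Description Language='English'><![CDATA[cdata text]]></Description>"
        "</Person></Persons>"
    )
    print(clone_document(doc).documentElement.toxml())
```

The recursive version remains the better choice when nodes have to be filtered or changed while copying.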

4. Running ElementTreeSample.py produces the following XML:

<Persons><Person><Name>LDL</Name><Description Language="English">cdata text</Description></Person><Person><Name>China</Name><Description Language="Chinese">cdata text</Description></Person></Persons>
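The CDATA markers are gone from this output because ElementTree hands CDATA content over as ordinary text and escapes it again when writing. A small sketch of that behavior follows; the input string is made up for illustration and contains characters that need escaping.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a CDATA section holding characters that need escaping.
SOURCE = "<Description Language='English'><![CDATA[a < b & c]]></Description>"

element = ET.fromstring(SOURCE)

# The parser exposes the CDATA content as plain text on the element.
print(repr(element.text))

# On output the text is escaped; no CDATA section is written.
serialized = ET.tostring(element, encoding='unicode')
print(serialized)
```

The text content survives the round trip, but the information that it was written as a CDATA section does not.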

5. Running MinidomSample.py produces the following XML:

<Persons>

    <Person>

        <Name>LDL</Name>

        <Description Language="English"><![CDATA[cdata text]]></Description>

    </Person>

    <Person>

        <Name>China</Name>

        <Description Language="Chinese"><![CDATA[cdata text]]></Description>

    </Person>

</Persons>
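minidom keeps the layout and the CDATA sections because both are separate nodes in its tree. The sketch below lists the direct children of an element by node type. describe_children is a hypothetical helper, and the input is a shortened stand-in for src.xml.

```python
import xml.dom.minidom as MD

# Shortened stand-in for src.xml.
SOURCE = (
    "<Person>\n"
    "    <Description><![CDATA[cdata text]]></Description>\n"
    "</Person>"
)


def describe_children(node):
    """Return a (kind, value) pair for each direct child of node."""
    names = {
        node.ELEMENT_NODE: 'ELEMENT',
        node.TEXT_NODE: 'TEXT',
        node.CDATA_SECTION_NODE: 'CDATA',
    }
    result = []
    for child in node.childNodes:
        kind = names.get(child.nodeType, 'OTHER')
        # Elements are reported by tag name, other nodes by their data.
        value = child.nodeName if kind == 'ELEMENT' else getattr(child, 'data', None)
        result.append((kind, value))
    return result


if __name__ == '__main__':
    person = MD.parseString(SOURCE).documentElement
    print(describe_children(person))
    print(describe_children(person.getElementsByTagName('Description')[0]))
```

The whitespace around Description shows up as TEXT entries and the section content as a CDATA entry, which is why copying node by node reproduces the source.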

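ElementTree vs. minidom:

1. ElementTree runs somewhat faster than minidom.

2. The ElementTree sample loses the source's line breaks and indentation. ElementTree keeps that whitespace in each element's text and tail attributes, but copy_node in ElementTreeSample.py copies text only for leaf elements and never copies tail. minidom exposes the whitespace as text nodes, so its copy preserves the layout.

3. ElementTree has no CDATA support: it reads a CDATA section as ordinary text and writes it back as escaped text. minidom keeps CDATA sections as separate nodes and writes them back unchanged.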
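The speed difference in point 1 can be checked on any machine. The sketch below is a rough benchmark under stated assumptions: it times parsing only, not the copy step, and uses synthetic data. build_sample and time_parse_ms are hypothetical helper names, and the numbers depend on the Python version and hardware.

```python
import time
import xml.dom.minidom as MD
import xml.etree.ElementTree as ET


def build_sample(person_count):
    """Build an XML string with person_count Person elements (synthetic data)."""
    person = (
        "<Person><Name>LDL</Name>"
        "<Description Language='English'><![CDATA[cdata text]]></Description>"
        "</Person>"
    )
    return "<Persons>" + person * person_count + "</Persons>"


def time_parse_ms(parse, xml_text, repeat=5):
    """Return the best wall-clock parse time in milliseconds over repeat runs."""
    best = float('inf')
    for _ in range(repeat):
        start = time.perf_counter()
        parse(xml_text)
        best = min(best, time.perf_counter() - start)
    return best * 1000


if __name__ == '__main__':
    xml_text = build_sample(2000)
    print('ElementTree: %.2f ms' % time_parse_ms(ET.fromstring, xml_text))
    print('minidom:     %.2f ms' % time_parse_ms(MD.parseString, xml_text))
```

Taking the best of several runs reduces noise from other processes. time.perf_counter is used instead of time.time because it is intended for measuring short intervals.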
