XPath: Getting the HTML Nodes Between Two Nodes

2015-06-01 16:42

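For example, the following XPath 1.0 expression selects the p elements that lie between the "Ingredients" h5 and the "Method" h5 inside the div with id "Recipe":

//div[@id="Recipe"]//h5[contains(text(),"Ingredients")]/following-sibling::p[count(.|//div[@id="Recipe"]//h5[contains(text(),"Method")]/preceding-sibling::p) = count(//div[@id="Recipe"]//h5[contains(text(),"Method")]/preceding-sibling::p)]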

I. XPath 1.0 solution:

In XPath 1.0, one way to do this is to use the Kayessian method for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

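The above expression selects exactly the nodes that belong to both the node-set $ns1 and the node-set $ns2. It works because a node-set contains no duplicates: the union of a node from $ns1 with $ns2 has the same size as $ns2 only if that node is already a member of $ns2.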
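The counting argument can be sketched in Python. This is only an illustration of the logic, not an XPath engine: the function name kayessian_intersection and the sample lists are made up for the example, and plain strings stand in for nodes.

```python
def kayessian_intersection(ns1, ns2):
    """Emulate $ns1[count(.|$ns2) = count($ns2)] on lists of hashable items."""
    ns2_set = set(ns2)
    result = []
    for node in ns1:
        union = ns2_set | {node}          # corresponds to .|$ns2
        if len(union) == len(ns2_set):    # count(.|$ns2) = count($ns2)
            result.append(node)
    return result


# Hypothetical stand-ins for two node-sets.
ns1 = ["a", "b", "c", "d"]
ns2 = ["c", "d", "e"]
common = kayessian_intersection(ns1, ns2)
print(common)
```

Only "c" and "d" leave the size of ns2 unchanged when added to it, so they are the only items kept.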

To apply this to the specific question, let's say we need to select all nodes between the 2nd and 3rd h3 elements in the following XML document:

<html>
  <h3>Title T31</h3>
    <a31/>
    <b31/>
  <h3>Title T32</h3>
    <a32/>
    <b32/>
  <h3>Title T33</h3>
    <a33/>
    <b33/>
  <h3>Title T34</h3>
    <a34/>
    <b34/>
  <h3>Title T35</h3>
</html>

We have to substitute $ns1 with:

/*/h3[2]/following-sibling::node()

and to substitute $ns2 with:

/*/h3[3]/preceding-sibling::node()

Thus, the complete XPath expression is:

/*/h3[2]/following-sibling::node()
             [count(.|/*/h3[3]/preceding-sibling::node())
             =
              count(/*/h3[3]/preceding-sibling::node())
             ]

We can verify that this is the correct XPath expression with the following XSLT 1.0 transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
   "/*/h3[2]/following-sibling::node()
             [count(.|/*/h3[3]/preceding-sibling::node())
             =
              count(/*/h3[3]/preceding-sibling::node())
             ]
   "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the XML document presented above, the wanted, correct result is produced:

<a32/>
<b32/>

II. XPath 2.0 solution:

Use the intersect operator:

   /*/h3[2]/following-sibling::node()
intersect
   /*/h3[3]/preceding-sibling::node()
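For comparison, the intersect operator corresponds to a direct membership test rather than a counting trick. The following is a rough Python analogue under the same assumptions as before (standard-library ElementTree, element children only, sibling axes emulated by slicing). The helper name intersect and the small sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def intersect(ns1, ns2):
    """Return the nodes of ns1 that are also in ns2, compared by identity.

    Assumes ns1 is already in document order, so the result is as well.
    """
    ids2 = {id(n) for n in ns2}
    return [n for n in ns1 if id(n) in ids2]


# Small made-up document: two h3 elements with two elements between them.
root = ET.fromstring("<r><h3/><a/><b/><h3/><c/></r>")
children = list(root)

following_first = children[1:]     # siblings after the 1st h3
preceding_second = children[:3]    # siblings before the 2nd h3

tags = [n.tag for n in intersect(following_first, preceding_second)]
print(tags)
```

Here the elements a and b are selected, while c is excluded because it does not precede the second h3.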
