org.jsoup.select.Selector

A CSS-like element selector that finds elements matching a query.

Selector syntax

A selector is a chain of simple selectors, separated by combinators. Selectors are case insensitive (including against elements, attributes, and attribute values).

The universal selector (*) is implicit when no element selector is supplied (i.e. *.header and .header are equivalent).

Pattern | Matches | Example
* | any element | *
tag | elements with the given tag name | div
ns|E | elements of type E in the namespace ns | fb|name finds <fb:name> elements
#id | elements with attribute ID of "id" | div#wrap, #logo
.class | elements with a class name of "class" | div.left, .result
[attr] | elements with an attribute named "attr" (with any value) | a[href], [title]
[^attrPrefix] | elements with an attribute name starting with "attrPrefix". Use to find elements with HTML5 datasets | [^data-], div[^data-]
[attr=val] | elements with an attribute named "attr", and value equal to "val" | img[width=500], a[rel=nofollow]
[attr="val"] | elements with an attribute named "attr", and value equal to "val" | span[hello="Cleveland"][goodbye="Columbus"], a[rel="nofollow"]
[attr^=valPrefix] | elements with an attribute named "attr", and value starting with "valPrefix" | a[href^=http:]
[attr$=valSuffix] | elements with an attribute named "attr", and value ending with "valSuffix" | img[src$=.png]
[attr*=valContaining] | elements with an attribute named "attr", and value containing "valContaining" | a[href*=/search/]
[attr~=regex] | elements with an attribute named "attr", and value matching the regular expression | img[src~=(?i)\\.(png|jpe?g)]
The above may be combined in any order, e.g. div.header[title]
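
The [attr~=regex] pattern is easier to read with the regular expression taken in isolation. The sketch below uses only java.util.regex, not jsoup itself, and assumes that a value counts as matching when the pattern is found anywhere inside it (a partial match rather than a whole-string match). The class and method names are hypothetical. The doubled backslash is Java string-literal escaping, so the pattern the regex engine actually sees is (?i)\.(png|jpe?g).

```java
import java.util.regex.Pattern;

final class AttrRegexSketch {
    // Assumption: the attribute value matches if the regex is found
    // anywhere in it, not only when it covers the whole value.
    static boolean valueMatches(String regex, String attrValue) {
        return Pattern.compile(regex).matcher(attrValue).find();
    }

    public static void main(String[] args) {
        // Same expression as in img[src~=(?i)\\.(png|jpe?g)]
        String regex = "(?i)\\.(png|jpe?g)";

        System.out.println(valueMatches(regex, "/img/logo.PNG")); // true: (?i) ignores case
        System.out.println(valueMatches(regex, "photo.jpeg"));    // true: jpe?g covers jpg and jpeg
        System.out.println(valueMatches(regex, "icon.gif"));      // false: no matching extension
    }
}
```

Under the partial-match assumption, a value such as "a.png?v=2" would also match; anchoring the expression with $ is one way to require the extension at the end of the value.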
 

Combinators

E F | an F element descended from an E element | div a, .logo h1
E > F | an F direct child of E | ol > li
E + F | an F element immediately preceded by sibling E | li + li, div.head + div
E ~ F | an F element preceded by sibling E | h1 ~ p
E, F, G | all matching elements E, F, or G | a[href], div, h3
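
As a rough illustration of how the combinators differ, the following sketch runs a few queries against a small document. It assumes jsoup is on the classpath and uses Jsoup.parse(String) and Document.select(String); the markup, the class name, and the count helper are invented for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

final class CombinatorSketch {
    // Hypothetical sample markup, used only for illustration.
    static final String HTML =
        "<div class=header><h1>Title</h1><p>Intro</p><p>More</p></div>"
      + "<ol><li>one</li><li>two<ul><li>nested</li></ul></li></ol>";

    // Parses the sample markup and returns how many elements the query finds.
    static int count(String query) {
        Document doc = Jsoup.parse(HTML);
        Elements found = doc.select(query);
        return found.size();
    }

    public static void main(String[] args) {
        System.out.println(count("ol li"));    // 3: every li below the ol, including the nested one
        System.out.println(count("ol > li"));  // 2: only the direct children of the ol
        System.out.println(count("li + li"));  // 1: "two" directly follows "one"
        System.out.println(count("h1 + p"));   // 1: only the p right after the h1
        System.out.println(count("h1 ~ p"));   // 2: both p siblings after the h1
        System.out.println(count("h1, ol"));   // 2: the union of both selectors
        System.out.println(count(".header"));  // 1: same result as *.header
    }
}
```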
 

Pseudo selectors

:lt(n) | elements whose sibling index is less than n | td:lt(3) finds the first 3 cells of each row
:gt(n) | elements whose sibling index is greater than n | td:gt(1) finds cells after skipping the first two
:eq(n) | elements whose sibling index is equal to n | td:eq(0) finds the first cell of each row
:has(selector) | elements that contain at least one element matching the selector | div:has(p) finds divs that contain p elements
:not(selector) | elements that do not match the selector. See also Elements.not(String) | div:not(.logo) finds all divs that do not have the "logo" class. div:not(:has(div)) finds divs that do not contain divs.
:contains(text) | elements that contain the specified text. The search is case insensitive. The text may appear in the found element, or any of its descendants. | p:contains(jsoup) finds p elements containing the text "jsoup".
:matches(regex) | elements whose text matches the specified regular expression. The text may appear in the found element, or any of its descendants. | td:matches(\\d+) finds table cells containing digits. div:matches((?i)login) finds divs containing the text, case insensitively.
:containsOwn(text) | elements that directly contain the specified text. The search is case insensitive. The text must appear in the found element, not any of its descendants. | p:containsOwn(jsoup) finds p elements with own text "jsoup".
:matchesOwn(regex) | elements whose own text matches the specified regular expression. The text must appear in the found element, not any of its descendants. | td:matchesOwn(\\d+) finds table cells directly containing digits. div:matchesOwn((?i)login) finds divs containing the text, case insensitively.
The above may be combined in any order and with other selectors, e.g. .light:contains(name):eq(0)
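
The pseudo selectors can be tried out in the same way. This is a minimal sketch, assuming jsoup is on the classpath; the sample markup, the class name, and the count helper are hypothetical. It shows that the sibling index used by :lt, :gt and :eq starts at 0, and how the "own text" variants differ from the ones that also search descendants.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

final class PseudoSelectorSketch {
    // Hypothetical sample markup, used only for illustration.
    static final String HTML =
        "<table><tr><td>10</td><td>abc</td><td>7</td><td>x</td></tr></table>"
      + "<div><p>Uses jsoup</p></div>"
      + "<div class=logo>Logo</div>";

    // Parses the sample markup and returns how many elements the query finds.
    static int count(String query) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(query).size();
    }

    public static void main(String[] args) {
        System.out.println(count("td:lt(3)"));               // 3: cells with index 0, 1, 2
        System.out.println(count("td:gt(1)"));               // 2: cells with index 2, 3
        System.out.println(count("td:eq(0)"));               // 1: the first cell
        System.out.println(count("div:has(p)"));             // 1: only the first div contains a p
        System.out.println(count("div:not(.logo)"));         // 1: the div without the logo class
        System.out.println(count("p:contains(JSOUP)"));      // 1: the search ignores case
        System.out.println(count("div:contains(jsoup)"));    // 1: text found in a descendant
        System.out.println(count("div:containsOwn(jsoup)")); // 0: the text belongs to the p, not the div
        System.out.println(count("td:matchesOwn(\\d+)"));    // 2: the cells "10" and "7"
    }
}
```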

Structural pseudo selectors

:root | the element that is the root of the document. In HTML, this is the html element | :root
:nth-child(an+b) | elements that have an+b-1 siblings before them in the document tree, for any zero or positive integer value of n, and that have a parent element. For values of a and b greater than zero, this effectively divides the element's children into groups of a elements (the last group taking the remainder) and selects the bth element of each group. For example, this allows selectors to address every other row in a table, and could be used to alternate the color of paragraph text in a cycle of four. The a and b values must be integers (positive, negative, or zero). The index of the first child of an element is 1. In addition, :nth-child() accepts odd and even as arguments: odd has the same meaning as 2n+1, and even has the same meaning as 2n. | tr:nth-child(2n+1) finds every odd row of a table. :nth-child(10n-1) finds the 9th, 19th, 29th, etc. element. li:nth-child(5) finds the 5th li
:nth-last-child(an+b) | elements that have an+b-1 siblings after them in the document tree. Otherwise like :nth-child() | tr:nth-last-child(-n+2) finds the last two rows of a table
:nth-of-type(an+b) | elements that have an+b-1 siblings with the same expanded element name before them in the document tree, for any zero or positive integer value of n, and that have a parent element | img:nth-of-type(2n+1)
:nth-last-of-type(an+b) | elements that have an+b-1 siblings with the same expanded element name after them in the document tree, for any zero or positive integer value of n, and that have a parent element | img:nth-last-of-type(2n+1)
:first-child | elements that are the first child of some other element | div > p:first-child
:last-child | elements that are the last child of some other element | ol > li:last-child
:first-of-type | elements that are the first sibling of their type in the list of children of their parent element | dl dt:first-of-type
:last-of-type | elements that are the last sibling of their type in the list of children of their parent element | tr > td:last-of-type
:only-child | elements that have a parent element and whose parent element has no other element children |
:only-of-type | elements that have a parent element and whose parent element has no other element children with the same expanded element name |
:empty | elements that have no children at all |
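
The an+b arithmetic used by the :nth-* selectors can be checked independently of any document. The sketch below is plain Java and is not jsoup's own implementation; the class and method names are hypothetical. It treats a 1-based position as matching when it equals a*n + b for some integer n that is zero or positive.

```java
import java.util.ArrayList;
import java.util.List;

final class NthChildSketch {
    // True when position (1-based) equals a*n + b for some integer n >= 0.
    static boolean matches(int a, int b, int position) {
        if (a == 0) {
            return position == b; // plain index, e.g. :nth-child(5)
        }
        int diff = position - b;
        // diff must be a whole multiple of a, and that multiple (n) must not be negative
        return diff % a == 0 && diff / a >= 0;
    }

    // Collects every matching position among siblingCount siblings.
    static List<Integer> matchingPositions(int a, int b, int siblingCount) {
        List<Integer> result = new ArrayList<>();
        for (int position = 1; position <= siblingCount; position++) {
            if (matches(a, b, position)) {
                result.add(position);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(matchingPositions(2, 1, 6));    // 2n+1 (odd):  [1, 3, 5]
        System.out.println(matchingPositions(2, 0, 6));    // 2n (even):   [2, 4, 6]
        System.out.println(matchingPositions(10, -1, 30)); // 10n-1:       [9, 19, 29]
        System.out.println(matchingPositions(0, 5, 6));    // 5:           [5]
        System.out.println(matchingPositions(-1, 2, 6));   // -n+2:        [1, 2]
    }
}
```

For the :nth-last-* variants the same test applies with the position counted from the last sibling instead of the first, which is why -n+2 selects the last two rows in tr:nth-last-child(-n+2).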
Author:
Jonathan Hedley, jonathan@hedley.net
See Also:
Element.select(String)
