pyquery – PyQuery complete API

The selectors largely support jQuery usage.

class pyquery.pyquery.PyQuery(*args, **kwargs)

The main class

class Fn

Hook for defining custom function (like the jQuery.fn):

>>> fn = lambda: this.map(lambda i, el: PyQuery(this).outerHtml())
>>> PyQuery.fn.listOuterHtml = fn
>>> S = PyQuery(
... '<ol> <li>Coffee</li> <li>Tea</li> <li>Milk</li> </ol>')
>>> S('li').listOuterHtml()
['<li>Coffee</li>', '<li>Tea</li>', '<li>Milk</li>']
PyQuery.addClass(value)

Alias for add_class()

PyQuery.add_class(value)

Add a css class to elements:

>>> d = PyQuery('<div></div>')
>>> d.add_class('myclass')
[<div.myclass>]
>>> d.addClass('myclass')
[<div.myclass>]
PyQuery.after(value)

add value after nodes

PyQuery.append(value)

append value to each node

PyQuery.appendTo(value)

Alias for append_to()

PyQuery.append_to(value)

append nodes to value

PyQuery.base_url

Return the url of current html document or None if not available.

PyQuery.before(value)

insert value before nodes

PyQuery.children(selector=None)

Filter elements that are direct children of self using optional selector:

>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')
>>> d
[<span>]
>>> d.children()
[<p.hello>, <p>]
>>> d.children('.hello')
[<p.hello>]
PyQuery.clone()

return a copy of nodes

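PyQuery.closest(selector=None)

Return the nearest element matching selector, starting from each element itself and moving up through its ancestors:
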
>>> d = PyQuery(
... '<div class="hello"><p>This is a '
... '<strong class="hello">test</strong></p></div>')
>>> d('strong').closest('div')
[<div.hello>]
>>> d('strong').closest('.hello')
[<strong.hello>]
>>> d('strong').closest('form')
[]
PyQuery.contents()

Return contents (with text nodes):

>>> d = PyQuery('hello <b>bold</b>')
>>> d.contents()
['hello ', <Element b at ...>]
PyQuery.each(func)

apply func to each node

PyQuery.empty()

remove nodes content

PyQuery.encoding

return the xml encoding of the root element

PyQuery.end()

Break out of a level of traversal and return to the parent level.

>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'
>>> d = PyQuery(m)
>>> d('p').eq(1).find('em').end().end()
[<p>, <p>]
PyQuery.eq(index)

Return PyQuery of only the element with the provided index:

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')
>>> d('p').eq(0)
[<p.hello>]
>>> d('p').eq(1)
[<p>]
>>> d('p').eq(2)
[]
PyQuery.extend(other)

Extend with another PyQuery object

PyQuery.filter(selector)

Filter elements in self using selector (string or function):

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p>')
>>> d('p')
[<p.hello>, <p>]
>>> d('p').filter('.hello')
[<p.hello>]
>>> d('p').filter(lambda i: i == 1)
[<p>]
>>> d('p').filter(lambda i: PyQuery(this).text() == 'Hi')
[<p.hello>]
>>> d('p').filter(lambda i, this: PyQuery(this).text() == 'Hi')
[<p.hello>]
PyQuery.find(selector)

Find elements using selector traversing down from self:

>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'
>>> d = PyQuery(m)
>>> d('p').find('em')
[<em>, <em>]
>>> d('p').eq(1).find('em')
[<em>]
PyQuery.hasClass(name)

Alias for has_class()

PyQuery.has_class(name)

Return True if element has class:

>>> d = PyQuery('<div class="myclass"></div>')
>>> d.has_class('myclass')
True
>>> d.hasClass('myclass')
True
PyQuery.height(value=<NoDefault>)

set/get height of element

PyQuery.hide()

add display:none to elements style

>>> print(PyQuery('<div style="display:none;"/>').hide())
<div style="display: none"/>
PyQuery.html(value=<NoDefault>, **kwargs)

Get or set the html representation of sub nodes.

Get the html value:

>>> d = PyQuery('<div><span>toto</span></div>')
>>> print(d.html())
<span>toto</span>

Extra args are passed to lxml.etree.tostring:

>>> d = PyQuery('<div><span></span></div>')
>>> print(d.html())
<span/>
>>> print(d.html(method='html'))
<span></span>

Set the html value:

>>> d.html('<span>Youhou !</span>')
[<div>]
>>> print(d)
<div><span>Youhou !</span></div>
PyQuery.insertAfter(value)

Alias for insert_after()

PyQuery.insertBefore(value)

Alias for insert_before()

PyQuery.insert_after(value)

insert nodes after value

PyQuery.insert_before(value)

insert nodes before value

PyQuery.is_(selector)

Returns True if selector matches at least one current element, else False:

>>> d = PyQuery('<p class="hello"><span>Hi</span></p><p>Bye</p>')
>>> d('p').eq(0).is_('.hello')
True
>>> d('p').eq(0).is_('span')
False
>>> d('p').eq(1).is_('.hello')
False
PyQuery.items(selector=None)

Iter over elements. Return PyQuery objects:

>>> d = PyQuery('<div><span>foo</span><span>bar</span></div>')
>>> [i.text() for i in d.items('span')]
['foo', 'bar']
>>> [i.text() for i in d('span').items()]
['foo', 'bar']
>>> list(d.items('a')) == list(d('a').items())
True
PyQuery.make_links_absolute(base_url=None)

Make all links absolute.

PyQuery.map(func)

Returns a new PyQuery after transforming current items with func.

func should take two arguments - ‘index’ and ‘element’. Elements can also be referred to as ‘this’ inside of func:

>>> d = PyQuery('<p class="hello">Hi there</p><p>Bye</p><br />')
>>> d('p').map(lambda i, e: PyQuery(e).text())
['Hi there', 'Bye']
>>> d('p').map(lambda i, e: len(PyQuery(this).text()))
[8, 3]
>>> d('p').map(lambda i, e: PyQuery(this).text().split())
['Hi', 'there', 'Bye']
PyQuery.nextAll(selector=None)

Alias for next_all()

PyQuery.next_all(selector=None)

Return all siblings that follow each element, optionally filtered by selector:

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('p:last').next_all()
[<img>]
>>> d('p:last').nextAll()
[<img>]
PyQuery.not_(selector)

Return elements that don’t match the given selector:

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')
>>> d('p').not_('.hello')
[<p>]
PyQuery.outerHtml()

Alias for outer_html()

PyQuery.outer_html()

Get the html representation of the first selected element:

>>> d = PyQuery('<div><span class="red">toto</span> rocks</div>')
>>> print(d('span'))
<span class="red">toto</span> rocks
>>> print(d('span').outer_html())
<span class="red">toto</span>
>>> print(d('span').outerHtml())
<span class="red">toto</span>
>>> S = PyQuery('<p>Only <b>me</b> & myself</p>')
>>> print(S('b').outer_html())
<b>me</b>
PyQuery.parents(selector=None)

Return the ancestors of each element, optionally filtered by selector:

>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')
>>> d('p').parents()
[<span>]
>>> d('.hello').parents('span')
[<span>]
>>> d('.hello').parents('p')
[]
PyQuery.prepend(value)

prepend value to nodes

PyQuery.prependTo(value)

Alias for prepend_to()

PyQuery.prepend_to(value)

prepend nodes to value

PyQuery.prevAll(selector=None)

Alias for prev_all()

PyQuery.prev_all(selector=None)

Return all siblings that precede each element, optionally filtered by selector:

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('p:last').prev_all()
[<p.hello>]
>>> d('p:last').prevAll()
[<p.hello>]
PyQuery.remove(expr=<NoDefault>)

Remove nodes:

>>> h = '<div>Maybe <em>she</em> does <strong>NOT</strong> know</div>'
>>> d = PyQuery(h)
>>> d('strong').remove()
[<strong>]
>>> print(d)
<div>Maybe <em>she</em> does know</div>
PyQuery.removeAttr(name)

Alias for remove_attr()

PyQuery.removeClass(value)

Alias for remove_class()

PyQuery.remove_attr(name)

Remove an attribute:

>>> d = PyQuery('<div id="myid"></div>')
>>> d.remove_attr('id')
[<div>]
>>> d.removeAttr('id')
[<div>]
PyQuery.remove_class(value)

Remove a css class from elements:

>>> d = PyQuery('<div class="myclass"></div>')
>>> d.remove_class('myclass')
[<div>]
>>> d.removeClass('myclass')
[<div>]
PyQuery.remove_namespaces()

Remove all namespaces:

>>> doc = PyQuery('<foo xmlns="http://example.com/foo"></foo>')
>>> doc
[<{http://example.com/foo}foo>]
>>> doc.remove_namespaces()
[<foo>]
PyQuery.replaceAll(expr)

Alias for replace_all()

PyQuery.replaceWith(value)

Alias for replace_with()

PyQuery.replace_all(expr)

replace nodes by expr

PyQuery.replace_with(value)

replace nodes by value:

>>> doc = PyQuery("<html><div /></html>")
>>> node = PyQuery("<span />")
>>> child = doc.find('div')
>>> child.replace_with(node)
[<div>]
>>> print(doc)
<html><span/></html>

PyQuery.root

return the xml root element

PyQuery.show()

add display:block to elements style

>>> print(PyQuery('<div />').show())
<div style="display: block"/>
PyQuery.siblings(selector=None)

Return all siblings of each element, optionally filtered by selector:

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('.hello').siblings()
[<p>, <img>]
>>> d('.hello').siblings('img')
[<img>]
PyQuery.text(value=<NoDefault>)

Get or set the text representation of sub nodes.

Get the text value:

>>> doc = PyQuery('<div><span>toto</span><span>tata</span></div>')
>>> print(doc.text())
toto tata

Set the text value:

>>> doc.text('Youhou !')
[<div>]
>>> print(doc)
<div>Youhou !</div>
PyQuery.toggleClass(value)

Alias for toggle_class()

PyQuery.toggle_class(value)

Toggle a css class on elements:

>>> d = PyQuery('<div></div>')
>>> d.toggle_class('myclass')
[<div.myclass>]
>>> d.toggleClass('myclass')
[<div>]
PyQuery.val(value=<NoDefault>)

Set the attribute value:

>>> d = PyQuery('<input />')
>>> d.val('Youhou')
[<input>]

Get the attribute value:

>>> d.val()
'Youhou'
PyQuery.width(value=<NoDefault>)

set/get width of element

PyQuery.wrap(value)

A string of HTML that will be created on the fly and wrapped around each target:

>>> d = PyQuery('<span>youhou</span>')
>>> d.wrap('<div></div>')
[<div>]
>>> print(d)
<div><span>youhou</span></div>
PyQuery.wrapAll(value)

Alias for wrap_all()

PyQuery.wrap_all(value)

Wrap all the elements in the matched set into a single wrapper element:

>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')
>>> print(d('span').wrap_all('<div id="wrapper"></div>'))
<div id="wrapper"><span>Hey</span><span>you !</span></div>
>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')
>>> print(d('span').wrapAll('<div id="wrapper"></div>'))
<div id="wrapper"><span>Hey</span><span>you !</span></div>
PyQuery.xhtml_to_html()

Remove xhtml namespace:

>>> doc = PyQuery(
... '<html xmlns="http://www.w3.org/1999/xhtml"></html>')
>>> doc
[<{http://www.w3.org/1999/xhtml}html>]
>>> doc.xhtml_to_html()
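[<html>]

Notes from use in a project

Both str(PQ) and PQ.outer_html() turn some tags from <tag></tag> into <tag />. Oddly, for certain tags a browser cannot parse the resulting string once it is in this form. In addition, the '<!DOCTYPE html>' string is dropped automatically when the markup is turned into a PQ object.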
