`pyquery` – PyQuery complete API

Selectors largely support jQuery usage.

class pyquery.pyquery.PyQuery(*args, **kwargs)

The main class

class Fn

Hook for defining custom function (like the jQuery.fn):

>>> fn = lambda: this.map(lambda i, el: PyQuery(this).outerHtml())

>>> PyQuery.fn.listOuterHtml = fn

>>> S = PyQuery(

...   '<ol>   <li>Coffee</li>   <li>Tea</li>   <li>Milk</li>   </ol>')

>>> S('li').listOuterHtml()

['<li>Coffee</li>', '<li>Tea</li>', '<li>Milk</li>']

PyQuery.addClass(value): Alias for add_class()

PyQuery.add_class(value)

Add a css class to elements:

>>> d = PyQuery('<div></div>')

>>> d.add_class('myclass')

[<div.myclass>]

>>> d.addClass('myclass')

[<div.myclass>]

PyQuery.after(value): add value after nodes

PyQuery.append(value): append value to each node

PyQuery.appendTo(value): Alias for append_to()

PyQuery.append_to(value): append nodes to value

PyQuery.base_url: Return the url of current html document or None if not available.

PyQuery.before(value): insert value before nodes

PyQuery.children(selector=None)

Filter elements that are direct children of self using optional selector:

>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')

>>> d

[<span>]

>>> d.children()

[<p.hello>, <p>]

>>> d.children('.hello')

[<p.hello>]

PyQuery.clone(): return a copy of nodes

PyQuery.closest(selector=None)

>>> d = PyQuery(

...  '<div class="hello"><p>This is a '

...  '<strong class="hello">test</strong></p></div>')

>>> d('strong').closest('div')

[<div.hello>]

>>> d('strong').closest('.hello')

[<strong.hello>]

>>> d('strong').closest('form')

[]

PyQuery.contents()

Return contents (with text nodes):

>>> d = PyQuery('hello <b>bold</b>')

>>> d.contents()

['hello ', <Element b at ...>]

PyQuery.each(func): apply func on each node

PyQuery.empty(): remove nodes content

PyQuery.encoding: return the xml encoding of the root element

PyQuery.end()

Break out of a level of traversal and return to the parent level.

>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'

>>> d = PyQuery(m)

>>> d('p').eq(1).find('em').end().end()

[<p>, <p>]

PyQuery.eq(index)

Return PyQuery of only the element with the provided index:

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')

>>> d('p').eq(0)

[<p.hello>]

>>> d('p').eq(1)

[<p>]

>>> d('p').eq(2)

[]

PyQuery.extend(other): Extend with another PyQuery object

PyQuery.filter(selector)

Filter elements in self using selector (string or function):

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p>')

>>> d('p')

[<p.hello>, <p>]

>>> d('p').filter('.hello')

[<p.hello>]

>>> d('p').filter(lambda i: i == 1)

[<p>]

>>> d('p').filter(lambda i: PyQuery(this).text() == 'Hi')

[<p.hello>]

>>> d('p').filter(lambda i, this: PyQuery(this).text() == 'Hi')

[<p.hello>]

PyQuery.find(selector)

Find elements using selector traversing down from self:

>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'

>>> d = PyQuery(m)

>>> d('p').find('em')

[<em>, <em>]

>>> d('p').eq(1).find('em')

[<em>]

PyQuery.hasClass(name): Alias for has_class()

PyQuery.has_class(name)

Return True if element has class:

>>> d = PyQuery('<div class="myclass"></div>')

>>> d.has_class('myclass')

True

>>> d.hasClass('myclass')

True

PyQuery.height(value=<NoDefault>): set/get height of element

PyQuery.hide()

Add display:none to elements style:

>>> print(PyQuery('<div style="display:none;"/>').hide())

<div style="display: none"/>

PyQuery.html(value=<NoDefault>, **kwargs)

Get or set the html representation of sub nodes.

Get the html value:

>>> d = PyQuery('<div><span>toto</span></div>')

>>> print(d.html())

<span>toto</span>

Extra args are passed to lxml.etree.tostring:

>>> d = PyQuery('<div><span></span></div>')

>>> print(d.html())

<span/>

>>> print(d.html(method='html'))

<span></span>

Set the html value:

>>> d.html('<span>Youhou !</span>')

[<div>]

>>> print(d)

<div><span>Youhou !</span></div>

PyQuery.insertAfter(value): Alias for insert_after()

PyQuery.insertBefore(value): Alias for insert_before()

PyQuery.insert_after(value): insert nodes after value

PyQuery.insert_before(value): insert nodes before value

PyQuery.is_(selector)

Returns True if selector matches at least one current element, else False:

>>> d = PyQuery('<p class="hello"><span>Hi</span></p><p>Bye</p>')

>>> d('p').eq(0).is_('.hello')

True

>>> d('p').eq(0).is_('span')

False

>>> d('p').eq(1).is_('.hello')

False

PyQuery.items(selector=None)

Iter over elements. Return PyQuery objects:

>>> d = PyQuery('<div><span>foo</span><span>bar</span></div>')

>>> [i.text() for i in d.items('span')]

['foo', 'bar']

>>> [i.text() for i in d('span').items()]

['foo', 'bar']

>>> list(d.items('a')) == list(d('a').items())

True

PyQuery.make_links_absolute(base_url=None): Make all links absolute.

PyQuery.map(func)

Returns a new PyQuery after transforming current items with func.

func should take two arguments - ‘index’ and ‘element’. Elements can also be referred to as ‘this’ inside of func:

>>> d = PyQuery('<p class="hello">Hi there</p><p>Bye</p><br />')

>>> d('p').map(lambda i, e: PyQuery(e).text())

['Hi there', 'Bye']

>>> d('p').map(lambda i, e: len(PyQuery(this).text()))

[8, 3]

>>> d('p').map(lambda i, e: PyQuery(this).text().split())

['Hi', 'there', 'Bye']

PyQuery.nextAll(selector=None): Alias for next_all()

PyQuery.next_all(selector=None)

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'

>>> d = PyQuery(h)

>>> d('p:last').next_all()

[<img>]

>>> d('p:last').nextAll()

[<img>]

PyQuery.not_(selector)

Return elements that don’t match the given selector:

>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')

>>> d('p').not_('.hello')

[<p>]

PyQuery.outerHtml(): Alias for outer_html()

PyQuery.outer_html()

Get the html representation of the first selected element:

>>> d = PyQuery('<div><span class="red">toto</span> rocks</div>')

>>> print(d('span'))

<span class="red">toto</span> rocks

>>> print(d('span').outer_html())

<span class="red">toto</span>

>>> print(d('span').outerHtml())

<span class="red">toto</span>

>>> S = PyQuery('<p>Only <b>me</b> & myself</p>')

>>> print(S('b').outer_html())

<b>me</b>

PyQuery.parents(selector=None)

>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')

>>> d('p').parents()

[<span>]

>>> d('.hello').parents('span')

[<span>]

>>> d('.hello').parents('p')

[]

PyQuery.prepend(value): prepend value to nodes

PyQuery.prependTo(value): Alias for prepend_to()

PyQuery.prepend_to(value): prepend nodes to value

PyQuery.prevAll(selector=None): Alias for prev_all()

PyQuery.prev_all(selector=None)

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'

>>> d = PyQuery(h)

>>> d('p:last').prev_all()

[<p.hello>]

>>> d('p:last').prevAll()

[<p.hello>]

PyQuery.remove(expr=<NoDefault>)

Remove nodes:

>>> h = '<div>Maybe <em>she</em> does <strong>NOT</strong> know</div>'

>>> d = PyQuery(h)

>>> d('strong').remove()

[<strong>]

>>> print(d)

<div>Maybe <em>she</em> does   know</div>

PyQuery.removeAttr(name): Alias for remove_attr()

PyQuery.removeClass(value): Alias for remove_class()

PyQuery.remove_attr(name)

Remove an attribute:

>>> d = PyQuery('<div id="myid"></div>')

>>> d.remove_attr('id')

[<div>]

>>> d.removeAttr('id')

[<div>]

PyQuery.remove_class(value)

Remove a css class from elements:

>>> d = PyQuery('<div class="myclass"></div>')

>>> d.remove_class('myclass')

[<div>]

>>> d.removeClass('myclass')

[<div>]

PyQuery.remove_namespaces()

Remove all namespaces:

>>> doc = PyQuery('<foo xmlns="http://example.com/foo"></foo>')

>>> doc

[<{http://example.com/foo}foo>]

>>> doc.remove_namespaces()

[<foo>]

PyQuery.replaceAll(expr): Alias for replace_all()

PyQuery.replaceWith(value): Alias for replace_with()

PyQuery.replace_all(expr): replace nodes by expr

PyQuery.replace_with(value)

replace nodes by value:

>>> doc = PyQuery("<html><div /></html>")

>>> node = PyQuery("<span />")

>>> child = doc.find('div')

>>> child.replace_with(node)

[<div>]

>>> print(doc)

<html><span/></html>

PyQuery.root: return the xml root element

PyQuery.show()

add display:block to elements style

>>> print(PyQuery('<div />').show())

<div style="display: block"/>

PyQuery.siblings(selector=None)

>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'

>>> d = PyQuery(h)

>>> d('.hello').siblings()

[<p>, <img>]

>>> d('.hello').siblings('img')

[<img>]

PyQuery.text(value=<NoDefault>)

Get or set the text representation of sub nodes.

Get the text value:

>>> doc = PyQuery('<div><span>toto</span><span>tata</span></div>')

>>> print(doc.text())

toto tata

Set the text value:

>>> doc.text('Youhou !')

[<div>]

>>> print(doc)

<div>Youhou !</div>

PyQuery.toggleClass(value): Alias for toggle_class()

PyQuery.toggle_class(value)

Toggle a css class on elements:

>>> d = PyQuery('<div></div>')

>>> d.toggle_class('myclass')

[<div.myclass>]

>>> d.toggleClass('myclass')

[<div>]

PyQuery.val(value=<NoDefault>)

Set the attribute value:

>>> d = PyQuery('<input />')

>>> d.val('Youhou')

[<input>]

Get the attribute value:

>>> d.val()

'Youhou'

PyQuery.width(value=<NoDefault>): set/get width of element

PyQuery.wrap(value)

A string of HTML that will be created on the fly and wrapped around each target:

>>> d = PyQuery('<span>youhou</span>')

>>> d.wrap('<div></div>')

[<div>]

>>> print(d)

<div><span>youhou</span></div>

PyQuery.wrapAll(value): Alias for wrap_all()

PyQuery.wrap_all(value)

Wrap all the elements in the matched set into a single wrapper element:

>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')

>>> print(d('span').wrap_all('<div id="wrapper"></div>'))

<div id="wrapper"><span>Hey</span><span>you !</span></div>

>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')

>>> print(d('span').wrapAll('<div id="wrapper"></div>'))

<div id="wrapper"><span>Hey</span><span>you !</span></div>

PyQuery.xhtml_to_html()

Remove xhtml namespace:

>>> doc = PyQuery(

...         '<html xmlns="http://www.w3.org/1999/xhtml"></html>')

>>> doc

[<{http://www.w3.org/1999/xhtml}html>]

>>> doc.xhtml_to_html()

[<html>]

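Usage in projects

Both str(PQ) and PQ.outer_html() turn some tags from <tag></tag> into <tag />. Surprisingly, for some tags the browser cannot parse the resulting string.
The '<!DOCTYPE html>' string is also stripped automatically once the document is converted into a PQ object.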
