pyQuery
pyquery
– PyQuery complete API
The selectors largely support jQuery syntax.
- class
pyquery.pyquery.
PyQuery
(*args, **kwargs) -
The main class
- class
Fn
-
Hook for defining custom function (like the jQuery.fn):
>>> fn = lambda: this.map(lambda i, el: PyQuery(this).outerHtml())
>>> PyQuery.fn.listOuterHtml = fn
>>> S = PyQuery(
... '<ol> <li>Coffee</li> <li>Tea</li> <li>Milk</li> </ol>')
>>> S('li').listOuterHtml()
['<li>Coffee</li>', '<li>Tea</li>', '<li>Milk</li>']
PyQuery.
addClass
(value)-
Alias for
add_class()
PyQuery.
add_class
(value)-
Add a css class to elements:
>>> d = PyQuery('<div></div>')
>>> d.add_class('myclass')
[<div.myclass>]
>>> d.addClass('myclass')
[<div.myclass>]
PyQuery.
after
(value)-
add value after nodes
PyQuery.
append
(value)-
append value to each node
PyQuery.
appendTo
(value)-
Alias for
append_to()
PyQuery.
append_to
(value)-
append nodes to value
PyQuery.
base_url
-
Return the url of current html document or None if not available.
PyQuery.
before
(value)-
insert value before nodes
PyQuery.
children
(selector=None)-
Filter elements that are direct children of self using optional selector:
>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')
>>> d
[<span>]
>>> d.children()
[<p.hello>, <p>]
>>> d.children('.hello')
[<p.hello>]
PyQuery.
clone
()-
return a copy of nodes
PyQuery.
closest
(selector=None)-
>>> d = PyQuery(
... '<div class="hello"><p>This is a '
... '<strong class="hello">test</strong></p></div>')
>>> d('strong').closest('div')
[<div.hello>]
>>> d('strong').closest('.hello')
[<strong.hello>]
>>> d('strong').closest('form')
[]
PyQuery.
contents
()-
Return contents (with text nodes):
>>> d = PyQuery('hello <b>bold</b>')
>>> d.contents()
['hello ', <Element b at ...>]
PyQuery.
each
(func)-
apply func to each node
PyQuery.
empty
()-
remove nodes content
PyQuery.
encoding
-
return the xml encoding of the root element
PyQuery.
end
()-
Break out of a level of traversal and return to the parent level.
>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'
>>> d = PyQuery(m)
>>> d('p').eq(1).find('em').end().end()
[<p>, <p>]
PyQuery.
eq
(index)-
Return PyQuery of only the element with the provided index:
>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')
>>> d('p').eq(0)
[<p.hello>]
>>> d('p').eq(1)
[<p>]
>>> d('p').eq(2)
[]
PyQuery.
extend
(other)-
Extend with another PyQuery object
PyQuery.
filter
(selector)-
Filter elements in self using selector (string or function):
>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p>')
>>> d('p')
[<p.hello>, <p>]
>>> d('p').filter('.hello')
[<p.hello>]
>>> d('p').filter(lambda i: i == 1)
[<p>]
>>> d('p').filter(lambda i: PyQuery(this).text() == 'Hi')
[<p.hello>]
>>> d('p').filter(lambda i, this: PyQuery(this).text() == 'Hi')
[<p.hello>]
PyQuery.
find
(selector)-
Find elements using selector traversing down from self:
>>> m = '<p><span><em>Whoah!</em></span></p><p><em> there</em></p>'
>>> d = PyQuery(m)
>>> d('p').find('em')
[<em>, <em>]
>>> d('p').eq(1).find('em')
[<em>]
PyQuery.
hasClass
(name)-
Alias for
has_class()
PyQuery.
has_class
(name)-
Return True if element has class:
>>> d = PyQuery('<div class="myclass"></div>')
>>> d.has_class('myclass')
True
>>> d.hasClass('myclass')
True
PyQuery.
height
(value=<NoDefault>)-
set/get height of element
PyQuery.
hide
()-
add display:none to the elements' style
>>> print(PyQuery('<div style="display:none;"/>').hide())
<div style="display: none"/>
PyQuery.
html
(value=<NoDefault>, **kwargs)-
Get or set the html representation of sub nodes.
Get the html value:
>>> d = PyQuery('<div><span>toto</span></div>')
>>> print(d.html())
<span>toto</span>
Extra args are passed to lxml.etree.tostring:
>>> d = PyQuery('<div><span></span></div>')
>>> print(d.html())
<span/>
>>> print(d.html(method='html'))
<span></span>
Set the html value:
>>> d.html('<span>Youhou !</span>')
[<div>]
>>> print(d)
<div><span>Youhou !</span></div>
PyQuery.
insertAfter
(value)-
Alias for
insert_after()
PyQuery.
insertBefore
(value)-
Alias for
insert_before()
PyQuery.
insert_after
(value)-
insert nodes after value
PyQuery.
insert_before
(value)-
insert nodes before value
PyQuery.
is_
(selector)-
Returns True if selector matches at least one current element, else False:
>>> d = PyQuery('<p class="hello"><span>Hi</span></p><p>Bye</p>')
>>> d('p').eq(0).is_('.hello')
True
>>> d('p').eq(0).is_('span')
False
>>> d('p').eq(1).is_('.hello')
False
PyQuery.
items
(selector=None)-
Iter over elements. Return PyQuery objects:
>>> d = PyQuery('<div><span>foo</span><span>bar</span></div>')
>>> [i.text() for i in d.items('span')]
['foo', 'bar']
>>> [i.text() for i in d('span').items()]
['foo', 'bar']
>>> list(d.items('a')) == list(d('a').items())
True
PyQuery.
make_links_absolute
(base_url=None)-
Make all links absolute.
PyQuery.
map
(func)-
Returns a new PyQuery after transforming current items with func.
func should take two arguments - ‘index’ and ‘element’. Elements can also be referred to as ‘this’ inside of func:
>>> d = PyQuery('<p class="hello">Hi there</p><p>Bye</p><br />')
>>> d('p').map(lambda i, e: PyQuery(e).text())
['Hi there', 'Bye']
>>> d('p').map(lambda i, e: len(PyQuery(this).text()))
[8, 3]
>>> d('p').map(lambda i, e: PyQuery(this).text().split())
['Hi', 'there', 'Bye']
PyQuery.
nextAll
(selector=None)-
Alias for
next_all()
PyQuery.
next_all
(selector=None)-
>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('p:last').next_all()
[<img>]
>>> d('p:last').nextAll()
[<img>]
PyQuery.
not_
(selector)-
Return elements that don’t match the given selector:
>>> d = PyQuery('<p class="hello">Hi</p><p>Bye</p><div></div>')
>>> d('p').not_('.hello')
[<p>]
PyQuery.
outerHtml
()-
Alias for
outer_html()
PyQuery.
outer_html
()-
Get the html representation of the first selected element:
>>> d = PyQuery('<div><span class="red">toto</span> rocks</div>')
>>> print(d('span'))
<span class="red">toto</span> rocks
>>> print(d('span').outer_html())
<span class="red">toto</span>
>>> print(d('span').outerHtml())
<span class="red">toto</span>
>>> S = PyQuery('<p>Only <b>me</b> & myself</p>')
>>> print(S('b').outer_html())
<b>me</b>
PyQuery.
parents
(selector=None)-
>>> d = PyQuery('<span><p class="hello">Hi</p><p>Bye</p></span>')
>>> d('p').parents()
[<span>]
>>> d('.hello').parents('span')
[<span>]
>>> d('.hello').parents('p')
[]
PyQuery.
prepend
(value)-
prepend value to nodes
PyQuery.
prependTo
(value)-
Alias for
prepend_to()
PyQuery.
prepend_to
(value)-
prepend nodes to value
PyQuery.
prevAll
(selector=None)-
Alias for
prev_all()
PyQuery.
prev_all
(selector=None)-
>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('p:last').prev_all()
[<p.hello>]
>>> d('p:last').prevAll()
[<p.hello>]
PyQuery.
remove
(expr=<NoDefault>)-
Remove nodes:
>>> h = '<div>Maybe <em>she</em> does <strong>NOT</strong> know</div>'
>>> d = PyQuery(h)
>>> d('strong').remove()
[<strong>]
>>> print(d)
<div>Maybe <em>she</em> does know</div>
PyQuery.
removeAttr
(name)-
Alias for
remove_attr()
PyQuery.
removeClass
(value)-
Alias for
remove_class()
PyQuery.
remove_attr
(name)-
Remove an attribute:
>>> d = PyQuery('<div id="myid"></div>')
>>> d.remove_attr('id')
[<div>]
>>> d.removeAttr('id')
[<div>]
PyQuery.
remove_class
(value)-
Remove a css class from elements:
>>> d = PyQuery('<div class="myclass"></div>')
>>> d.remove_class('myclass')
[<div>]
>>> d.removeClass('myclass')
[<div>]
PyQuery.
remove_namespaces
()-
Remove all namespaces:
>>> doc = PyQuery('<foo xmlns="http://example.com/foo"></foo>')
>>> doc
[<{http://example.com/foo}foo>]
>>> doc.remove_namespaces()
[<foo>]
PyQuery.
replaceAll
(expr)-
Alias for
replace_all()
PyQuery.
replaceWith
(value)-
Alias for
replace_with()
PyQuery.
replace_all
(expr)-
replace nodes by expr
PyQuery.
replace_with
(value)-
replace nodes by value:
>>> doc = PyQuery("<html><div /></html>")
>>> node = PyQuery("<span />")
>>> child = doc.find('div')
>>> child.replace_with(node)
[<div>]
>>> print(doc)
<html><span/></html>
PyQuery.
root
-
return the xml root element
PyQuery.
show
()-
add display:block to elements style
>>> print(PyQuery('<div />').show())
<div style="display: block"/>
PyQuery.
siblings
(selector=None)-
>>> h = '<span><p class="hello">Hi</p><p>Bye</p><img scr=""/></span>'
>>> d = PyQuery(h)
>>> d('.hello').siblings()
[<p>, <img>]
>>> d('.hello').siblings('img')
[<img>]
PyQuery.
text
(value=<NoDefault>)-
Get or set the text representation of sub nodes.
Get the text value:
>>> doc = PyQuery('<div><span>toto</span><span>tata</span></div>')
>>> print(doc.text())
toto tata
Set the text value:
>>> doc.text('Youhou !')
[<div>]
>>> print(doc)
<div>Youhou !</div>
PyQuery.
toggleClass
(value)-
Alias for
toggle_class()
PyQuery.
toggle_class
(value)-
Toggle a css class on elements:
>>> d = PyQuery('<div></div>')
>>> d.toggle_class('myclass')
[<div.myclass>]
>>> d.toggleClass('myclass')
[<div>]
PyQuery.
val
(value=<NoDefault>)-
Set the attribute value:
>>> d = PyQuery('<input />')
>>> d.val('Youhou')
[<input>]
Get the attribute value:
>>> d.val()
'Youhou'
PyQuery.
width
(value=<NoDefault>)-
set/get width of element
PyQuery.
wrap
(value)-
A string of HTML that will be created on the fly and wrapped around each target:
>>> d = PyQuery('<span>youhou</span>')
>>> d.wrap('<div></div>')
[<div>]
>>> print(d)
<div><span>youhou</span></div>
PyQuery.
wrapAll
(value)-
Alias for
wrap_all()
PyQuery.
wrap_all
(value)-
Wrap all the elements in the matched set into a single wrapper element:
>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')
>>> print(d('span').wrap_all('<div id="wrapper"></div>'))
<div id="wrapper"><span>Hey</span><span>you !</span></div>
>>> d = PyQuery('<div><span>Hey</span><span>you !</span></div>')
>>> print(d('span').wrapAll('<div id="wrapper"></div>'))
<div id="wrapper"><span>Hey</span><span>you !</span></div>
PyQuery.
xhtml_to_html
()-
Remove xhtml namespace:
>>> doc = PyQuery(
... '<html xmlns="http://www.w3.org/1999/xhtml"></html>')
>>> doc
[<{http://www.w3.org/1999/xhtml}html>]
>>> doc.xhtml_to_html()
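[<html>]
Note from project use: both str(PQ) and PQ.outer_html() turn some tags from <tag></tag> into <tag />. Some tags written in this self-closing form cannot be parsed correctly by browsers. Also, the '<!DOCTYPE html>' string is stripped automatically once the document is turned into a PQ object.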