利用wikipedia 的API实现对其内容的查询
wikipedia提供了api可以供我们对其内容进行操作。其API文档地址为:
http://en.wikipedia.org/w/api.php
列举一些常见用法:
1、全文搜索
http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=fluoxetine
srsearch为要检索的内容
结果:
- <?xml version="1.0"?>
- <api>
- <query>
- <searchinfo totalhits="224" />
- <search>
- <p ns="0" title="Fluoxetine" snippet="<span class='searchmatch'>Fluoxetine</span> (also known by the tradenames Prozac, Sarafem) is an antidepressant of the selective serotonin reuptake inhibitor (SSRI) class <b>...</b> " size="53978" wordcount="7052" timestamp="2010-10-31T23:22:00Z" />
- <p ns="0" title="Olanzapine/fluoxetine" snippet="The drug combination olanzapine/<span class='searchmatch'>fluoxetine</span> (trade name Symbyax, created by Eli Lilly and Company ) is a single capsule containing the <b>...</b> " size="5703" wordcount="629" timestamp="2010-09-21T09:10:34Z" />
- <p ns="0" title="Sertraline" snippet="Evidence suggests that sertraline may work better than <span class='searchmatch'>fluoxetine</span> (Prozac) for some subtypes of depression. Sertraline is highly <b>...</b> " size="104510" wordcount="13933" timestamp="2010-10-28T22:13:04Z" />
- <p ns="0" title="Antidepressant" snippet="The first such compound to be patented was zimelidine in 1971, while the first released clinically was indalpine . <span class='searchmatch'>Fluoxetine</span> was <b>...</b> " size="128712" wordcount="17532" timestamp="2010-10-30T08:05:06Z" />
- <p ns="0" title="Selective serotonin reuptake inhibitor" snippet="four newer antidepressants (including the SSRIs paroxetine and <span class='searchmatch'>fluoxetine</span> , and two non-SSRI antidepressants nefazodone and venlafaxine ). <b>...</b> " size="78327" wordcount="10398" timestamp="2010-11-01T00:11:30Z" />
- <p ns="0" title="Paroxetine" snippet="Unlike two other popular SSRI antidepressants, <span class='searchmatch'>fluoxetine</span> and sertraline , paroxetine is associated with clinically significant weight <b>...</b> " size="48886" wordcount="6491" timestamp="2010-10-31T23:11:12Z" />
- <p ns="0" title="Venlafaxine" snippet="Its efficacy is similar to or better than sertraline (Zoloft) and <span class='searchmatch'>fluoxetine</span> (Prozac), depending on the criteria and rating scales used <b>...</b> " size="49655" wordcount="6574" timestamp="2010-11-01T00:38:00Z" />
- <p ns="0" title="Olanzapine" snippet="Olanzapine (trade names Zyprexa, Zalasta, Zolafren, Olzapin, Oferta, Zypadhera or in combination with <span class='searchmatch'>fluoxetine</span> Symbyax ) is an atypical <b>...</b> " size="34028" wordcount="4540" timestamp="2010-10-30T17:45:42Z" />
- <p ns="0" title="Prozac (disambiguation)" snippet="Prozac is a proprietary name for the antidepressant drug <span class='searchmatch'>fluoxetine</span>. Prozac may also refer to: Prozac+ , an Italian punk band <b>...</b> " size="581" wordcount="78" timestamp="2010-04-23T20:24:31Z" />
- <p ns="0" title="SSRI discontinuation syndrome" snippet="paroxetine having the highest number of withdrawal syndrome reports and <span class='searchmatch'>fluoxetine</span> the highest number of drug dependence reports; the note <b>...</b> " size="41099" wordcount="5444" timestamp="2010-09-23T06:19:55Z" />
- </search>
- </query>
- <query-continue>
- <search sroffset="10" />
- </query-continue>
- </api>
2、列举wikipedia 的 category:
http://en.wikipedia.org/w/api.php?action=query&list=allcategories&acprefix=drug&aclimit=10
返回10条以drug开头的category;
结果:
- <?xml version="1.0"?>
- <api>
- <query>
- <allcategories>
- <c xml:space="preserve">Drug-induced Suicide</c>
- <c xml:space="preserve">Drug-realted suicides</c>
- <c xml:space="preserve">Drug-related Films</c>
- <c xml:space="preserve">Drug-related Suicides</c>
- <c xml:space="preserve">Drug-related death in California</c>
- <c xml:space="preserve">Drug-related deaths</c>
- <c xml:space="preserve">Drug-related deaths by country</c>
- <c xml:space="preserve">Drug-related deaths in Alabama</c>
- <c xml:space="preserve">Drug-related deaths in Alaska</c>
- <c xml:space="preserve">Drug-related deaths in Arizona</c>
- </allcategories>
- </query>
- <query-continue>
- <allcategories acfrom="Drug-related deaths in Arkansas" />
- </query-continue>
- </api>
3、返回具有相应title页面的timestamp|user|comment|content 信息;
结果:
- <?xml version="1.0"?>
- <api>
- <query>
- <pages>
- <page pageid="27697087" ns="0" title="API">
- <revisions>
- <rev user="Graham87" timestamp="2010-06-13T08:41:17Z" comment="Protected API: restore protection ([edit=sysop] (indefinite) [move=sysop] (indefinite))" xml:space="preserve">#REDIRECT [[Application programming interface]]{{R from abbreviation}}</rev>
- </revisions>
- </page>
- </pages>
- </query>
- </api>
4、解析页面:
http://en.wikipedia.org/w/api.php?action=parse&format=xml&page=fluoxetine
用上面的查询返回的[content]是wikipedia的标记格式,这个api返回的是html格式的文本:
可以用xpath="api/parse/text" 返回html内容。
* action=parse *
This module parses wikitext and returns parser output
This module requires read rights.
Parameters:
title - Title of page the text belongs to
Default: API
text - Wikitext to parse
summary - Summary to parse
page - Parse the content of this page. Cannot be used together with text and title
redirects - If the page parameter is set to a redirect, resolve it
oldid - Parse the content of this revision. Overrides page
prop - Which pieces of information to get.
NOTE: Section tree is only generated if there are more than 4 sections, or if the __TOC__ keyword is present
Values (separate with '|'): text, langlinks, categories, links, templates, images, externallinks, sections, revid, displaytitle, headitems, headhtml
Default: text|langlinks|categories|links|templates|images|externallinks|sections|revid|displaytitle
pst - Do a pre-save transform on the input before parsing it.
Ignored if page or oldid is used.
onlypst - Do a PST on the input, but don't parse it.
Returns PSTed wikitext. Ignored if page or oldid is used.
Example:
api.php?action=parse&text={{Project:Sandbox}}
来源:http://john2007.iteye.com/blog/800446
利用wikipedia 的API实现对其内容的查询的更多相关文章
- 利用百度地图API实现地址和经纬度互换查询
import json import requests def baiduMap(input_para): headers = { 'User-Agent': 'Mozilla/5.0 (Window ...
- 利用百度词典API和Volley网络库开发的android词典应用
关于百度词典API的说明,地址在这里:百度词典API介绍 关于android网络库Volley的介绍说明,地址在这里:Android网络通信库Volley 首先我们看下大体的界面布局!
- 利用Google Speech API实现Speech To Text
很久很久以前, 网上流传着一个免费的,识别率暴高的,稳定的 Speech To Text API, 那就是Google Speech API. 但是最近再使用的时候,总是返回500 Error. 后来 ...
- 利用未公开API获取终端会话闲置时间(Idle Time)和登入时间(Logon Time)
利用未公开API获取终端会话闲置时间(Idle Time)和登入时间(Logon Time)作者:Tuuzed(土仔) 发表于:2008年3月3日23:12:38 版权声明:可以任意转载,转载时请 ...
- 【百度地图API】建立全国银行位置查询系统(五)——如何更改百度地图的信息窗口内容?
原文:[百度地图API]建立全国银行位置查询系统(五)--如何更改百度地图的信息窗口内容? 摘要: 酷讯.搜房.去哪儿网等大型房产.旅游酒店网站,用的是百度的数据库,却显示了自定义的信息窗口内容,这是 ...
- ASP.NET Core Web APi获取原始请求内容
前言 我们讲过ASP.NET Core Web APi路由绑定,本节我们来讲讲如何获取客户端请求过来的内容. ASP.NET Core Web APi捕获Request.Body内容 [HttpPos ...
- 利用WordPress REST API 开发微信小程序从入门到放弃
自从我发布并开源WordPress版微信小程序以来,很多WordPress网站的站长问有关程序开发的问题,其实在文章:<用微信小程序连接WordPress网站>讲述过一些基本的要点,不过仍 ...
- 利用百度翻译API,获取翻译结果
利用百度翻译API,获取翻译结果 translate.py #!/usr/bin/python #-*- coding:utf-8 -*- import sys reload(sys) sys.set ...
- 白话SpringCloud | 第十一章:路由网关(Zuul):利用swagger2聚合API文档
前言 通过之前的两篇文章,可以简单的搭建一个路由网关了.而我们知道,现在都奉行前后端分离开发,前后端开发的沟通成本就增加了,所以一般上我们都是通过swagger进行api文档生成的.现在由于使用了统一 ...
随机推荐
- 通过js调用android原生方法
有时候我们有这样一个需求,监听html中控件的一些事件.例如点击html中某个按钮,跳转到别的activity,复制某段文本. 首先是对webview的设置: myWebView = (WebView ...
- (转)android 蓝牙通信编程
转自:http://blog.csdn.net/pwei007/article/details/6015907 Android平台支持蓝牙网络协议栈,实现蓝牙设备之间数据的无线传输. 本文档描述了怎样 ...
- grep
http://www.cnblogs.com/ggjucheng/archive/2013/01/13/2856896.html
- 完美解决window.navigator.geolocation.getCurrentPosition,在IOS10系统中无法定位问题
目前由于许多用户都将电话升级到了iOS系统,苹果的iOS 10已经正式对外推送,相信很多用户已经更新到了最新的系统.然而,如果web站没有及时支持https协议的话,当很多用户在iOS 10下访问很多 ...
- 50. 树的子结构[subtree structure in tree]
[本文链接] http://www.cnblogs.com/hellogiser/p/subtree-structure-in-tree.html [题目] 输入两棵二叉树A和B,判断B是不是A的子结 ...
- Spring mvc中@RequestMapping 6个基本用法
Spring mvc中@RequestMapping 6个基本用法 spring mvc中的@RequestMapping的用法. 1)最基本的,方法级别上应用,例如: Java代码 @Reques ...
- AnyImgUpload
@{ ViewBag.Title = "ImgForAny"; Layout = null; } <h2>ImgForAny</h2> <script ...
- 关于 android 的setOnItemClickListener 和 setOnItemLongClickListener 同时触发的解决方法
关于 android 的setOnItemClickListener 和 setOnItemLongClickListener 同时触发的解决方法. 其实方法也是很简单 的主要 setOnItemLo ...
- Transaction (Process ID xxx) was deadlocked on lock
Transaction (Process ID 161) was deadlocked on lock | communication buffer resources with another pr ...
- js实现复选框全选
HTML代码如下: <div> <label><input type="checkbox" name="aAll">全选&l ...