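Using regular expressions to match a TTL document processed by rdf3x
1. For example, take the following fragment of a TTL document produced by rdf3x:
Note that the continuation lines (those without a subject) are indented with exactly two spaces.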
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.
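2. The regular expression code, written in Java
The active Pattern line and the two commented-out Pattern lines below it produce the three different results shown further down. Enable one of them at a time.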
package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class rdfReader3 {
    public static void main(String[] args) {
        // The pattern does not change per line, so compile it once before the loop.
        Pattern p = Pattern.compile("<([^<>]*)>"); // (1) everything inside any pair of angle brackets
        // (2) each subject: '^' anchors the match at the start of the line, so only
        //     unindented lines match (readLine() strips line breaks, so \n* matches nothing here)
        // Pattern p = Pattern.compile("^\n*<([^<>]*)>");
        // (3) two literal spaces, then the bracket-free contents of <...>
        // Pattern p = Pattern.compile("  <([^<>]*)>");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new FileReader("C:/Users/Don/workspace/Jena/src/com/jena/bindingsite"))) {
            String s;
            while ((s = br.readLine()) != null) {
                Matcher m = p.matcher(s);
                while (m.find()) {
                    System.out.println(m.group(1)); // group 1 is the text between < and >
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
(1) Match everything inside angle brackets
Output:
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
(2) Match each subject, i.e. the contents of the first pair of angle brackets on every line that does not start with two spaces
Output:
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
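The subject extraction can also be tried without the input file. The following is a minimal sketch, assuming the indentation convention described earlier (continuation lines start with two spaces). The class name SubjectDemo, the method subjects, and the shortened example.org IRIs are made up for illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubjectDemo {
    // '^' anchors the match at the start of the line, so indented
    // continuation lines (which begin with spaces) never match.
    private static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>");

    // Returns the subject IRI of every line that starts with '<'.
    public static List<String> subjects(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = SUBJECT.matcher(line);
            if (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, shaped like the fragment in section 1.
        List<String> lines = List.of(
            "<http://example.org/s1> <http://example.org/p1> <http://example.org/o1>;",
            "  <http://example.org/p2> \"literal\".",
            "<http://example.org/s2> <http://example.org/p1> <http://example.org/o2>.");
        System.out.println(subjects(lines));
        // prints [http://example.org/s1, http://example.org/s2]
    }
}
```

Because the pattern is anchored, at most one match per line is possible, so a single find() call is enough here.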
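(3) Match the contents of the angle brackets that follow two leading spaces, where the contents themselves contain no angle brackets
Output: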
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
To match the lines that begin with two spaces, simply type two literal spaces at the start of the pattern:
Pattern p = Pattern.compile("  <([^<>]*)>");
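The two-space pattern works on this data because every other '<' on a line is preceded by a single space. A hedged variant is to anchor it with '^', which makes it robust if a file ever contains two spaces in the middle of a line. The sketch below compares the two forms; the class PredicateDemo, its members, and the example.org sample lines (including the deliberately irregular double space on the first line) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredicateDemo {
    // The pattern from the article: two spaces anywhere in the line.
    public static final Pattern UNANCHORED = Pattern.compile("  <([^<>]*)>");
    // Variant: the two spaces must be at the very start of the line.
    public static final Pattern ANCHORED = Pattern.compile("^  <([^<>]*)>");

    // Collects group 1 of every match of p in every line.
    public static List<String> capture(Pattern p, List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            // Hypothetical line with two spaces before the object.
            "<http://example.org/s1> <http://example.org/p1>  <http://example.org/o1>;",
            "  <http://example.org/p2> <http://example.org/o2>.");
        System.out.println(capture(UNANCHORED, lines));
        // prints [http://example.org/o1, http://example.org/p2]
        System.out.println(capture(ANCHORED, lines));
        // prints [http://example.org/p2]
    }
}
```

On the fragment in section 1 both forms give the same twelve predicates, so the anchor only matters for input with irregular spacing.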