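1. Consider the following fragment of a TTL document that has been processed with rdf3x:

Note that the continuation lines (the ones that do not start with a subject) are indented with exactly two spaces.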

<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.

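2. The regular-expression code, written in Java

The active Pattern line and the two commented-out Pattern lines below it produce the three different outputs needed; enable one of them at a time.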

package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class rdfReader3 {
    public static void main(String[] args) {
        // Compile the pattern once, outside the read loop.
        // (1) Match the contents of every pair of angle brackets.
        Pattern p = Pattern.compile("<([^<>]*)>");
        // (2) Match each subject: a <...> at the very start of a line (no leading spaces).
        // Pattern p = Pattern.compile("^<([^<>]*)>");
        // (3) Match a <...> preceded by two spaces, i.e. the first IRI on an indented line.
        // Pattern p = Pattern.compile("  <([^<>]*)>");

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(
                new FileReader("C:/Users/Don/workspace/Jena/src/com/jena/bindingsite"))) {
            String s;
            while ((s = br.readLine()) != null) {
                Matcher m = p.matcher(s);
                while (m.find()) {
                    System.out.println(m.group(1)); // group 1 = the text between < and >
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
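The three patterns can also be tried without the input file. The following is a minimal sketch, not the original program: the class name `TtlPatternDemo`, the helper `extract`, and the `example.org` sample lines are all made up for illustration. The subject pattern is written as `^<([^<>]*)>`, which should behave the same on lines returned by `readLine()`, since those carry no newline characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TtlPatternDemo {
    // (1) every <...>, (2) a <...> at the start of a line, (3) a <...> preceded by two spaces
    static final Pattern ALL = Pattern.compile("<([^<>]*)>");
    static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>");
    static final Pattern INDENTED = Pattern.compile("  <([^<>]*)>");

    // Hypothetical sample: one subject line followed by one two-space continuation line.
    static final List<String> SAMPLE = Arrays.asList(
        "<http://example.org/s> <http://example.org/p1> <http://example.org/o1>;",
        "  <http://example.org/p2> \"a literal\".");

    // Collect capture group 1 of every match of p, line by line.
    static List<String> extract(Pattern p, List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("all:      " + extract(ALL, SAMPLE));
        System.out.println("subjects: " + extract(SUBJECT, SAMPLE));
        System.out.println("indented: " + extract(INDENTED, SAMPLE));
    }
}
```

On this sample, the first pattern yields all four IRIs, the second only `http://example.org/s`, and the third only `http://example.org/p2`.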

(1) Match the contents of every pair of angle brackets

Output:

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624

(2) Match every subject, i.e. the contents of the first pair of angle brackets on each line that does not start with two spaces

Output:

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022

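(3) Match lines that start with two spaces and capture the contents of the <...> that follows (any characters other than angle brackets), which gives the predicates on the continuation lines

Output: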

http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
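The list above no longer says which subject each predicate belongs to. One possible way to keep that association is to combine patterns (2) and (3) and remember the most recent subject. This is only a sketch under the assumption that every continuation line starts with exactly two spaces and directly follows its subject line; `PredicatesBySubject`, `group`, and the `example.org` IRIs are hypothetical names, and the predicate on the subject line itself is deliberately not collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredicatesBySubject {
    static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>");
    static final Pattern INDENTED = Pattern.compile("^  <([^<>]*)>");

    // Hypothetical sample in the same layout as the fragment in section 1.
    static final List<String> SAMPLE = Arrays.asList(
        "<http://example.org/s1> <http://example.org/type> <http://example.org/T>;",
        "  <http://example.org/label> \"one\";",
        "  <http://example.org/name> \"first\".",
        "<http://example.org/s2> <http://example.org/link> <http://example.org/s1>.");

    // Map each subject to the predicates found on its two-space continuation lines.
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>(); // keeps input order
        String current = null;
        for (String line : lines) {
            Matcher s = SUBJECT.matcher(line);
            if (s.find()) {
                current = s.group(1); // a new subject starts here
                result.computeIfAbsent(current, k -> new ArrayList<>());
                continue;
            }
            Matcher p = INDENTED.matcher(line);
            if (current != null && p.find()) {
                result.get(current).add(p.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        group(SAMPLE).forEach((subject, predicates) ->
            System.out.println(subject + " -> " + predicates));
    }
}
```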

To match lines that begin with two spaces, simply type two literal spaces at the start of the pattern:

  Pattern p= Pattern.compile("  <([^<>]*)>"); 
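The number of spaces matters. The sketch below compares a one-space pattern, the two-space pattern, and an anchored variant `^ {2}<([^<>]*)>` that states the "start of line" condition explicitly. The class name `IndentPatternCheck`, the helper `find`, and the `example.org` lines are assumptions made for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndentPatternCheck {
    // Hypothetical lines: a subject line with two objects, and a continuation line.
    static final String SUBJECT_LINE =
        "<http://example.org/s> <http://example.org/p> <http://example.org/o1> , <http://example.org/o2>.";
    static final String INDENTED_LINE = "  <http://example.org/p2> \"x\";";

    // Return capture group 1 of every match of regex in line.
    static List<String> find(String regex, String line) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(line);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // One space: also picks up the predicate and both objects on a subject line.
        System.out.println(find(" <([^<>]*)>", SUBJECT_LINE));
        // Two spaces: nothing on the subject line, because its IRIs are separated by single spaces.
        System.out.println(find("  <([^<>]*)>", SUBJECT_LINE));
        // Anchored variant: only an IRI that follows two spaces at the very start of the line.
        System.out.println(find("^ {2}<([^<>]*)>", INDENTED_LINE));
    }
}
```

The unanchored two-space pattern works on this data only because two consecutive spaces never occur in the middle of a line; the anchored form does not depend on that.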
