1、比如下面这个用rdf3x处理过后的TTL文档片段:

注意缩进的是两个空格

<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
<http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
<http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
<http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
<http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
<http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
<http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
<http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
<http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
<http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
<http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
<http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
<http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.

2、Java编写的正则表达式代码

代码里注释的部分和上面那行是输出三种所需的不同结果

package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern; public class rdfReader3 {
static String url=""; public static void main(String[] args) {
FileReader fr=null;
BufferedReader br=null;
try{
fr=new FileReader("C:/Users/Don/workspace/Jena/src/com/jena/bindingsite");
br=new BufferedReader(fr);
String s=" ";
StringBuffer str=new StringBuffer();
while((s=br.readLine())!=null){
Pattern p= Pattern.compile("<([^<>]*)>"); //匹配所有尖括号里的内容
// Pattern p= Pattern.compile("^\n*<([^<>]*)>"); //匹配每一个主语,开头匹配“除了空格所有字符”,后面匹配"<>里的所有内容,内容为非尖括号"
// Pattern p= Pattern.compile(" <([^<>]*)>"); //匹配“两个空格开头”,后面匹配"<>里的所有内容,内容为非尖括号"
Matcher m=p.matcher(s); while(m.find()){
System.out.println(m.group(1));
}
} }catch(Exception e){
System.out.println(e.getMessage());
} } }

(1)匹配所有尖括号里的内容

运行结果

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624

(2)匹配每一个主语,即开头不是两个空格的那一行数据的第一对尖括号里的内容

运行结果

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022

(3)匹配“两个空格开头”,后面匹配"<>里的所有内容,内容为非尖括号"

http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName

匹配前面两个空格开始的数据时,在前面直接输入两个空格即可

  Pattern p= Pattern.compile("  <([^<>]*)>"); 

用正则表达式匹配用rdf3x处理过后的TTL格式文档的更多相关文章

  1. 【正则】精通JS正则表达式,没消化 信息量太大,好文

    http://www.jb51.net/article/25313.htm 正则表达式可以: •测试字符串的某个模式.例如,可以对一个输入字符串进行测试,看在该字符串是否存在一个电话号码模式或一个信用 ...

  2. [转载]正则表达式参考文档 - Regular Expression Syntax Reference.

    正则表达式参考文档 - Regular Expression Syntax Reference. [原创文章,转载请保留或注明出处:http://www.regexlab.com/zh/regref. ...

  3. java: (正则表达式,XML文档,DOM和DOM4J解析方法)

    常见的XML解析技术: 1.DOM(基于XML树结构,比较耗资源,适用于多次访问XML): 2.SAX(基于事件,消耗资源小,适用于数量较大的XML): 3.JDOM(比DOM更快,JDOM仅使用具体 ...

  4. 使用C#动态生成Word文档/Excel文档的程序测试通过后,部署到IIS服务器上,不能正常使用的问题解决方案

    使用C#动态生成Word文档/Excel文档的程序功能调试.测试通过后,部署到服务器上,不能正常使用的问题解决方案: 原因: 可能asp.net程序或iis访问excel组件时权限不够(Ps:Syst ...

  5. Java进阶(十九)利用正则表达式批处理含链接内容文档

    利用正则表达式批处理含链接内容文档 由于项目需求,自己需要将带有链接的标签去除,例如 <a href="/zhaoyao/17-66.html">头晕</a> ...

  6. 通过编写PHP代码并运用“正则表达式”来实现对试题文档进行去重复、排序

    通过编写PHP代码并运用“正则表达式”来实现对试题文档进行去重复.排序 <?php $subject = file_get_contents('test.txt'); $pattern = '/ ...

  7. 用正则表达式输出rdf文档的三元组格式数据

    占个位置 1.输出所有尖括号里的内容 package com.jena; import java.io.BufferedReader; import java.io.FileReader; impor ...

  8. 正则表达式实现将html文本转换为纯文本格式(将html字符串转换为纯文本方法)

    Regex regex = new Regex("<.+?>", RegexOptions.IgnoreCase); string strOutput = regex. ...

  9. 用正则表达式处理一个复杂字符串(类似json格式)

    #利用正则输出{}中的内容 str1="""var local=[{provinceCode:'310000',   cityCode:'310100',   text: ...

随机推荐

  1. C++中find()函数和rfind()函数的用法

    本文转载自http://blog.csdn.net/youxin2012/article/details/9162415 string中 find()的应用  (rfind() 类似,只是从反向查找) ...

  2. Where is HttpContent.ReadAsAsync?

    It looks like it is an extension method (in System.Net.Http.Formatting): HttpContentExtensions Class ...

  3. 【分词器及自定义】Elasticsearch中文分词器及自定义分词器

    中文分词器 在lunix下执行下列命令,可以看到本来应该按照中文”北京大学”来查询结果es将其分拆为”北”,”京”,”大”,”学”四个汉字,这显然不符合我的预期.这是因为Es默认的是英文分词器我需要为 ...

  4. 风景区的面积及道路状况分析问题 test

    参考文献:   https://wenku.baidu.com/view/b6aed86baf1ffc4ffe47ac92.html #include <bits/stdc++.h> us ...

  5. [异常记录-12]Web Deploy部署:未能连接到远程计算机,请确保在远程计算机上安装了 Web Deploy 并启动了所需的进程("Web Management Service")

    Web Deploy 安装 请参考:图文详解远程部署ASP.NET MVC 5项目 如此安装后还不行,  可以在卸载后重新安装 Web Deploy 时,不要选那个经典还是典型的安装选项,选自定义安装 ...

  6. Linux command line exercises for NGS data processing

    by Umer Zeeshan Ijaz The purpose of this tutorial is to introduce students to the frequently used to ...

  7. ListView事件的研究

    1. ListView的OnItemClickListener不被触发的另外一种情况 如上图,在一个ItemView中,只有一个TextView位于最左侧,他的右侧是空白区域,没有任何控件,当点击右侧 ...

  8. 12月14日 bs-grid , destroy_all()

    bootstrap Grid : The Bs grid system has four classes: xs (phones), sm (tablets), md (desktops), and ...

  9. codeforces 555c// Case of Chocolate// Codeforces Round #310(Div. 1)

    题意:直角边为n的网格巧克力,一格为一块,选择斜边上一点,从左或上吃,直到吃到空气,称为一次操作.给出几个操作,问各能吃几块.如果x是当前要吃的横坐标,在已经吃过的中找x1>=x的第一个x1,即 ...

  10. Confluence 6 完成你的任务

    很好,宇航员们,你已经令人钦佩的展示了你自己的.我们确定你新招募的员工已经对你了解的 Confluence 知识感到赞叹. 在这个指南中,我们已经完成了: 在主面板中对 Confluence 的功能进 ...