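Using regular expressions to match a TTL (Turtle) document processed by rdf3x
1. Take the following fragment of a TTL document processed by rdf3x as an example:
Note that the continuation lines are indented with exactly two spaces.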
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.
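The fragment above can be read line by line: a line that starts with < opens a new subject, and a line that starts with two spaces continues the predicate list of the previous subject. A minimal sketch of that distinction follows. The class name TtlLineKind and its two helper methods are hypothetical names chosen for illustration, and the sketch assumes the two-space layout of this particular fragment, not Turtle syntax in general.

```java
class TtlLineKind {
    // A line with no indentation begins a new subject.
    static boolean startsSubject(String line) {
        return line.startsWith("<");
    }

    // A line indented by two spaces continues the previous subject.
    static boolean isContinuation(String line) {
        return line.startsWith("  <");
    }

    public static void main(String[] args) {
        String[] lines = {
            "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;",
            "  <http://www.w3.org/2000/01/rdf-schema#label> \"CHEMBL_BS_2659\";"
        };
        for (String line : lines) {
            String kind = startsSubject(line) ? "subject"
                    : isContinuation(line) ? "continuation"
                    : "other";
            System.out.println(kind);
        }
    }
}
```

The three patterns in the next section rely on exactly this difference between the two kinds of line.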
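2. The regular-expression code, written in Java
The active Pattern line and the two commented-out Pattern lines below it produce the three different results needed; enable one of them at a time.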
package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class rdfReader3 {
    public static void main(String[] args) {
        // Compile the pattern once, outside the read loop. Group 1 is the text between < and >.
        Pattern p = Pattern.compile("<([^<>]*)>"); // (1) everything inside angle brackets
        // Pattern p = Pattern.compile("^<([^<>]*)>"); // (2) each subject: the line starts with '<' (no indentation); readLine() strips newlines, so no \n is needed
        // Pattern p = Pattern.compile("  <([^<>]*)>"); // (3) two spaces, then <...> whose content contains no angle brackets
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(
                new FileReader("C:/Users/Don/workspace/Jena/src/com/jena/bindingsite"))) {
            String s;
            while ((s = br.readLine()) != null) {
                Matcher m = p.matcher(s);
                while (m.find()) {
                    System.out.println(m.group(1));
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
(1) Match everything inside angle brackets
Output:
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
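Pattern (1) only captures IRIs, so quoted literals such as "30S ribosomal protein S1" do not appear in the output above. If the literal values are wanted as well, one possible extension is an alternation with a second capture group. This is only a sketch: the class TtlTerms and the method terms are hypothetical names, and the pattern assumes that literals contain no escaped double quotes, which holds for this fragment but not for Turtle in general.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TtlTerms {
    // Group 1: an IRI between < and >. Group 2: the text between double quotes.
    // Assumption: no escaped quotes inside literals.
    private static final Pattern TERM = Pattern.compile("<([^<>]*)>|\"([^\"]*)\"");

    // Returns the IRIs and literal values of one line, in order of appearance.
    static List<String> terms(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TERM.matcher(line);
        while (m.find()) {
            // Exactly one of the two groups is set for each match.
            out.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> \"30S ribosomal protein S1\".";
        for (String term : terms(line)) {
            System.out.println(term);
        }
    }
}
```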
(2) Match each subject, that is, the content of the first pair of angle brackets on every line that does not start with two spaces
Output:
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
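The subject extraction can also be tried without a file on disk. The sketch below anchors the pattern with ^ so that it matches only when < is the first character of the line. The class TtlSubjects and the method subjects are hypothetical names, and the input is a list of lines already held in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TtlSubjects {
    // ^ anchors the match at the start of the line, so indented lines never match.
    private static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>");

    // Returns the subject IRI of every unindented line, in input order.
    static List<String> subjects(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = SUBJECT.matcher(line);
            if (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;",
            "  <http://www.w3.org/2000/01/rdf-schema#label> \"CHEMBL_BS_2624\";",
            "<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.");
        for (String s : subjects(lines)) {
            System.out.println(s);
        }
    }
}
```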
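(3) Match two leading spaces followed by <...>, capturing the content between the angle brackets (any characters except angle brackets)
Output: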
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
To match the data that begins with two spaces, simply type the two spaces at the start of the pattern:
Pattern p = Pattern.compile("  <([^<>]*)>");
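Without an anchor, the two-space pattern also matches wherever two consecutive spaces happen to occur in the middle of a line. That does not happen in the fragment above, but a slightly stricter variant adds ^ so that only the indentation at the start of a line counts. The sketch below is one way to do that; the class TtlIndentedPredicates and the method predicateOf are hypothetical names, and the line with a double space in the middle is made up for the comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TtlIndentedPredicates {
    // ^ followed by exactly two spaces: only indented continuation lines match.
    private static final Pattern INDENTED = Pattern.compile("^  <([^<>]*)>");

    // Returns the predicate IRI of an indented line, or null for any other line.
    static String predicateOf(String line) {
        Matcher m = INDENTED.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // A continuation line taken from the fragment.
        System.out.println(predicateOf(
            "  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> \"CHEMBL_BS_2623\";"));
        // Hypothetical line (not in the fragment) with a double space in the middle:
        // the anchored pattern ignores it, the unanchored one would not.
        System.out.println(predicateOf(
            "<http://example.org/s>  <http://example.org/p> <http://example.org/o>."));
    }
}
```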