16 Finding a Protein Motif
Problem
To allow for the presence of its varying forms, a protein motif is represented by a shorthand as follows: [XY] means "either X or Y" and {X} means "any amino acid except X." For example, the N-glycosylation motif is written as N{P}[ST]{P}.
You can see the complete description and features of a particular protein by its access ID "uniprot_id" in the UniProt database, by inserting the ID number into
http://www.uniprot.org/uniprot/uniprot_id
Alternatively, you can obtain a protein sequence in FASTA format by following
http://www.uniprot.org/uniprot/uniprot_id.fasta
For example, the data for protein B5ZC00 can be found at http://www.uniprot.org/uniprot/B5ZC00.
Given: At most 15 UniProt Protein Database access IDs.
Return: For each protein possessing the N-glycosylation motif, output its given access ID followed by a list of locations in the protein string where the motif can be found.
Sample Dataset
A2Z669
B5ZC00
P07204_TRBM_HUMAN
P20840_SAG1_YEAST
Sample Output
B5ZC00
85 118 142 306 395
P07204_TRBM_HUMAN
47 115 116 382 409
P20840_SAG1_YEAST
79 109 135 248 306 348 364 402 485 501 614
#coding=utf-8
import urllib2
import re
list = ['A2Z669','B5ZC00','P07204_TRBM_HUMAN','P20840_SAG1_YEAST'] for one in list:
name = one.strip('\n')
url = 'http://www.uniprot.org/uniprot/'+name+'.fasta'
req = urllib2.Request(url)
response = urllib2.urlopen(req)
the_page = response.read()
start = the_page.find('\nM')
seq = the_page[start+1:].replace('\n','')
seq = ' '+seq
regex = re.compile(r'N(?=[^P][ST][^P])')
index = 0
out = []
'''
out = [m.start() for m in re.finditer(regex, seq)]
''' index = 0
while(index<len(seq)):
index += 1 if re.search(regex,seq[index:]) == None:
break #print S[index:]
if re.match(regex,seq[index:]) != None:
out.append(index) if out != []:
print name
print ' '.join([ str(i) for i in out])
16 Finding a Protein Motif的更多相关文章
- 14 Finding a Shared Motif
Problem A common substring of a collection of strings is a substring of every member of the collecti ...
- 09 Finding a Motif in DNA
Problem Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous coll ...
- ERROR: openstack Error finding address for http://10.16.37.215:9292/v1/images: [Errno 32] Broken pipe
Try to set: no_proxy=10.16.37.215 this should help 转自: http://askubuntu.com/questions/575938/error-i ...
- DNA motif 搜索算法总结
DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...
- P6 EPPM Installation and Configuration Guide 16 R1 April 2016
P6 EPPM Installation and Configuration Guide 16 R1 April 2016 Contents About Installing and ...
- P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1
P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1 May ...
- LOJ Finding LCM(math)
1215 - Finding LCM Time Limit: 2 second(s) Memory Limit: 32 MB LCM is an abbreviation used for Least ...
- 越狱Season 1- Episode 16
Season 1, Episode 16 -Burrows:Don't be. It's not your fault. 不要,不是你的错 -Fernando: Know what I like? 知 ...
- 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)
Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...
随机推荐
- postgres 使用存储过程批量插入数据
參考资料(pl/pgsql 官方文档): http://www.postgresql.org/docs/9.3/static/plpgsql.html create or replace functi ...
- Cocos2d-x调用Java 代码
Java代码: package com.dishu; import com.dishu.org.R; import android.app.Activity; import android.app.A ...
- macdown在mac OS 中的配置
macdown 用命令行打开.md文件 执行两条命令即可. sudo echo "open -a MacDown \$*" > /usr/local/bin/macdown ...
- 为eclipse安装python、shell开发环境和SVN插件
http://www.crazyant.net/1185.html 为eclipse安装python.shell开发环境和SVN插件 2013/08/27 by Crazyant 暂无评论 eclip ...
- Erlang process structure -- refc binary
Erlang 的process 是虚拟机层面的进程,每个Erlang process 都包括一个 pcb(process control block), 一个stack 以及私有heap . 这部分的 ...
- 1021 docker prometheus监控体系
jmeter plugin监控的信息很少,只有cpu.内存.网络IO,但这些是不够的.例如对于分析mysql数据库的慢查询.最大连接数等更加细密度的信息. 服务端稳定测试的三个前提: 1.应用级别的自 ...
- JS截取字符串常用方法详细整理&&MYSQL
截取字符串的使用比较广泛,有很多中方法,本文粗略的整理了一些,感兴趣的额朋友可以才参考下 使用 substring()或者slice() 函数:split() 功能:使用一个指定的分隔符把一个字符串分 ...
- 基于DB的编程
现在我们大多数的开发都是基于数据库,虽然数据库为我们提供了事务机制,保证存储的数据的ACID,但是,当我们在完成一个业务操作时,涉及到对数据库的大量操作,如果把这些操作在一个事务中,肯定是安全的,但是 ...
- uva-10815-字符串排序
又偷懒了,字符串排序,贱贱的用了std:map #include <iostream> #include <sstream> #include<algorithm> ...
- js中的event
event代表事件的状态,例如触发event对象的元素.鼠标的位置及状态.按下的键等等.event对象只在事件发生的过程中才有效.event的某些属性只对特定的事件有意义.比如,fromElement ...