16 Finding a Protein Motif
Problem
To allow for the presence of its varying forms, a protein motif is represented by a shorthand as follows: [XY] means "either X or Y" and {X} means "any amino acid except X." For example, the N-glycosylation motif is written as N{P}[ST]{P}.
You can see the complete description and features of a particular protein by its access ID "uniprot_id" in the UniProt database, by inserting the ID number into
http://www.uniprot.org/uniprot/uniprot_id
Alternatively, you can obtain a protein sequence in FASTA format by following
http://www.uniprot.org/uniprot/uniprot_id.fasta
For example, the data for protein B5ZC00 can be found at http://www.uniprot.org/uniprot/B5ZC00.
Given: At most 15 UniProt Protein Database access IDs.
Return: For each protein possessing the N-glycosylation motif, output its given access ID followed by a list of locations in the protein string where the motif can be found.
Sample Dataset
A2Z669
B5ZC00
P07204_TRBM_HUMAN
P20840_SAG1_YEAST
Sample Output
B5ZC00
85 118 142 306 395
P07204_TRBM_HUMAN
47 115 116 382 409
P20840_SAG1_YEAST
79 109 135 248 306 348 364 402 485 501 614
#coding=utf-8
import urllib2
import re
list = ['A2Z669','B5ZC00','P07204_TRBM_HUMAN','P20840_SAG1_YEAST'] for one in list:
name = one.strip('\n')
url = 'http://www.uniprot.org/uniprot/'+name+'.fasta'
req = urllib2.Request(url)
response = urllib2.urlopen(req)
the_page = response.read()
start = the_page.find('\nM')
seq = the_page[start+1:].replace('\n','')
seq = ' '+seq
regex = re.compile(r'N(?=[^P][ST][^P])')
index = 0
out = []
'''
out = [m.start() for m in re.finditer(regex, seq)]
''' index = 0
while(index<len(seq)):
index += 1 if re.search(regex,seq[index:]) == None:
break #print S[index:]
if re.match(regex,seq[index:]) != None:
out.append(index) if out != []:
print name
print ' '.join([ str(i) for i in out])
16 Finding a Protein Motif的更多相关文章
- 14 Finding a Shared Motif
Problem A common substring of a collection of strings is a substring of every member of the collecti ...
- 09 Finding a Motif in DNA
Problem Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous coll ...
- ERROR: openstack Error finding address for http://10.16.37.215:9292/v1/images: [Errno 32] Broken pipe
Try to set: no_proxy=10.16.37.215 this should help 转自: http://askubuntu.com/questions/575938/error-i ...
- DNA motif 搜索算法总结
DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...
- P6 EPPM Installation and Configuration Guide 16 R1 April 2016
P6 EPPM Installation and Configuration Guide 16 R1 April 2016 Contents About Installing and ...
- P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1
P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1 May ...
- LOJ Finding LCM(math)
1215 - Finding LCM Time Limit: 2 second(s) Memory Limit: 32 MB LCM is an abbreviation used for Least ...
- 越狱Season 1- Episode 16
Season 1, Episode 16 -Burrows:Don't be. It's not your fault. 不要,不是你的错 -Fernando: Know what I like? 知 ...
- 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)
Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...
随机推荐
- 如何查看MySql的BLOB内容
一款Mysql的工具: SQLyog. 强项在于可以把blob的内容直接显示出来. 我觉得其实做产品能够活挺厉害,因为你做的东西确实为客户提供价值:在云云产品之中,能够让客户发现你并使用,购买你的产品 ...
- 【python】smtp邮件发送
纯文本: #!/usr/bin/env python3 #coding: utf-8 import smtplib from email.mime.text import MIMEText from ...
- flexible.js框架改写
前一阶段拜读了阿里团队的flexible.js,但是flexible的封装感觉还是不完美,因为flexible还是要依赖less/sass之类的编译执行,所以就存了一些问题,我把这些问题进行整理. 优 ...
- elasticsearch RESTful
一 .索引(index) 1. 创建索引 (1)第一种方式 PUT twitter { "settings" : { "index" : { "num ...
- Java 将字符串转换为字符数组 toCharArray()
Java 手册 toCharArray public char[] toCharArray() 将此字符串转换为一个新的字符数组. 返回: 一个新分配的字符数组,它的长度是此字符串的长度,它的内容被初 ...
- 汇编_指令_INC
加1指令 INC指令功能:目标操作数+1 INC指令只有1个操作数,它将指定的操作数的内容加1,再将结果送回到该操作数.INC指令将影响SF,AF,ZF,PF,OF标志位,但是不影响CF标志位. IN ...
- 将各种格式的数据转换成XML
public class DataToXml { /// <summary> /// 将DataTable对象转换成XML字符串 ...
- python 构造mysql爆破器
前言: 今天已经期末考完,睡了个觉起床写了个 mysql爆破器. 思路: 1.爆破用户->用户存在的话不会报错反之报错 2.爆破密码->密码正确不会报错反之报错 3.用户名和密码一起爆破- ...
- oracle9i-11.2安装包及补丁包下载链接
ORACLE 9i Oracle9i Database Release 2 Enterprise/Standard/Personal Edition for Windows NT/2000/XPhtt ...
- ES6系列_12之map数据结构
1.map数据结构出现的原因? JavaScript 的对象(Object),本质上是键值对的集合(Hash 结构),但是传统上只能用字符串当作键.这给它的使用带来了很大的限制.为了能实现将对象作为键 ...