Problem

To allow for the presence of its varying forms, a protein motif is represented by a shorthand as follows: [XY] means "either X or Y" and {X} means "any amino acid except X." For example, the N-glycosylation motif is written as N{P}[ST]{P}.

You can see the complete description and features of a particular protein by its access ID "uniprot_id" in the UniProt database, by inserting the ID number into

http://www.uniprot.org/uniprot/uniprot_id

Alternatively, you can obtain a protein sequence in FASTA format by following

http://www.uniprot.org/uniprot/uniprot_id.fasta

For example, the data for protein B5ZC00 can be found at http://www.uniprot.org/uniprot/B5ZC00.

Given: At most 15 UniProt Protein Database access IDs.

Return: For each protein possessing the N-glycosylation motif, output its given access ID followed by a list of locations in the protein string where the motif can be found.

Sample Dataset

A2Z669
B5ZC00
P07204_TRBM_HUMAN
P20840_SAG1_YEAST

Sample Output

B5ZC00
85 118 142 306 395
P07204_TRBM_HUMAN
47 115 116 382 409
P20840_SAG1_YEAST
79 109 135 248 306 348 364 402 485 501 614
#coding=utf-8
import urllib2
import re
list = ['A2Z669','B5ZC00','P07204_TRBM_HUMAN','P20840_SAG1_YEAST'] for one in list:
name = one.strip('\n')
url = 'http://www.uniprot.org/uniprot/'+name+'.fasta'
req = urllib2.Request(url)
response = urllib2.urlopen(req)
the_page = response.read()
start = the_page.find('\nM')
seq = the_page[start+1:].replace('\n','')
seq = ' '+seq
regex = re.compile(r'N(?=[^P][ST][^P])')
index = 0
out = []
'''
out = [m.start() for m in re.finditer(regex, seq)]
''' index = 0
while(index<len(seq)):
index += 1 if re.search(regex,seq[index:]) == None:
break #print S[index:]
if re.match(regex,seq[index:]) != None:
out.append(index) if out != []:
print name
print ' '.join([ str(i) for i in out])

  

16 Finding a Protein Motif的更多相关文章

  1. 14 Finding a Shared Motif

    Problem A common substring of a collection of strings is a substring of every member of the collecti ...

  2. 09 Finding a Motif in DNA

    Problem Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous coll ...

  3. ERROR: openstack Error finding address for http://10.16.37.215:9292/v1/images: [Errno 32] Broken pipe

    Try to set: no_proxy=10.16.37.215 this should help 转自: http://askubuntu.com/questions/575938/error-i ...

  4. DNA motif 搜索算法总结

    DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...

  5. P6 EPPM Installation and Configuration Guide 16 R1 April 2016

    P6 EPPM Installation and Configuration Guide 16 R1         April 2016 Contents About Installing and ...

  6. P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1

    P6 Professional Installation and Configuration Guide (Microsoft SQL Server Database) 16 R1       May ...

  7. LOJ Finding LCM(math)

    1215 - Finding LCM Time Limit: 2 second(s) Memory Limit: 32 MB LCM is an abbreviation used for Least ...

  8. 越狱Season 1- Episode 16

    Season 1, Episode 16 -Burrows:Don't be. It's not your fault. 不要,不是你的错 -Fernando: Know what I like? 知 ...

  9. 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)

    Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...

随机推荐

  1. Struts2自定义标签3模仿原有的s:if s:elseif s:else自定义自己的if elsif else

    第一步:webroot/web-inf下简历str.tld文件 <?xml version="1.0" encoding="UTF-8"?> < ...

  2. Linux中的中断处理

    1. Linux中中断除了中断分层之外,还有一种就是中断线程化 存在意义:在Linux中,中断具有最高的优先级.不论在任何时刻,只要产生中断事件,内核将立即执行相应的中断处理程序,等到所有挂起的中断和 ...

  3. MySql登陆密码忘记了 怎么办?

    MySql登陆密码忘记了 怎么办?root密码:连root密码忘记没用root进修改mysql数据库user表咯 root密码: 方法一:MySQL提供跳访问控制命令行参数通命令行命令启MySQL服务 ...

  4. 【c#】设置Socket连接、接收超时(转)

    用到Socket,发现如果连接错误,比如Connect的端口不对,会造成很长时间的延时,程序就僵在那里,效果很不好: 在网上找到很方便的设置办法,分享如下: Socket.SetSocketOptio ...

  5. 【python】 使用 setuptools

    不会安装python的egg文件,在网上搜索了一下,被“蟒蛇蛋”这个词雷到了,记录下. 随着对python的逐渐使用,发现一些python组件是用一个包管理器发布的,今天搞了快一个小时,终于搞定了,这 ...

  6. emacs之配置8,gdb调试设置

    emacsConfig/gdb-setting.el (global-set-key [(f5)] 'gud-go) (global-set-key [(f7)] 'gud-step) (global ...

  7. [转]Jsp 与 JavaBean

    JavaBean 是一个遵循特定写法的 Java 类,它有以下特点: 1. Java 类具有一个无参的构造函数 2. 属性必须私有化. 3. 私有化的属性通过 public 类型的方法暴露给其它程序, ...

  8. ROS使用国内的DDNS服务

    未测试.转载余松老师的作品 虽然RouterOS 加入了cloud功能,但最近在配置RB2011的时候发现不好使,更新域名后无法正确解析到我的IP地址,虽然在cloud的public address中 ...

  9. OpenCL 矢量存取

    ▶ 函数 vloadn 和 vstoren 来实现全局存储器和局部存储器之间的向量拷贝 ● 代码 #include <stdio.h> #include <stdlib.h> ...

  10. maven工程 ,通过maven更新后,jre恢复到1.5的解决方法

    在maven setting.xml profiles节点下加入 <profile> <id>jdk-1.8</id> <activation> < ...