09 Finding a Motif in DNA
Problem
Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous collection of symbols in ss (as a result, tt must be no longer than ss).
The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position ii of ss is denoted by s[i]s[i].
A substring of ss can be represented as s[j:k]s[j:k], where jj and kk represent the starting and ending positions of the substring in ss; for example, if ss = "AUGCUUCAGAAAGGUCUUACG", then s[2:5]s[2:5] = "UGCU".
The location of a substring s[j:k]s[j:k] is its beginning position jj; note that tt will have multiple locations in ss if it occurs more than once as a substring of ss (see the Sample below).
Given: Two DNA strings ss and tt (each of length at most 1 kbp).
Return: All locations of tt as a substring of ss.
Sample Dataset
GATATATGCATATACTT
ATAT
Sample Output
2 4 10
#-*-coding:UTF-8-*-
### 9. Finding a Motif in DNA ### # Method 1: Use Module regex.finditer
import regex
# 比re更强大的模块 matches = regex.finditer('ATAT', 'GATATATGCATATACTT', overlapped=True)
# 返回所有匹配项,
for match in matches:
print (match.start() + 1) # Method 2: Brute Force Search
seq = 'GATATATGCATATACTT'
pattern = 'ATAT' def find_motif(seq, pattern):
position = []
for i in range(len(seq) - len(pattern)):
if seq[i:i + len(pattern)] == pattern:
position.append(str(i + 1)) print ('\t'.join(position)) find_motif(seq, pattern) # methond 3
import re
seq='GATATATGCATATACTT'
print [i.start()+1 for i in re.finditer('(?=ATAT)',seq)]
# ?= 之后字符串内容需要匹配表达式才能成功匹配。
09 Finding a Motif in DNA的更多相关文章
- Hash function
Hash function From Wikipedia, the free encyclopedia A hash function that maps names to integers fr ...
- Windows7WithSP1/TeamFoundationServer2012update4/SQLServer2012
[Info @09:03:33.737] ====================================================================[Info @ ...
- DNA motif 搜索算法总结
DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...
- 14 Finding a Shared Motif
Problem A common substring of a collection of strings is a substring of every member of the collecti ...
- DNA binding motif比对算法
DNA binding motif比对算法 2012-08-31 ~ ADMIN 之前介绍了序列比对的一些算法.本节主要讲述motif(有人翻译成结构模式,但本文一律使用基模)的比对算法. 那么什么是 ...
- 16 Finding a Protein Motif
Problem To allow for the presence of its varying forms, a protein motif is represented by a shorthan ...
- hdu 1560 DNA sequence(搜索)
http://acm.hdu.edu.cn/showproblem.php?pid=1560 DNA sequence Time Limit: 15000/5000 MS (Java/Others) ...
- 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)
Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...
- hdu 1560 DNA sequence(迭代加深搜索)
DNA sequence Time Limit : 15000/5000ms (Java/Other) Memory Limit : 32768/32768K (Java/Other) Total ...
随机推荐
- fackbook flow 简单使用
flow 是一个javascript 静态检查的工具,由facebook 开发, 使用起来简单,方便. 安装 项目初始化 yarn init -y 编译器安装 yarn add --dev babel ...
- 【C++11】新特性 之 auto的使用
C++11中引入的auto主要有两种用途:自己主动类型判断和返回值占位.auto在C++98中的标识暂时变量的语义,因为使用极少且多余.在C++11中已被删除.前后两个标准的auto,全然是两个概 ...
- eclipse 3.7 中英文自由切换
最近在学习Java的开发,然后又很多的资料是对于的英文环境讲解,有的资料是对应的中文环境讲解,所以很都对不上号,郁闷啊....... 而且开发的时候,每个人都使用习惯也不相同:有的人喜欢英文界面,有的 ...
- 首页大屏广告效果 jquery轮播图淡入淡出
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...
- linux下nginx安装、配置实战
1什么是Nginx Nginx("enginex")是一个高性能的HTTP和反向代理服务器,也是一个IMAP/POP3/SMTP代理服务器,在高连接并发的情况下Nginx是Apac ...
- 把XML保存为ANSI编码
XmlDocument xmlDoc = new XmlDocument(); xmlDoc.LoadXml(xmlText); //plu.xml 编码是ANSI的.否则称上品名是乱码 XmlEle ...
- Think in java.chm 第14章 多线程
例子1引入线程概念通过得到当前线程方式循环主线程做某事 例子2演示了在主线程之外开启多个线程的基本方式 ( new一个extends Thread ) 例子3 ( task extends Threa ...
- word文档批量合并工具
#NoEnv ; Recommended for performance and compatibility with future AutoHotkey releases. ; #Warn ; En ...
- [转][html]大文件下载
上面代码来自微软,用于下载大文件. 下面代码来自 http://www.cnblogs.com/smile-wei/p/4159213.html System.IO.Stream iStream = ...
- postman-3断言
用例3A原则: arrange:初始化测试对象 act:操作.接口测试中通过不同的参数调用接口. assert:断言 snippets提供了一些断言的使用方法. 在新版的postman中,test中方 ...