Problem

Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous collection of symbols in ss (as a result, tt must be no longer than ss).

The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position ii of ss is denoted by s[i]s[i].

A substring of ss can be represented as s[j:k]s[j:k], where jj and kk represent the starting and ending positions of the substring in ss; for example, if ss = "AUGCUUCAGAAAGGUCUUACG", then s[2:5]s[2:5] = "UGCU".

The location of a substring s[j:k]s[j:k] is its beginning position jj; note that tt will have multiple locations in ss if it occurs more than once as a substring of ss (see the Sample below).

Given: Two DNA strings ss and tt (each of length at most 1 kbp).

Return: All locations of tt as a substring of ss.

Sample Dataset

GATATATGCATATACTT
ATAT

Sample Output

2 4 10

#-*-coding:UTF-8-*-
### 9. Finding a Motif in DNA ### # Method 1: Use Module regex.finditer
import regex
# 比re更强大的模块 matches = regex.finditer('ATAT', 'GATATATGCATATACTT', overlapped=True)
# 返回所有匹配项,
for match in matches:
print (match.start() + 1) # Method 2: Brute Force Search
seq = 'GATATATGCATATACTT'
pattern = 'ATAT' def find_motif(seq, pattern):
position = []
for i in range(len(seq) - len(pattern)):
if seq[i:i + len(pattern)] == pattern:
position.append(str(i + 1)) print ('\t'.join(position)) find_motif(seq, pattern) # methond 3
import re
seq='GATATATGCATATACTT'
print [i.start()+1 for i in re.finditer('(?=ATAT)',seq)]
# ?= 之后字符串内容需要匹配表达式才能成功匹配。

  


09 Finding a Motif in DNA的更多相关文章

  1. Hash function

    Hash function From Wikipedia, the free encyclopedia   A hash function that maps names to integers fr ...

  2. Windows7WithSP1/TeamFoundationServer2012update4/SQLServer2012

    [Info   @09:03:33.737] ====================================================================[Info   @ ...

  3. DNA motif 搜索算法总结

    DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...

  4. 14 Finding a Shared Motif

    Problem A common substring of a collection of strings is a substring of every member of the collecti ...

  5. DNA binding motif比对算法

    DNA binding motif比对算法 2012-08-31 ~ ADMIN 之前介绍了序列比对的一些算法.本节主要讲述motif(有人翻译成结构模式,但本文一律使用基模)的比对算法. 那么什么是 ...

  6. 16 Finding a Protein Motif

    Problem To allow for the presence of its varying forms, a protein motif is represented by a shorthan ...

  7. hdu 1560 DNA sequence(搜索)

    http://acm.hdu.edu.cn/showproblem.php?pid=1560 DNA sequence Time Limit: 15000/5000 MS (Java/Others)  ...

  8. 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)

    Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...

  9. hdu 1560 DNA sequence(迭代加深搜索)

    DNA sequence Time Limit : 15000/5000ms (Java/Other)   Memory Limit : 32768/32768K (Java/Other) Total ...

随机推荐

  1. Android开发入门

    教我徒弟Android开发入门(一) 教我徒弟Android开发入门(二) 教我徒弟Android开发入门(三) 出处:http://www.cnblogs.com/kexing/tag/Androi ...

  2. C语言多线程pthread库相关函数说明

    线程相关操作说明 一 pthread_t pthread_t在头文件/usr/include/bits/pthreadtypes.h中定义: typedef unsigned long int pth ...

  3. win10 下ie11安装flash debuger (install flashplayer debuger on win10 64bit)

    1不能安装的现象 由于win10  ie11  内置flash  微软不让用户自己手动更新ie11的flash以及安装flash  debugger  ,这怕是让还在用 flex 开发的大胸弟们很头疼 ...

  4. 排列算法(reverse...rotate...next_permutation)

    #include <iostream> #include <algorithm> #include <cstring> using namespace std; i ...

  5. Windows 2008 关闭远程桌面的单用户多会话模式

    Windows 2008 关闭远程桌面的单用户多会话模式 在腾讯云上购买了一台云服务器. 因为设置了自动登录,在远程桌面连接后会启动一个新的会话,然后软件被运行了两次,端口被占用,无法起动. 还有可能 ...

  6. WebKit的已实施srcset图像响应属性

    WebKit已经发布了一些官方新闻,终于落实srcset的属性.作为W3C的响应图像社区组的主席,我一直希望这一刻到来有一段时间了.所以,对所有参与方是个好消息,用户浏览网页时的体验是最重要的. 所有 ...

  7. php的静态变量的实现

    1.静态变量的结构 php脚本编译之后会生成执行opcode组成的opcode_array,执行每个zend_op_array都会生成一个单独的zend_execute_data结构. php的局部变 ...

  8. SQL 知识及用法备忘录

    ---查询当前数据库一共有多少张表 ) from sysobjects where xtype='U' ---查询当前数据库有多少张视图 ) from sysobjects where xtype=' ...

  9. [html][javascript] Cookie

    更多可参考:http://www.cnblogs.com/newsouls/archive/2012/11/12/2766567.html // 读 cookie 方法 function getCoo ...

  10. 用shp制作geoJson格式地图数据(shp convert to geoJson)

    本文紧接前文,简单说明利用shp数据制作Echarts支持的geoJson格式的地图数据.本文以北京市通州区各镇的shp数据为例进行说明. 软件环境: ArcGIS 10.2 (ArcGIS 10.2 ...