Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
 
用位图算法可以减少内存,代码如下:
int map_exist[ *  / ];
int map_pattern[ * / ]; #define set(map,x) \
(map[x >> ] |= ( << (x & 0x1F))) #define test(map,x) \
(map[x >> ] & ( << (x & 0x1F))) int dnamap[]; char** findRepeatedDnaSequences(char* s, int* returnSize) {
*returnSize = ;
if (s == NULL) return NULL;
int len = strlen(s);
if (len <= ) return NULL; memset(map_exist, , sizeof(int)* ( * / ));
memset(map_pattern, , sizeof(int)* ( * / )); dnamap['A' - 'A'] = ; dnamap['C' - 'A'] = ;
dnamap['G' - 'A'] = ; dnamap['T' - 'A'] = ; char ** ret = malloc(sizeof(char*));
int curr = ;
int size = ;
int key;
int i = ; while (i < )
key = (key << ) | dnamap[s[i++] - 'A'];
while (i < len){
key = ((key << ) & 0xFFFFF) | dnamap[s[i++] - 'A'];
if (test(map_pattern, key)){
if (!test(map_exist, key)){
set(map_exist, key);
if (curr == size){
size *= ;
ret = realloc(ret, sizeof(char*)* size);
}
ret[curr] = malloc(sizeof(char)* );
memcpy(ret[curr], &s[i-], );
ret[curr][] = '\0';
++curr;
} }
else{
set(map_pattern, key);
}
} ret = realloc(ret, sizeof(char*)* curr);
*returnSize = curr;
return ret;
}

该算法用时 6ms 左右, 非常快

 

LeetCode-Repeated DNA Sequences (位图算法减少内存)的更多相关文章

  1. [LeetCode] Repeated DNA Sequences 求重复的DNA序列

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  2. [LeetCode] Repeated DNA Sequences hash map

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  3. [Leetcode] Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  4. LeetCode() Repeated DNA Sequences 看的非常的过瘾!

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  5. LeetCode 187. 重复的DNA序列(Repeated DNA Sequences)

    187. 重复的DNA序列 187. Repeated DNA Sequences 题目描述 All DNA is composed of a series of nucleotides abbrev ...

  6. lc面试准备:Repeated DNA Sequences

    1 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...

  7. [LeetCode] 187. Repeated DNA Sequences 求重复的DNA序列

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  8. Leetcode:Repeated DNA Sequences详细题解

    题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: " ...

  9. 【leetcode】Repeated DNA Sequences(middle)★

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

随机推荐

  1. ubuntu下sh文件使用

    可把shell命令批处理写进filename.sh文件 然后执行 chmod +x filename.sh 就可以执行./filename.sh了

  2. MySQL数据库索引的4大类型以及相关的索引创建

    以下的文章主要介绍的是MySQL数据库索引类型,其中包括普通索引,唯一索引,主键索引与主键索引,以及对这些索引的实际应用或是创建有一个详细介绍,以下就是文章的主要内容描述. (1)普通索引 这是最基本 ...

  3. 17.Python笔记之memcached&redis

    作者:刘耀 博客:www.liuyao.me 博客园:www.cnblogs.com/liu-yao 一.Memcached 1.介绍 Memcached 是一个高性能的分布式内存对象缓存系统,用于动 ...

  4. codeforces A. The Wall 解题报告

    题目链接:http://codeforces.com/problemset/problem/340/A 这道题目理解不难,就是在[a, b]区间内,找出同时能够被x和y整除的个数.第一次想当然的开了两 ...

  5. eclipse 优化提速

    1.windows–>perferences–>general–>startup and shutdown关掉没用的启动项: WTP :一个跟myeclipse差不多的东西,主要差别 ...

  6. jar包和war包的区别

    jar包和war包的区别: jar包就是别人已经写好的一些类,然后将这些类进行打包,你可以将这些jar包引入你的项目中,然后就可以直接使用这些jar包中的类和属性了,这些jar包一般都会放在lib目录 ...

  7. oracle 10g 学习之.NET使用Oracle数据库(14)

    因为使用System.Data.OracleClient会提示过时,推荐使用oracle自己提供的.net类库Oracle.DataAccess.Client 在oracle C:\oracle\pr ...

  8. 封装了get post方法

    function g($name, $defaultValue = "") { // php这里区分大小写,将两者都变为小写 $_GET = array_change_key_ca ...

  9. hdu 4267 多维树状数组

    题意:有一个序列 "1 a b k c" means adding c to each of Ai which satisfies a <= i <= b and (i ...

  10. 免费电子书:微软Azure基础之Azure Automation

    (此文章同时发表在本人微信公众号"dotNET每日精华文章") Azure Automation是Azure内置的一项自动化运维基础功能,微软为了让大家更快上手使用这项功能,特意推 ...