Question

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].

Solution -- Bit Manipulation

The straightforward idea is to store every 10-letter substring in a set. Time complexity is O(n) and space cost is O(n). Looking at the constant factor, though: a Java char is 2 bytes, so each 10-letter substring needs 20 bytes of character data, which is about 20n bytes overall (ignoring object overhead).
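For comparison, the set-of-substrings idea might look like the sketch below. The class and method names (RepeatedDnaBaseline, findRepeatedWithStrings) are made up for illustration and are not part of the original solution; a LinkedHashSet is assumed only so that the results come out in first-repeat order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline: stores every 10-letter window as a String.
class RepeatedDnaBaseline {
    static List<String> findRepeatedWithStrings(String s) {
        Set<String> seen = new HashSet<>();            // windows seen at least once
        Set<String> repeated = new LinkedHashSet<>();  // windows seen more than once
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the window was already present.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Every window is kept as a full String here, which is exactly the per-substring cost that the integer encoding avoids.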

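If we represent each DNA substring by an integer instead, every window costs one 4-byte int, so the space is cut down to about 4n bytes. This works because each of the four letters fits in 2 bits, so 10 letters fit in 20 bits.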
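As a small illustration of the encoding, the sketch below packs one 10-letter string into the low 20 bits of an int, using the mapping A=0, C=1, G=2, T=3. The class and helper names (DnaEncoding, encode, bitsFor) are hypothetical, and rejecting other characters with an exception is an assumption of this sketch.

```java
// Hypothetical helper class: encodes DNA letters as 2-bit codes.
class DnaEncoding {
    // Packs the letters left to right; the first letter ends up in the highest bits.
    static int encode(String letters) {
        int code = 0;
        for (int i = 0; i < letters.length(); i++) {
            code = (code << 2) | bitsFor(letters.charAt(i));
        }
        return code;
    }

    // Maps one nucleotide to its 2-bit code.
    static int bitsFor(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }
}
```

For instance, encode("AAAAACCCCC") gives 341 (binary 0101010101), and encode("TTTTTTTTTT") gives 1048575, which is (1 << 20) - 1, the largest 20-bit value.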

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<String>();
        int len = s.length();
        if (len < 10) {
            return result;
        }

        // Each of A, C, G, T fits in 2 bits.
        Map<Character, Integer> map = new HashMap<Character, Integer>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);

        Set<Integer> temp = new HashSet<Integer>();   // codes seen so far
        Set<Integer> added = new HashSet<Integer>();  // codes already in the result

        int hash = 0;
        for (int i = 0; i < len; i++) {
            // Shift left by 2 bits and append the code of the current letter.
            hash = (hash << 2) + map.get(s.charAt(i));
            if (i < 9) {
                continue; // the first full 10-letter window ends at index 9
            }
            // Keep only the low 20 bits (10 letters x 2 bits).
            hash = hash & ((1 << 20) - 1);
            if (temp.contains(hash) && !added.contains(hash)) {
                result.add(s.substring(i - 9, i + 1));
                added.add(hash); // track added
            } else {
                temp.add(hash);
            }
        }
        return result;
    }
