Repeated DNA Sequences: Solution
Question
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
Solution -- Bit Manipulation
The straightforward idea is to store every 10-letter substring in a set. Time complexity is O(n) and space complexity is O(n). Looking at the space cost in more detail: a Java char is 2 bytes, so each substring needs at least 20 bytes of character data (ignoring object overhead), which comes to roughly 20n bytes overall.
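As a point of comparison, the set-of-substrings idea can be sketched as follows. This is a minimal illustration, not the solution used in this post; the class name `NaiveRepeatedDna` and the method name `findRepeated` are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NaiveRepeatedDna {
    // Hypothetical helper: keeps every 10-letter window as a String.
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();
        // LinkedHashSet keeps repeats in the order they are first detected.
        Set<String> repeated = new LinkedHashSet<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already in the set.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // prints [AAAAACCCCC, CCCCCAAAAA]
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

Every window is materialized as its own String here, which is where the 20-bytes-per-substring cost comes from.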
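There are only four nucleotides, so each one fits in 2 bits and a 10-letter substring fits in 20 bits of a single int. If we represent each DNA substring by that integer instead, the space is cut down to roughly 4n bytes.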
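To make the integer representation concrete, here is a small sketch that packs a sequence into an int at 2 bits per letter, using the mapping A=0, C=1, G=2, T=3. The class name `DnaEncoding` and the helpers `code` and `encode` are illustrative names chosen for this example.

```java
public class DnaEncoding {
    // Map one nucleotide to its 2-bit code: A=00, C=01, G=10, T=11.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Pack a sequence of at most 16 letters into one int, 2 bits per letter.
    static int encode(String seq) {
        int value = 0;
        for (int i = 0; i < seq.length(); i++) {
            value = (value << 2) | code(seq.charAt(i));
        }
        return value;
    }

    public static void main(String[] args) {
        // Five A's contribute 0; five C's give binary 0101010101 = 341.
        System.out.println(encode("AAAAACCCCC")); // prints 341
        // Ten T's set all 20 bits: 2^20 - 1.
        System.out.println(encode("TTTTTTTTTT")); // prints 1048575
    }
}
```

Because the mapping is one-to-one for a fixed length of 10, two windows are equal exactly when their integers are equal, so no collision handling is needed.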
public List<String> findRepeatedDnaSequences(String s) {
    List<String> result = new ArrayList<String>();
    int len = s.length();
    if (len < 10) {
        return result;
    }
    // Each nucleotide gets a 2-bit code.
    Map<Character, Integer> map = new HashMap<Character, Integer>();
    map.put('A', 0);
    map.put('C', 1);
    map.put('G', 2);
    map.put('T', 3);
    // temp: encodings seen so far; added: encodings already reported.
    Set<Integer> temp = new HashSet<Integer>();
    Set<Integer> added = new HashSet<Integer>();
    int hash = 0;
    for (int i = 0; i < len; i++) {
        if (i < 9) {
            // Each of A, C, G, T fits in 2 bits, so shift left by 2.
            hash = (hash << 2) + map.get(s.charAt(i));
        } else {
            hash = (hash << 2) + map.get(s.charAt(i));
            // Keep only the low 20 bits, i.e. the last 10 letters.
            hash = hash & ((1 << 20) - 1);
            if (temp.contains(hash) && !added.contains(hash)) {
                result.add(s.substring(i - 9, i + 1));
                added.add(hash); // report each repeated sequence only once
            } else {
                temp.add(hash);
            }
        }
    }
    return result;
}
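The step that keeps the window at 10 letters is the mask `(1 << 20) - 1`: after shifting in a new letter, it drops the bits of the letter that just left the window. The sketch below checks this by comparing the rolling value against a from-scratch encoding of each window. The class `RollingHashCheck` and its helpers are hypothetical names used only for this check.

```java
public class RollingHashCheck {
    // Same 2-bit mapping as the solution: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Encode the 10-letter window starting at index start from scratch.
    static int encodeWindow(String s, int start) {
        int value = 0;
        for (int i = start; i < start + 10; i++) {
            value = (value << 2) | code(s.charAt(i));
        }
        return value;
    }

    // Returns true if the rolling update agrees with a fresh encoding
    // for every 10-letter window of s.
    static boolean rollingMatchesFresh(String s) {
        int mask = (1 << 20) - 1; // low 20 bits = last 10 letters
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = ((hash << 2) | code(s.charAt(i))) & mask;
            if (i >= 9 && hash != encodeWindow(s, i - 9)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(rollingMatchesFresh("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")); // prints true
    }
}
```

Applying the mask on the first nine letters as well is harmless, since the value is still below 2^18 at that point.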